A Biological Population Threshold Coding with Robust Feature Extraction and Neuronal Jitter for SNN-based Speech Recognition

Authors:

Aili Wang

ICCAI '23: Proceedings of the 2023 9th International Conference on Computing and Artificial Intelligence

Pages 459 - 465

https://doi.org/10.1145/3594315.3594357

Published: 02 August 2023

Abstract

The neuronal dynamics of brain-inspired spiking neural networks (SNNs) make them well suited to processing dynamic signals. In an SNN, neurons interact via discrete spikes, so neuronal coding is crucial to the advancement of neuromorphic computing. In the field of temporal coding, population threshold coding (PTC), which uses multiple neurons to encode the trajectory of a time-varying signal, has attracted considerable research attention. It features noise robustness and spike sparsity. In this paper, we (1) evaluate the number of threshold levels and the number of filter banks in PTC; (2) compare PTC based on the Mel filter bank, the Gammatone filter bank, and a mix of the two; and (3) apply different levels of neuronal jitter to the encoding process using speech (TIDIGITS) and sound (RWCP) datasets. Classification is performed with two types of classifiers: the biologically plausible, supervised Tempotron learning rule and a backpropagation (BP)-based SNN learning rule. Our findings indicate that (1) the appropriate threshold resolution and number of filter banks depend on the dataset, and (2) PTC is robust to the choice of cochlear filter bank-based feature extraction and to neuronal jitter.
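The abstract does not spell out the encoding itself, so the following is only a minimal sketch of one common reading of threshold coding, not the authors' implementation. It assumes that each threshold level owns two neurons, one firing on an upward crossing and one on a downward crossing of a single filter-bank envelope, and that neuronal jitter is modelled as zero-mean Gaussian noise added to spike times. The function names `threshold_encode` and `add_jitter`, the threshold values, and the toy envelope are all hypothetical.

```python
import random


def threshold_encode(signal, thresholds):
    """Encode one channel envelope as (time_index, neuron_id) spikes.

    Assumption: neuron 2*k fires when the signal crosses thresholds[k]
    upward, and neuron 2*k + 1 fires when it crosses it downward.
    """
    spikes = []
    for t in range(1, len(signal)):
        prev, cur = signal[t - 1], signal[t]
        for k, theta in enumerate(thresholds):
            if prev < theta <= cur:        # upward crossing
                spikes.append((t, 2 * k))
            elif prev >= theta > cur:      # downward crossing
                spikes.append((t, 2 * k + 1))
    return spikes


def add_jitter(spikes, sigma, rng):
    """Perturb each spike time with Gaussian noise (assumed jitter model)."""
    return [(t + rng.gauss(0.0, sigma), neuron) for t, neuron in spikes]


# Toy envelope for a single filter-bank channel (illustrative values only).
envelope = [0.0, 0.3, 0.7, 0.9, 0.6, 0.2, 0.0]
thresholds = [0.25, 0.5, 0.75]

spikes = threshold_encode(envelope, thresholds)
jittered = add_jitter(spikes, sigma=0.5, rng=random.Random(0))

print(spikes)
print(jittered)
```

In this sketch, adding threshold levels raises the amplitude resolution at the cost of more encoding neurons, and a larger `sigma` corresponds to a higher level of jitter; both are the kinds of parameters the paper reports varying.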

Cited By

Wang X, Yang C, and Wang A. 2024. A Low-Resource-Cost FPGA Implementation of Population Threshold Coding for Spiking Neural Networks. In 2024 4th International Conference on Neural Networks, Information and Communication (NNICE), 73–79. Online publication date: 19-Jan-2024.
https://doi.org/10.1109/NNICE61279.2024.10498425

Index Terms

A Biological Population Threshold Coding with Robust Feature Extraction and Neuronal Jitter for SNN-based Speech Recognition
1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
      1. Speech recognition
  2. Machine learning
    1. Machine learning approaches
      1. Neural networks
2. Mathematics of computing
  1. Information theory
    1. Coding theory

Information

Published In

ICCAI '23: Proceedings of the 2023 9th International Conference on Computing and Artificial Intelligence

March 2023

824 pages

ISBN:9781450399029

DOI:10.1145/3594315

Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Funding Sources

Zhejiang Provincial Natural Science Foundation Exploration Youth Program
The Fundamental Research Funds for the Central Universities

Conference

ICCAI 2023

ICCAI 2023: 2023 9th International Conference on Computing and Artificial Intelligence

March 17 - 20, 2023

Tianjin, China
