Computational Models for Auditory Scene Analysis.
Record type:
Bibliographic - Electronic resource : Monograph/item
Title/Author:
Computational Models for Auditory Scene Analysis.
Author:
Mobin, Shariq A.
Publisher:
Ann Arbor : ProQuest Dissertations & Theses, 2019
Description:
67 p.
Note:
Source: Dissertations Abstracts International, Volume: 81-04, Section: B.
Contained By:
Dissertations Abstracts International, 81-04B.
Subject:
Neurosciences.
Electronic resource:
https://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=13885042
ISBN:
9781085794244
Thesis (Ph.D.)--University of California, Berkeley, 2019.
This item must not be sold to any third party vendors.
The high-level goals of this thesis are to understand the neural representation of sound, produce more robust statistical models of natural sound, and develop models for top-down auditory attention. These are three critical concepts in the auditory system. The neural representation of sound should provide a useful basis for building robust statistical models and directing attention. Robust statistical models are necessary for generalizing knowledge from one domain to the plethora of domains in the real world. And attention is fundamental to the perception of sound, allowing one to prioritize information in the raw audio signal.
First, I approach the neural representation of sound using the efficient coding principle and the physiological characteristics of the cochlea. A theoretical model is developed using convolutional filters and leaky integrate-and-fire (LIF) neurons to model the cochlear transform and the spiking code of the auditory nerve. The goal of this model is to explain the distributed phase code of the auditory nerve response, but it lays the foundation for much more.
Second, I investigate an algorithm for audio source separation called deep clustering. Experiments are performed to evaluate its robustness, and a new neural network architecture is developed to improve it. The experiments show that the conventional recurrent neural network performs sub-optimally, while our dilated convolutional neural network improves robustness using an order of magnitude fewer parameters. This more parsimonious model is a step toward models that are minimally parameterized and generalize well across many domains.
Third, I develop a new algorithm to address the limitations of the previous deep clustering method. This algorithm can extract multiple sources at once from a mixture using an attentional context, or bias. It relies on modulating the computation of the bottom-up pathway with a top-down neural signal that indicates which sources are of interest. A simple idea from the attentional spotlight method is used to do this: the top-down signal modulates the gain on a set of low-level neurons. This computational method demonstrates one way top-down feedback could direct auditory attention in the brain. Interestingly, the method goes beyond neuroscience: it demonstrates that attention can be about more than efficient computation. The experiments show that it resolves one of the main shortcomings of deep clustering: the model can extract sources from a mixture without knowing the total number of sources in it.
The major contributions of this work are a theoretical model of the auditory nerve response, a more robust neural network architecture for sound understanding, and a novel and powerful model of top-down auditory attention. I hope the first contribution will be used to build a better understanding of the complex auditory nerve code; the second, to build ever more parsimonious and robust models of source separation; and the third, to provide a basis for an under-explored research direction that I believe is the most fruitful for building human-level auditory scene analysis: attention-based source separation.
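The abstract's first contribution pairs convolutional filters with leaky integrate-and-fire neurons. As an illustrative sketch only (not the thesis's actual model; all parameters here are arbitrary assumptions), a minimal LIF neuron in Python could look like:

```python
import numpy as np

def lif_spikes(current, dt=1e-4, tau=0.01, v_thresh=1.0, v_reset=0.0):
    """Leaky integrate-and-fire neuron driven by an input current.

    Integrates dv/dt = (-v + current) / tau with Euler steps and emits
    a spike (then resets) whenever the membrane potential crosses
    v_thresh. Returns a boolean spike train the same length as `current`.
    """
    v = v_reset
    spikes = np.zeros(len(current), dtype=bool)
    for t, i_in in enumerate(current):
        v += dt * (-v + i_in) / tau  # leaky integration step
        if v >= v_thresh:
            spikes[t] = True
            v = v_reset              # reset after each spike
    return spikes
```

In a cochlear model, `current` would be the output of a cochlea-like convolutional filter bank; here any 1-D array of drive values works.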
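The robustness result for the second contribution hinges on dilated convolutions covering long temporal context with few parameters. A generic back-of-the-envelope sketch (not the thesis's actual architecture) shows why: the receptive field of stacked 1-D dilated convolutions grows with the sum of the dilations, so a doubling dilation schedule buys exponential context for a linear parameter count.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field (in samples) of stacked 1-D dilated convolutions.

    Each layer with dilation d extends the receptive field by
    (kernel_size - 1) * d, so doubling dilations (1, 2, 4, ...) give
    exponential context growth while parameter count grows linearly
    in the number of layers.
    """
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf

# Five kernel-3 layers with doubling dilation:
print(receptive_field(3, [1, 2, 4, 8, 16]))  # -> 63
```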
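The third contribution's attention mechanism scales bottom-up activity by gains derived from a top-down signal. A hedged sketch of that spotlight-style gain idea (function names, shapes, and the sigmoid choice are illustrative assumptions, not the thesis's implementation):

```python
import numpy as np

def top_down_gain(features, query, proj):
    """Scale bottom-up feature channels by gains computed from a
    top-down query vector (a crude attentional-spotlight-style bias).

    features: (channels, time) bottom-up activations
    query:    (d,) top-down signal encoding the source of interest
    proj:     (channels, d) hypothetical projection from query to gains
    """
    gains = 1.0 / (1.0 + np.exp(-(proj @ query)))  # sigmoid gains in (0, 1)
    return gains[:, None] * features               # per-channel scaling
```

A zero query leaves every gain at 0.5, i.e. uniform attenuation; training `proj` would let the query boost exactly the channels carrying the attended source.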
LDR  04459nmm a2200385 4500
001  2284497
005  20211123073331.5
008  220723s2019 ||||||||||||||||| ||eng d
020  $a 9781085794244
035  $a (MiAaPQ)AAI13885042
035  $a AAI13885042
040  $a MiAaPQ $c MiAaPQ
100 1  $a Mobin, Shariq A. $3 3563667
245 1 0  $a Computational Models for Auditory Scene Analysis.
260 1  $a Ann Arbor : $b ProQuest Dissertations & Theses, $c 2019
300  $a 67 p.
500  $a Source: Dissertations Abstracts International, Volume: 81-04, Section: B.
500  $a Includes supplementary digital materials.
500  $a Advisor: Olshausen, Bruno A.
502  $a Thesis (Ph.D.)--University of California, Berkeley, 2019.
506  $a This item must not be sold to any third party vendors.
520  $a The high-level goals of this thesis are to understand the neural representation of sound, produce more robust statistical models of natural sound, and develop models for top-down auditory attention. These are three critical concepts in the auditory system. The neural representation of sound should provide a useful basis for building robust statistical models and directing attention. Robust statistical models are necessary for generalizing knowledge from one domain to the plethora of domains in the real world. And attention is fundamental to the perception of sound, allowing one to prioritize information in the raw audio signal. First, I approach the neural representation of sound using the efficient coding principle and the physiological characteristics of the cochlea. A theoretical model is developed using convolutional filters and leaky integrate-and-fire (LIF) neurons to model the cochlear transform and the spiking code of the auditory nerve. The goal of this model is to explain the distributed phase code of the auditory nerve response, but it lays the foundation for much more. Second, I investigate an algorithm for audio source separation called deep clustering. Experiments are performed to evaluate its robustness, and a new neural network architecture is developed to improve it. The experiments show that the conventional recurrent neural network performs sub-optimally, while our dilated convolutional neural network improves robustness using an order of magnitude fewer parameters. This more parsimonious model is a step toward models that are minimally parameterized and generalize well across many domains. Third, I develop a new algorithm to address the limitations of the previous deep clustering method. This algorithm can extract multiple sources at once from a mixture using an attentional context, or bias. It relies on modulating the computation of the bottom-up pathway with a top-down neural signal that indicates which sources are of interest. A simple idea from the attentional spotlight method is used to do this: the top-down signal modulates the gain on a set of low-level neurons. This computational method demonstrates one way top-down feedback could direct auditory attention in the brain. Interestingly, the method goes beyond neuroscience: it demonstrates that attention can be about more than efficient computation. The experiments show that it resolves one of the main shortcomings of deep clustering: the model can extract sources from a mixture without knowing the total number of sources in it. The major contributions of this work are a theoretical model of the auditory nerve response, a more robust neural network architecture for sound understanding, and a novel and powerful model of top-down auditory attention. I hope the first contribution will be used to build a better understanding of the complex auditory nerve code; the second, to build ever more parsimonious and robust models of source separation; and the third, to provide a basis for an under-explored research direction that I believe is the most fruitful for building human-level auditory scene analysis: attention-based source separation.
590  $a School code: 0028.
650 4  $a Neurosciences. $3 588700
650 4  $a Artificial intelligence. $3 516317
650 4  $a Audiology. $3 537237
653  $a Audio
653  $a Deep learning
653  $a Hearing
653  $a Machine learning
653  $a Source separation
690  $a 0317
690  $a 0800
690  $a 0300
710 2  $a University of California, Berkeley. $b Neuroscience. $3 3191789
773 0  $t Dissertations Abstracts International $g 81-04B.
790  $a 0028
791  $a Ph.D.
792  $a 2019
793  $a English
856 4 0  $u https://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=13885042
Holdings (1 item):
Barcode: W9436230
Location: Electronic resources
Circulation category: 11. Online reading_V
Material type: E-book
Call number: EB
Use type: Normal
Loan status: On shelf
Reservations: 0