Recognition and characterization of unstructured environmental sounds.
Record type: Bibliographic - electronic resource : Monograph/item
Title/Author: Recognition and characterization of unstructured environmental sounds.
Author: Chu, Selina.
Publisher: Ann Arbor : ProQuest Dissertations & Theses, 2011
Description: 124 p.
Note: Source: Dissertation Abstracts International, Volume: 72-10, Section: B, page: 6102.
Contained by: Dissertation Abstracts International, 72-10B.
Subject: Computer science.
Electronic resource: http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3465966
ISBN: 9781124786971
Chu, Selina.
Recognition and characterization of unstructured environmental sounds. - Ann Arbor : ProQuest Dissertations & Theses, 2011. - 124 p.
Source: Dissertation Abstracts International, Volume: 72-10, Section: B, page: 6102.
Thesis (Ph.D.)--University of Southern California, 2011.
Environmental sounds are what we hear every day or, more generally, the ambient or background audio that surrounds us. Humans use both vision and hearing to respond to their surroundings, a capability that is still quite limited in machine processing. The first step toward multimodal input applications is the ability to process unstructured audio and recognize audio scenes (or environments). Such an ability would have applications in content analysis and mining of multimedia data, and in improving the robustness of context-aware applications through multi-modality, such as assistive robotics, surveillance, or mobile device-based services.
ISBN: 9781124786971
Subjects--Topical Terms: Computer science.
LDR    06384nmm a2200361 4500
001    2121975
005    20170830070102.5
008    180830s2011 ||||||||||||||||| ||eng d
020    $a 9781124786971
035    $a (MiAaPQ)AAI3465966
035    $a AAI3465966
040    $a MiAaPQ $c MiAaPQ
100 1  $a Chu, Selina. $3 3283917
245 10 $a Recognition and characterization of unstructured environmental sounds.
260 1  $a Ann Arbor : $b ProQuest Dissertations & Theses, $c 2011
300    $a 124 p.
500    $a Source: Dissertation Abstracts International, Volume: 72-10, Section: B, page: 6102.
500    $a Advisers: Shrikanth Narayanan; C.-C. Jay Kuo.
502    $a Thesis (Ph.D.)--University of Southern California, 2011.
520    $a Environmental sounds are what we hear every day or, more generally, the ambient or background audio that surrounds us. Humans use both vision and hearing to respond to their surroundings, a capability that is still quite limited in machine processing. The first step toward multimodal input applications is the ability to process unstructured audio and recognize audio scenes (or environments). Such an ability would have applications in content analysis and mining of multimedia data, and in improving the robustness of context-aware applications through multi-modality, such as assistive robotics, surveillance, or mobile device-based services.
520    $a The goal of this thesis is the characterization of unstructured environmental sounds for understanding and predicting the context surrounding an agent or device. Most research on audio recognition has focused primarily on speech and music; less attention has been paid to the challenges and opportunities of characterizing unstructured audio. My research focuses on investigating the challenging issues in characterizing unstructured environmental audio and on developing novel algorithms for modeling the variations of the environment.
520    $a The first step in building a recognition system for unstructured auditory environments was to investigate techniques and audio features for working with such audio data. We begin with a study that explores suitable features and the feasibility of designing an automatic environment recognition system using audio information. In this initial investigation, I found that traditional recognition and feature-extraction techniques for audio were not suitable for environmental sounds, which lack any kind of structure, unlike speech and music with their formant and harmonic structures, thus dispelling the notion that traditional speech and music recognition techniques can simply be applied to realistic environmental sound.
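A rough illustration of the conventional baseline described above: frame-level MFCCs modeled per environment class with Gaussian mixtures. The library choices (librosa, scikit-learn), the clip paths, and all parameter values are illustrative assumptions rather than details taken from the dissertation.

```python
# Sketch of a conventional audio-classification baseline:
# frame-level MFCCs per clip, one Gaussian mixture per environment class.
# All paths, labels, and parameters are illustrative assumptions.
import numpy as np
import librosa
from sklearn.mixture import GaussianMixture

def mfcc_frames(path, sr=22050, n_mfcc=13):
    """Load a clip and return its frame-level MFCC matrix (frames x coeffs)."""
    y, sr = librosa.load(path, sr=sr)
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).T

def train_class_models(clips_by_class, n_components=8):
    """Fit one GMM per environment class on pooled MFCC frames."""
    models = {}
    for label, paths in clips_by_class.items():
        frames = np.vstack([mfcc_frames(p) for p in paths])
        models[label] = GaussianMixture(n_components=n_components).fit(frames)
    return models

def classify(path, models):
    """Pick the class whose GMM gives the highest average log-likelihood."""
    frames = mfcc_frames(path)
    return max(models, key=lambda label: models[label].score(frames))
```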
520    $a Natural unstructured environments contain a large variety of sounds that are in fact noise-like and are not effectively modeled by Mel-frequency cepstral coefficients (MFCCs) or other commonly used audio features, e.g., energy or zero-crossing rate. Given the lack of features suitable for environmental audio, and to achieve a more effective representation, I proposed a specialized feature-extraction algorithm for environmental sounds that utilizes the matching pursuit (MP) algorithm to learn the inherent structure of each type of sound, which we call MP-features. MP-features have been shown to capture and represent sounds from different sources and ranges where frequency-domain features (e.g., MFCCs) fail, and they can be advantageous when combined with MFCCs to improve overall performance.
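A minimal sketch of matching-pursuit feature extraction in the spirit of the MP-features described above: greedily decompose a signal frame over a Gabor atom dictionary and summarize the frequency and scale parameters of the selected atoms. The dictionary construction, the number of iterations, and the summary statistics are assumptions for illustration, not the exact formulation used in the thesis.

```python
# Matching pursuit over a small Gabor dictionary; MP-features are taken here
# as simple statistics of the selected atoms' parameters (an assumption).
import numpy as np

def gabor_atom(n, freq, scale, shift):
    """Unit-norm Gabor atom: Gaussian-windowed cosine of a given frequency."""
    t = np.arange(n)
    g = np.exp(-0.5 * ((t - shift) / scale) ** 2) * np.cos(2 * np.pi * freq * t)
    return g / (np.linalg.norm(g) + 1e-12)

def build_dictionary(n, freqs, scales, shifts):
    """Enumerate atoms over a grid of (frequency, scale, shift) parameters."""
    params = [(f, s, u) for f in freqs for s in scales for u in shifts]
    atoms = np.array([gabor_atom(n, f, s, u) for f, s, u in params])
    return atoms, params

def matching_pursuit(x, atoms, n_iter=5):
    """Greedy MP: repeatedly pick the atom best correlated with the residual."""
    residual = x.astype(float).copy()
    chosen = []
    for _ in range(n_iter):
        corr = atoms @ residual
        k = int(np.argmax(np.abs(corr)))
        residual -= corr[k] * atoms[k]
        chosen.append(k)
    return chosen

def mp_features(x, atoms, params, n_iter=5):
    """Summarize selected atoms' frequency/scale as a compact feature vector."""
    idx = matching_pursuit(x, atoms, n_iter)
    freqs = np.array([params[k][0] for k in idx])
    scales = np.array([params[k][1] for k in idx])
    return np.array([freqs.mean(), freqs.std(), scales.mean(), scales.std()])
```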
520    $a The third component is an investigation into modeling and detecting the background audio. One of the goals of this research is to characterize an environment. Since many events blend into the background, I wanted a way to obtain a general model for any particular environment. Once we have an idea of the background, we can identify foreground events even if we haven't seen those events before. Therefore, the next step is to investigate learning an audio background model for each environment type, despite the occurrences of different foreground events. In this work, I presented a framework for robust audio background modeling, which includes learning models for prediction, data knowledge, and persistent characteristics of the environment. This approach can model the background and detect foreground events, and it can verify whether the predicted background is indeed the background or a foreground event that persists for a longer period of time. In this work, I also investigated the use of a semi-supervised learning technique to exploit and label new unlabeled audio data.
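A minimal sketch, under stated assumptions, of online audio background modeling in the spirit of the framework above: a running Gaussian estimate of frame features flags strongly deviating frames as foreground and absorbs deviations that persist long enough back into the background, a rough analogue of the verification step described. The thresholds, update rate, and persistence window are illustrative.

```python
# Online background model over audio frame features; all parameters are
# illustrative assumptions, not values from the dissertation.
import numpy as np

class AudioBackgroundModel:
    def __init__(self, dim, alpha=0.01, threshold=3.0, persist_frames=200):
        self.mean = np.zeros(dim)
        self.var = np.ones(dim)
        self.alpha = alpha              # learning rate for background updates
        self.threshold = threshold      # mean z-score beyond which a frame is foreground
        self.persist_frames = persist_frames
        self.deviant_run = 0            # consecutive foreground frames seen so far

    def update(self, frame):
        """Return True if the frame looks like a foreground event."""
        z = np.abs(frame - self.mean) / np.sqrt(self.var + 1e-8)
        is_foreground = z.mean() > self.threshold
        if is_foreground:
            self.deviant_run += 1
            # If the "event" persists long enough, treat it as the new background.
            if self.deviant_run >= self.persist_frames:
                is_foreground = False
                self.deviant_run = 0
        else:
            self.deviant_run = 0
        if not is_foreground:
            # Exponential update of the background statistics.
            self.mean = (1 - self.alpha) * self.mean + self.alpha * frame
            self.var = (1 - self.alpha) * self.var + self.alpha * (frame - self.mean) ** 2
        return is_foreground
```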
520    $a The final component of my thesis involves learning sound structures for generalization and applying the proposed ideas to context-aware applications. Environmental sound is inherently noisy and contains relatively large amounts of overlap between events in different environments. Environmental sounds show large variance even within a single environment type, and frequently there are no clear boundaries between some types. Traditional classification methods are generally not robust enough to handle classes with overlap, so such audio requires representation by complex models. A deep learning architecture provides a way to obtain a generative, model-based method for classification. Specifically, I considered the use of Deep Belief Networks (DBNs) to model environmental audio and investigated their applicability to noisy data to improve robustness and generalization. A framework was proposed using composite DBNs to discover high-level representations and to learn a hierarchical structure for different acoustic environments in a data-driven fashion. Experimental results on real data sets demonstrate its effectiveness over traditional methods, with over 90% recognition accuracy for a large number of environmental sound types.
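A simplified stand-in for the composite-DBN classifier described above: two stacked RBM layers pretrained greedily, feeding a logistic-regression output layer via scikit-learn. Layer sizes, learning rates, and the [0, 1] feature scaling are assumptions, and this sketch omits the supervised fine-tuning of the full stack that a true DBN setup would include.

```python
# DBN-style stack approximated with scikit-learn: unsupervised RBM layers
# followed by a logistic-regression classifier. Parameters are illustrative.
from sklearn.neural_network import BernoulliRBM
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import MinMaxScaler
from sklearn.pipeline import Pipeline

def make_dbn_like_classifier():
    return Pipeline([
        ("scale", MinMaxScaler()),   # RBMs expect inputs in [0, 1]
        ("rbm1", BernoulliRBM(n_components=256, learning_rate=0.05, n_iter=20)),
        ("rbm2", BernoulliRBM(n_components=128, learning_rate=0.05, n_iter=20)),
        ("clf", LogisticRegression(max_iter=1000)),
    ])

# Usage sketch (X: clip-level audio feature matrix, y: environment labels;
# both are hypothetical placeholders):
# model = make_dbn_like_classifier().fit(X_train, y_train)
# accuracy = model.score(X_test, y_test)
```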
590    $a School code: 0208.
650  4 $a Computer science. $3 523869
650  4 $a Acoustics. $3 879105
650  4 $a Electrical engineering. $3 649834
690    $a 0984
690    $a 0986
690    $a 0544
710 2  $a University of Southern California. $b Computer Science. $3 1023331
773 0  $t Dissertation Abstracts International $g 72-10B.
790    $a 0208
791    $a Ph.D.
792    $a 2011
793    $a English
856 40 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3465966
Holdings (1 item):
Barcode: W9332591
Location: Electronic resources
Circulation category: 01.外借(書)_YB
Material type: E-book
Call number: EB
Use type: Normal
Loan status: On shelf
Holds: 0