Supervised speech separation using deep neural networks.
Record type:
Bibliographic - Electronic resource : Monograph/item
Title/Author:
Supervised speech separation using deep neural networks.
Author:
Wang, Yuxuan.
Publisher:
Ann Arbor : ProQuest Dissertations & Theses, 2015
Physical description:
195 p.
Notes:
Source: Dissertation Abstracts International, Volume: 76-11(E), Section: B.
Contained By:
Dissertation Abstracts International 76-11B(E).
Subject:
Computer engineering.
Electronic resource:
http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3710435
ISBN:
9781321862966
Thesis (Ph.D.)--The Ohio State University, 2015.
LDR :04377nmm a2200373 4500
001 2160879
005 20181109093013.5
008 190424s2015 ||||||||||||||||| ||eng d
020 $a 9781321862966
035 $a (MiAaPQ)AAI3710435
035 $a (MiAaPQ)OhioLINK:osu1426366690
035 $a AAI3710435
040 $a MiAaPQ $c MiAaPQ
100 1 $a Wang, Yuxuan. $3 1933104
245 1 0 $a Supervised speech separation using deep neural networks.
260 1 $a Ann Arbor : $b ProQuest Dissertations & Theses, $c 2015
300 $a 195 p.
500 $a Source: Dissertation Abstracts International, Volume: 76-11(E), Section: B.
500 $a Adviser: DeLiang Wang.
502 $a Thesis (Ph.D.)--The Ohio State University, 2015.
520 $a Speech is crucial for human communication. However, speech communication for both humans and automatic devices can be negatively impacted by background noise, which is common in real environments. Due to numerous applications, such as hearing prostheses and automatic speech recognition, separation of target speech from sound mixtures is of great importance. Among many techniques, speech separation using a single microphone is most desirable from an application standpoint. The resulting monaural speech separation problem has been a central problem in speech processing for several decades. However, its success has been limited thus far.
520 $a Time-frequency (T-F) masking is a proven way to suppress background noise. With T-F masking as the computational goal, speech separation reduces to a mask estimation problem, which can be cast as a supervised learning problem. This opens speech separation to a plethora of machine learning techniques. Deep neural networks (DNNs) are particularly well suited to this problem due to their strong representational capacity. This dissertation presents a systematic effort to develop monaural speech separation systems using DNNs.
520 $a We start by presenting a comparative study on acoustic features for supervised separation. In this relatively early work, we use support vector machines as the classifier to predict the ideal binary mask (IBM), which is a primary goal in computational auditory scene analysis. We find that traditional speech and speaker recognition features can actually outperform previously used separation features. Furthermore, we present a feature selection method to systematically select complementary features. The resulting feature set is used throughout the dissertation.
520 $a DNNs have shown success across a range of tasks. We then study IBM estimation using DNNs and show that it performs significantly better than previous systems. Once properly trained, the system generalizes reasonably well to unseen conditions. We demonstrate that our system can improve speech intelligibility for hearing-impaired listeners. Furthermore, by considering the structure in the IBM, we show how to improve IBM estimation by employing sequence training and optimizing a speech intelligibility predictor.
520 $a The IBM was used as the training target in previous work because of its simplicity. DNN-based separation is not limited to binary masking, and choosing a suitable training target is obviously important. We study the performance of a number of targets and find that ratio masking can be preferable, and that T-F masking in general outperforms spectral mapping. In addition, we propose a new target that encodes structure into ratio masks.
520 $a Generalization to noises not seen during training is key to supervised separation. A simple and effective way to improve generalization is to train on multiple noisy conditions. Along this line, we demonstrate that the noise mismatch problem can be well remedied by large-scale training. This important result substantiates the practicability of DNN-based supervised separation.
520 $a Aside from speech intelligibility, perceptual quality is also important. In the last part of the dissertation, we propose a new DNN architecture that directly reconstructs the time-domain clean speech signal. The resulting system significantly improves objective speech quality over standard mask estimators.
590 $a School code: 0168.
650 4 $a Computer engineering. $3 621879
650 4 $a Computer science. $3 523869
690 $a 0464
690 $a 0984
710 2 $a The Ohio State University. $b Computer Science and Engineering. $3 1674144
773 0 $t Dissertation Abstracts International $g 76-11B(E).
790 $a 0168
791 $a Ph.D.
792 $a 2015
793 $a English
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3710435
Holdings (1 item)
Barcode:
W9360426
Location:
Electronic resource
Circulation category:
11.線上閱覽_V (online viewing)
Material type:
E-book
Call number:
EB
Usage type:
Normal
Loan status:
On shelf
Holds:
0