Dong Hwa University Library (東華大學圖書館)

Man-machine speech communication = 18th National Conference, NCMMSC 2023, Suzhou, China, December 8-10, 2023 : proceedings /

Record type:	Bibliographic - electronic resource : Monograph/item
Title/Author:	Man-machine speech communication / edited by Jia Jia ... [et al.].
Other title:	18th National Conference, NCMMSC 2023, Suzhou, China, December 8-10, 2023 : proceedings
Other title:	NCMMSC 2023
Other author:	Jia, Jia.
Corporate author:	National Conference on Man-Machine Speech Communication
Publisher:	Singapore : Springer Nature Singapore : Imprint: Springer, 2024.
Physical description:	xiv, 368 p. : ill. (chiefly col.), digital ; 24 cm.
Contents note:	Ultra-Low Complexity Residue Echo and Noise Suppression Based on Recurrent Neural Network -- Semi-End-to-End Nested Named Entity Recognition from Speech -- A Lightweight Music Source Separation Model with Graph Convolution Network -- Joint time-domain and frequency-domain progressive learning for single-channel speech enhancement and recognition -- A Study on Domain Adaptation for Audio-visual Speech Enhancement -- APNet2: High-quality and High-efficiency Neural Vocoder with Direct Prediction of Amplitude and Phase Spectra -- Within- and Between-Class Sample Interpolation Based Supervised Metric Learning for Speaker Verification -- Joint speech and noise estimation using SNR-adaptive target learning for deep-learning-based speech enhancement -- Data Augmentation By Finite Element Analysis for Enhanced Machine Anomalous Sound Detection -- A Fast Sampling Method in Diffusion-based Dance Generation Models -- End-to-end Streaming Customizable Keyword Spotting based on text-adaptive neural search -- The Production of Successive Addition Boundary Tone in Mandarin Preschoolers -- Emotional Support Dialog System Through Recursive Interactions Among Large Language Models -- Task-Adaptive Generative Adversarial Network based Speech Dereverberation for Robust Speech Recognition -- Real-time Automotive Engine Sound Simulation with Deep Neural Network -- A Framework Combining Separate and Joint Training for Neural Vocoder-Based Monaural Speech Enhancement -- Accent-VITS: accent transfer for end-to-end TTS -- Multi-branch Network with Cross-Domain Feature Fusion for Anomalous Sound Detection -- A Packet Loss Concealment Method Based on the Demucs Network Structure -- Improving Speech Perceptual Quality and Intelligibility through Sub-band Temporal Envelope Characteristics -- Adaptive Deep Graph Convolutional Network For Dialogical Speech Emotion Recognition -- Iterative Noisy-target Approach: Speech Enhancement without Clean Speech -- Joint Training or Not: An Exploration of Pre-trained Speech Models in Audio-Visual Speaker Diarization -- Zero-shot Singing Voice Conversion Method Based on Timbre Space Modeling and Excitation Signal Control -- A Comparative Study of Pre-trained Audio and Speech Models for Heart Sound Detection -- CAM-GUI: A Conversational Assistant on Mobile GUI -- A Pilot Study on the Prosodic Factors Influencing Voice Attractiveness of AI Speech -- The DKU-MSXF Diarization System for the VoxCeleb Speaker Recognition Challenge 2023 -- Chinese EFL Learners' Auditory and Visual Perception of English Statement and Question Intonation: The Effect of Stress -- An Improved System for Partially Fake Audio Detection Using Pre-trained Model -- Leveraging Synthetic Speech for CIF-based Customized Keyword Spotting.
Contained By:	Springer Nature eBook
Subject:	Computational linguistics -- Congresses.
Electronic resource:	https://doi.org/10.1007/978-981-97-0601-3
ISBN:	9789819706013

Man-machine speech communication [electronic resource] : 18th National Conference, NCMMSC 2023, Suzhou, China, December 8-10, 2023 : proceedings / edited by Jia Jia ... [et al.]. - Singapore : Springer Nature Singapore, 2024. - xiv, 368 p. : ill. (chiefly col.), digital ; 24 cm. - (Communications in computer and information science, 1865-0937 ; 2006).


This book constitutes the refereed proceedings of the 18th National Conference on Man-Machine Speech Communication, NCMMSC 2023, held in Suzhou, China, during December 8-11, 2023. The 20 full papers and 11 short papers included in this book were carefully reviewed and selected from 117 submissions. They deal with topics such as speech recognition, synthesis, enhancement and coding, audio/music/singing synthesis, avatars, speaker recognition and verification, human-computer dialogue systems, and large language models, as well as phonetic and linguistic topics such as speech prosody analysis, pathological speech analysis, experimental phonetics, and acoustic scene classification.

ISBN: 9789819706013

Standard No.: 10.1007/978-981-97-0601-3 (doi)

Subjects--Topical Terms:
582158 Computational linguistics -- Congresses.

LC Class. No.: QA76.9.N38

Dewey Class. No.: 006.35

LDR 04618nmm a2200349 a 4500
001 2368608
003 DE-He213
005 20240214192928.0
006 m d
007 cr nn 008maaau
008 241127s2024 si s 0 eng d
020 $a 9789819706013 $q (electronic bk.)
020 $a 9789819706006 $q (paper)
024 7 $a 10.1007/978-981-97-0601-3 $2 doi
035 $a 978-981-97-0601-3
040 $a GP $c GP
041 0 $a eng
050 4 $a QA76.9.N38
072 7 $a UYQV $2 bicssc
072 7 $a COM012000 $2 bisacsh
072 7 $a UYQV $2 thema
082 0 4 $a 006.35 $2 23
090 $a QA76.9.N38 $b N277 2023
111 2 $a National Conference on Man-Machine Speech Communication $n (18th : $d 2023 : $c Suzhou, China) $3 3712622
245 1 0 $a Man-machine speech communication $h [electronic resource] : $b 18th National Conference, NCMMSC 2023, Suzhou, China, December 8-10, 2023 : proceedings / $c edited by Jia Jia ... [et al.].
246 3 $a NCMMSC 2023
260 $a Singapore : $b Springer Nature Singapore : $b Imprint: Springer, $c 2024.
300 $a xiv, 368 p. : $b ill. (chiefly col.), digital ; $c 24 cm.
490 1 $a Communications in computer and information science, $x 1865-0937 ; $v 2006
505 0 $a Ultra-Low Complexity Residue Echo and Noise Suppression Based on Recurrent Neural Network -- Semi-End-to-End Nested Named Entity Recognition from Speech -- A Lightweight Music Source Separation Model with Graph Convolution Network -- Joint time-domain and frequency-domain progressive learning for single-channel speech enhancement and recognition -- A Study on Domain Adaptation for Audio-visual Speech Enhancement -- APNet2: High-quality and High-efficiency Neural Vocoder with Direct Prediction of Amplitude and Phase Spectra -- Within- and Between-Class Sample Interpolation Based Supervised Metric Learning for Speaker Verification -- Joint speech and noise estimation using SNR-adaptive target learning for deep-learning-based speech enhancement -- Data Augmentation By Finite Element Analysis for Enhanced Machine Anomalous Sound Detection -- A Fast Sampling Method in Diffusion-based Dance Generation Models -- End-to-end Streaming Customizable Keyword Spotting based on text-adaptive neural search -- The Production of Successive Addition Boundary Tone in Mandarin Preschoolers -- Emotional Support Dialog System Through Recursive Interactions Among Large Language Models -- Task-Adaptive Generative Adversarial Network based Speech Dereverberation for Robust Speech Recognition -- Real-time Automotive Engine Sound Simulation with Deep Neural Network -- A Framework Combining Separate and Joint Training for Neural Vocoder-Based Monaural Speech Enhancement -- Accent-VITS: accent transfer for end-to-end TTS -- Multi-branch Network with Cross-Domain Feature Fusion for Anomalous Sound Detection -- A Packet Loss Concealment Method Based on the Demucs Network Structure -- Improving Speech Perceptual Quality and Intelligibility through Sub-band Temporal Envelope Characteristics -- Adaptive Deep Graph Convolutional Network For Dialogical Speech Emotion Recognition -- Iterative Noisy-target Approach: Speech Enhancement without Clean Speech -- Joint Training or Not: An Exploration of Pre-trained Speech Models in Audio-Visual Speaker Diarization -- Zero-shot Singing Voice Conversion Method Based on Timbre Space Modeling and Excitation Signal Control -- A Comparative Study of Pre-trained Audio and Speech Models for Heart Sound Detection -- CAM-GUI: A Conversational Assistant on Mobile GUI -- A Pilot Study on the Prosodic Factors Influencing Voice Attractiveness of AI Speech -- The DKU-MSXF Diarization System for the VoxCeleb Speaker Recognition Challenge 2023 -- Chinese EFL Learners' Auditory and Visual Perception of English Statement and Question Intonation: The Effect of Stress -- An Improved System for Partially Fake Audio Detection Using Pre-trained Model -- Leveraging Synthetic Speech for CIF-based Customized Keyword Spotting.
520 $a This book constitutes the refereed proceedings of the 18th National Conference on Man-Machine Speech Communication, NCMMSC 2023, held in Suzhou, China, during December 8-11, 2023. The 20 full papers and 11 short papers included in this book were carefully reviewed and selected from 117 submissions. They deal with topics such as speech recognition, synthesis, enhancement and coding, audio/music/singing synthesis, avatar, speaker recognition and verification, human-computer dialogue systems, large language models as well as phonetic and linguistic topics such as speech prosody analysis, pathological speech analysis, experimental phonetics, acoustic scene classification.
650 0 $a Computational linguistics $x Congresses. $3 582158
650 0 $a Natural language processing (Computer science) $v Congresses. $3 752585
650 0 $a Human-computer interaction $x Congresses. $3 705966
650 1 4 $a Computer Vision. $3 3538524
650 2 4 $a Natural Language Processing (NLP) $3 3381674
650 2 4 $a Signal, Speech and Image Processing. $3 3592727
650 2 4 $a Artificial Intelligence. $3 769149
650 2 4 $a User Interfaces and Human Computer Interaction. $3 892554
700 1 $a Jia, Jia. $3 3712623
710 2 $a SpringerLink (Online service) $3 836513
773 0 $t Springer Nature eBook
830 0 $a Communications in computer and information science ; $v 2006. $3 3712624
856 4 0 $u https://doi.org/10.1007/978-981-97-0601-3
950 $a Computer Science (SpringerNature-11645)