Dong Hwa University Library

Speech and computer = 24th International Conference, SPECOM 2022, Gurugram, India, November 14-16, 2022 : proceedings /

Record type:	Bibliographic record - electronic resource : Monograph/item
Title/Author:	Speech and computer / edited by S. R. Mahadeva Prasanna ... [et al.].
Other title:	24th International Conference, SPECOM 2022, Gurugram, India, November 14-16, 2022 : proceedings
Other title:	SPECOM 2022
Other author:	Prasanna, S. R. Mahadeva.
Corporate author:	International Conference Speech and Computer
Publisher:	Cham : Springer International Publishing, 2022.
Physical description:	xvi, 720 p. : ill. (chiefly color), digital ; 24 cm.
Contents note:	Thematic Diversity of Everyday Russian Discourse: a Case Study Based on the ORD corpus -- Neural Embedding Extractors for Text-Independent Speaker Verification -- Deep Speaker Embeddings based Online Diarization -- Overlapped Speech Detection Using AM-FM based Time-Frequency Representations -- Significance of Dimensionality Reduction in CNN-based Vowel Classification from Imagined Speech using Electroencephalogram Signals -- Study of Speech Recognition System Based on Transformer and Connectionist Temporal Classification Models for Low Resource Language -- An Initial Study on Birdsong Re-synthesis using Neural Vocoders -- Speech Music Overlap Detection using Spectral Peak Evolutions -- Influence of Accented Speech in Automatic Speech Recognition: A Case Study on Assamese L1 Speakers Speaking Code Switched Hindi-English -- ClusterVote: Automatic Summarization Dataset Construction with Document Clusters -- Comparing Unsupervised Detection Algorithms for Audio Adversarial Examples -- Celtic English Continuum in Pitch Patterns of Spontaneous Talk: Evidence of Long-Term Contacts -- Coherence Based Automatic Essay Scoring Using Sentence Embedding and Recurrent Neural Networks -- Analysis of Automatic Evaluation Metric on Low-Resourced Language: BERTScore Vs BLEU Score -- DyCoDa: A Multi-Modal Data Collection of Multi-User Remote Survival Game Recordings -- On the Use of Ensemble X-Vector Embeddings for Improved Sleepiness Detection -- Multiresolution Decomposition Analysis via Wavelet Transforms for Audio Deepfake Detection -- Automatic Rhythm and Speech Rate Analysis of Mising Spontaneous Speech -- An Electroglottographic Method for Assessing the Emotional State of the Speaker -- Significance of Distance on Pop Noise for Voice Liveness Detection -- CRIM's Speech Recognition System for OpenASR21 Evaluation with Conformer and Voice Activity Detector Embeddings -- Joint Changes in First and Second Formants of /a/, /i/, /u/ Vowels in Babble Noise - a New Statistical Approach -- Comparing NLP Solutions for the Disambiguation of French Heterophonic Homographs for End-to-End TTS Systems -- Detection of Speech Related Disorders by Pre-Trained Embedding Models Extracted Biomarkers -- Multi-Label Dysfluency Classification -- Harnessing Uncertainty - Multi-Label Dysfluency Classification with Uncertain Labels -- Continuous Wavelet Transform for Severity-Level Classification of Dysarthria -- Significance of Energy Features for Severity Classification of Dysarthria -- An Analytic Study on Clustering-based Pseudo-Labels for Self-Supervised Deep Speaker Verification -- Investigation of Transfer Learning for End-to-End Russian Speech Recognition -- Prosodic Features of Verbal Irony in Russian and French: Universal vs. Language-Specific -- Categorization of Threatening Speech Acts -- Assessment of Speech Quality During Speech Rehabilitation Based on the Solution of the Classification Problem -- Multi-level Fusion of Fisher Vector Encoded BERT and wav2vec 2.0 Embeddings for Native Language Identification -- Fake Speech Detection using OpenSMILE Features -- Nonverbal Constituents of Argumentative Discourse: Gesture and Prosody Interaction -- Classifying Mahout and Social Interactions of Asian Elephants based on Trumpet Calls -- Recognition of the Emotional State of Children with Down Syndrome by Video, Audio and Text Modalities: Human and Automatic -- Fake Speech Detection using Modulation Spectrogram -- Self-Configuring Genetic Programming Feature Generation in Affect Recognition Tasks -- A Multi-Modal Approach to Mining Intent from Code-Mixed Hindi-English Calls in the Hyperlocal-Delivery Domain -- Importance of Supra-Segmental Information and Self-Supervised Framework for Spoken Language Diarization Task -- Low-resource Emotional Speech Synthesis: Transfer Learning, Data requirements and Adversarial Training -- Fuzzy Classifier For Speech Assessment in Speech Rehabilitation -- Analysis-by-Synthesis Modeling of Bengali Intonation -- Neural Network Based Curve Fitting to Enhance the Intelligibility of Dysarthric Speech -- Retrieval-based Dialogue Agents -- Forensic Identification of Foreign-Language Speakers by the Method of Structural-Melodic Analysis of Phonograms -- Logistics Translator. Concept Vision on Future Interlanguage Computer Assisted Translation -- Analysis of Time-Averaged Feature Extraction Techniques on Infant Cry Classification -- Should We Believe Our Eyes or Our Ears? Processing Incongruent Audiovisual Stimuli by Russian Listeners -- Emotional Speech Recognition Based on Lip-Reading -- Exploring The Use of Machine Learning for Resume Recommendations -- The Role of Pause in Interaction: A Case of Polylogue -- Dictionary with the Evaluation of Positivity/Negativity Degree of the Russian Words -- Effects of Depth of Field on Focus using a Virtual Reality Escape Room -- Dynamics of Frequency Characteristics of Visually Evoked Potentials of Electroencephalography During the Work with Brain-Computer Interfaces -- Device Robust Acoustic Scene Classification using Adaptive Noise Reduction and Convolutional Recurrent Attention Neural Network -- Comparison of Word Embeddings of Unaligned Audio and Text Data Using Persistent Homology -- Low-Cost Training of Speech Recognition System for Hindi ASR Challenge 2022.
Contained By:	Springer Nature eBook
Subject:	Natural language processing (Computer science) -- Congresses.
Electronic resource:	https://doi.org/10.1007/978-3-031-20980-2
ISBN:	9783031209802

Speech and computer [electronic resource] : 24th International Conference, SPECOM 2022, Gurugram, India, November 14-16, 2022 : proceedings / edited by S. R. Mahadeva Prasanna ... [et al.]. - Cham : Springer International Publishing, 2022. - xvi, 720 p. : ill. (chiefly color), digital ; 24 cm. - (Lecture notes in computer science, 0302-9743 ; 13721).

This book constitutes the proceedings of the 24th International Conference on Speech and Computer, SPECOM 2022, held as a hybrid event in Gurugram, India, in November 2022. The 51 full and 9 short papers presented in this volume were carefully reviewed and selected from 99 submissions. The papers present current research in the area of computer speech processing including audio signal processing, automatic speech recognition, speaker recognition, computational paralinguistics, speech synthesis, sign language and multimodal processing, and speech and language resources.

ISBN: 9783031209802

Standard No.: 10.1007/978-3-031-20980-2 (doi)

Subjects--Topical Terms: Natural language processing (Computer science)--Congresses.

LC Class. No.: QA76.9.N38

Dewey Class. No.: 006.35

LDR 07135nmm a2200361 a 4500
001 2305525
003 DE-He213
005 20221122142252.0
006 m d
007 cr nn 008maaau
008 230409s2022 sz s 0 eng d
020 $a 9783031209802 $q (electronic bk.)
020 $a 9783031209796 $q (paper)
024 7 $a 10.1007/978-3-031-20980-2 $2 doi
035 $a 978-3-031-20980-2
040 $a GP $c GP
041 0 $a eng
050 4 $a QA76.9.N38
072 7 $a UYQ $2 bicssc
072 7 $a COM004000 $2 bisacsh
072 7 $a UYQ $2 thema
082 0 4 $a 006.35 $2 23
090 $a QA76.9.N38 $b I61 2022
111 2 $a International Conference Speech and Computer $n (24th : $d 2022 : $c Gurgaon, India) $3 3608671
245 1 0 $a Speech and computer $h [electronic resource] : $b 24th International Conference, SPECOM 2022, Gurugram, India, November 14-16, 2022 : proceedings / $c edited by S. R. Mahadeva Prasanna ... [et al.].
246 3 $a SPECOM 2022
260 $a Cham : $b Springer International Publishing : $b Imprint: Springer, $c 2022.
300 $a xvi, 720 p. : $b ill. (chiefly color), digital ; $c 24 cm.
490 1 $a Lecture notes in computer science, $x 0302-9743 ; $v 13721
490 1 $a Lecture notes in artificial intelligence
505 0 $a Thematic Diversity of Everyday Russian Discourse: a Case Study Based on the ORD corpus -- Neural Embedding Extractors for Text-Independent Speaker Verification -- Deep Speaker Embeddings based Online Diarization -- Overlapped Speech Detection Using AM-FM based Time-Frequency Representations -- Significance of Dimensionality Reduction in CNN-based Vowel Classification from Imagined Speech using Electroencephalogram Signals -- Study of Speech Recognition System Based on Transformer and Connectionist Temporal Classification Models for Low Resource Language -- An Initial Study on Birdsong Re-synthesis using Neural Vocoders -- Speech Music Overlap Detection using Spectral Peak Evolutions -- Influence of Accented Speech in Automatic Speech Recognition: A Case Study on Assamese L1 Speakers Speaking Code Switched Hindi-English -- ClusterVote: Automatic Summarization Dataset Construction with Document Clusters -- Comparing Unsupervised Detection Algorithms for Audio Adversarial Examples -- Celtic English Continuum in Pitch Patterns of Spontaneous Talk: Evidence of Long-Term Contacts -- Coherence Based Automatic Essay Scoring Using Sentence Embedding and Recurrent Neural Networks -- Analysis of Automatic Evaluation Metric on Low-Resourced Language: BERTScore Vs BLEU Score -- DyCoDa: A Multi-Modal Data Collection of Multi-User Remote Survival Game Recordings -- On the Use of Ensemble X-Vector Embeddings for Improved Sleepiness Detection -- Multiresolution Decomposition Analysis via Wavelet Transforms for Audio Deepfake Detection -- Automatic Rhythm and Speech Rate Analysis of Mising Spontaneous Speech -- An Electroglottographic Method for Assessing the Emotional State of the Speaker -- Significance of Distance on Pop Noise for Voice Liveness Detection -- CRIM's Speech Recognition System for OpenASR21 Evaluation with Conformer and Voice Activity Detector Embeddings -- Joint Changes in First and Second Formants of /a/, /i/, /u/ Vowels in Babble Noise - a New Statistical Approach -- Comparing NLP Solutions for the Disambiguation of French Heterophonic Homographs for End-to-End TTS Systems -- Detection of Speech Related Disorders by Pre-Trained Embedding Models Extracted Biomarkers -- Multi-Label Dysfluency Classification -- Harnessing Uncertainty - Multi-Label Dysfluency Classification with Uncertain Labels -- Continuous Wavelet Transform for Severity-Level Classification of Dysarthria -- Significance of Energy Features for Severity Classification of Dysarthria -- An Analytic Study on Clustering-based Pseudo-Labels for Self-Supervised Deep Speaker Verification -- Investigation of Transfer Learning for End-to-End Russian Speech Recognition -- Prosodic Features of Verbal Irony in Russian and French: Universal vs. Language-Specific -- Categorization of Threatening Speech Acts -- Assessment of Speech Quality During Speech Rehabilitation Based on the Solution of the Classification Problem -- Multi-level Fusion of Fisher Vector Encoded BERT and wav2vec 2.0 Embeddings for Native Language Identification -- Fake Speech Detection using OpenSMILE Features -- Nonverbal Constituents of Argumentative Discourse: Gesture and Prosody Interaction -- Classifying Mahout and Social Interactions of Asian Elephants based on Trumpet Calls -- Recognition of the Emotional State of Children with Down Syndrome by Video, Audio and Text Modalities: Human and Automatic -- Fake Speech Detection using Modulation Spectrogram -- Self-Configuring Genetic Programming Feature Generation in Affect Recognition Tasks -- A Multi-Modal Approach to Mining Intent from Code-Mixed Hindi-English Calls in the Hyperlocal-Delivery Domain -- Importance of Supra-Segmental Information and Self-Supervised Framework for Spoken Language Diarization Task -- Low-resource Emotional Speech Synthesis: Transfer Learning, Data requirements and Adversarial Training -- Fuzzy Classifier For Speech Assessment in Speech Rehabilitation -- Analysis-by-Synthesis Modeling of Bengali Intonation -- Neural Network Based Curve Fitting to Enhance the Intelligibility of Dysarthric Speech -- Retrieval-based Dialogue Agents -- Forensic Identification of Foreign-Language Speakers by the Method of Structural-Melodic Analysis of Phonograms -- Logistics Translator. Concept Vision on Future Interlanguage Computer Assisted Translation -- Analysis of Time-Averaged Feature Extraction Techniques on Infant Cry Classification -- Should We Believe Our Eyes or Our Ears? Processing Incongruent Audiovisual Stimuli by Russian Listeners -- Emotional Speech Recognition Based on Lip-Reading -- Exploring The Use of Machine Learning for Resume Recommendations -- The Role of Pause in Interaction: A Case of Polylogue -- Dictionary with the Evaluation of Positivity/Negativity Degree of the Russian Words -- Effects of Depth of Field on Focus using a Virtual Reality Escape Room -- Dynamics of Frequency Characteristics of Visually Evoked Potentials of Electroencephalography During the Work with Brain-Computer Interfaces -- Device Robust Acoustic Scene Classification using Adaptive Noise Reduction and Convolutional Recurrent Attention Neural Network -- Comparison of Word Embeddings of Unaligned Audio and Text Data Using Persistent Homology -- Low-Cost Training of Speech Recognition System for Hindi ASR Challenge 2022.
520 $a This book constitutes the proceedings of the 24th International Conference on Speech and Computer, SPECOM 2022, held as a hybrid event in Gurugram, India, in November 2022. The 51 full and 9 short papers presented in this volume were carefully reviewed and selected from 99 submissions. The papers present current research in the area of computer speech processing including audio signal processing, automatic speech recognition, speaker recognition, computational paralinguistics, speech synthesis, sign language and multimodal processing, and speech and language resources.
650 0 $a Natural language processing (Computer science) $v Congresses. $3 752585
650 0 $a Automatic speech recognition $v Congresses. $3 840482
650 0 $a Speech processing systems $x Congresses. $3 678615
650 0 $a Human-computer interaction $x Congresses. $3 705966
650 0 $a Linguistics $v Congresses. $3 792572
700 1 $a Prasanna, S. R. Mahadeva. $3 899643
710 2 $a SpringerLink (Online service) $3 836513
773 0 $t Springer Nature eBook
830 0 $a Lecture notes in computer science ; $v 13721. $3 3608672
830 0 $a Lecture notes in artificial intelligence. $3 3382562
856 4 0 $u https://doi.org/10.1007/978-3-031-20980-2
950 $a Computer Science (SpringerNature-11645)