東華大學圖書館

Text, speech, and dialogue = 27th International Conference, TSD 2024, Brno, Czech Republic, September 9-13, 2024 : proceedings. Part II

Record Type:	Electronic resources : Monograph/item
Title/Author:	Text, speech, and dialogue / edited by Elmar Nöth, Aleš Horák, Petr Sojka.
Remainder of title:	27th International Conference, TSD 2024, Brno, Czech Republic, September 9-13, 2024 : proceedings.
Variant title:	TSD 2024
Other author:	Nöth, Elmar.
Corporate name:	TSD (Conference)
Published:	Cham : Springer Nature Switzerland : Imprint: Springer, 2024.
Description:	xvii, 326 p. : ill. (some col.), digital ; 24 cm.
Contents:	Speech. -- Retrieval Augmented Spoken Language Generation for Transport Domain. -- Adapting Audiovisual Speech Synthesis to Estonian. -- Dysphonia Diagnosis Using Self-Supervised Speech Models in Mono- and Cross-Lingual Settings. -- Sentences vs Phrases in Neural Speech Synthesis. -- Zero-Shot vs. Few-Shot Multi-Speaker TTS Using Pre-trained Czech SpeechT5 Model. -- Deep Speaker Embeddings for Speaker Verification of Children. -- Improved Alignment for Score Combination of RNN-T and CTC Decoder for Online Decoding. -- Attention to Phonetics: A Visually Informed Explanation of Speech Transformers. -- Effects of Training Strategies and the Amount of Speech Data on the Quality of Speech Synthesis. -- Stream-Based Active Learning for Speech Emotion Recognition via Hybrid Data Selection and Continuous Learning. -- Data Alignment and Duration Modelling in VITS. -- Multiword Expressions Resources for Italian: Presenting a Manually Annotated Spoken Corpus. -- Generating High-Quality F0 Embeddings Using the Vector-Quantized Variational Autoencoder. -- Anonymizing Dysarthric Speech: Investigating the Effects of Voice Conversion on Pathological Information Preservation. -- X-vector-based Speaker Diarization Using Bi-LSTM and Interim Voting-driven Post-processing. -- A Paradigm for Interpreting Metrics and Measuring Error Severity in Automatic Speech Recognition. -- Enhancing Speech Emotion Recognition Using Transfer Learning From Speaker Embeddings. -- Dialogue. -- Investigating Low-Cost LLM Annotation for Spoken Dialogue Understanding Datasets. -- PiCo-VITS: Leveraging Pitch Contours for Fine-grained Emotional Speech Synthesis. -- Improving and Understanding Clarifying Question Generation in Conversational Search. -- Explainable Multimodal Fusion for Dementia Detection From Text and Speech. -- Robust Classification of Parkinson's Speech: an Approximation to a Scenario With Non-controlled Acoustic Conditions. -- Leveraging Conceptual Similarities to Enhance Modeling of Factors Affecting Adolescents' Well-Being. -- Joint-Average Mean and Variance Feature Matching (JAMVFM) Semi-supervised GAN with Additional-Objective Training Function for Intent Detection. -- Capturing Task-Related Information for Text-Based Grasp Classification Using Fine-Tuned Embeddings. -- StepDP: A Step Towards Expressive and Pervasive Dialogue Platforms. -- Automatic Classification of Parkinson's Disease Using Wav2vec Embeddings at Phoneme, Syllable, and Word Levels.
Contained By:	Springer Nature eBook
Subject:	Natural language processing (Computer science) -- Congresses.
Online resource:	https://doi.org/10.1007/978-3-031-70566-3
ISBN:	9783031705663

Text, speech, and dialogue [electronic resource] : 27th International Conference, TSD 2024, Brno, Czech Republic, September 9-13, 2024 : proceedings. Part II / edited by Elmar Nöth, Aleš Horák, Petr Sojka. - Cham : Springer Nature Switzerland : Imprint: Springer, 2024. - xvii, 326 p. : ill. (some col.), digital ; 24 cm. - (Lecture notes in computer science, 0302-9743 ; 15049).

The two-volume set LNAI 15048 and 15049 constitutes the refereed proceedings of the 27th International Conference on Text, Speech, and Dialogue, TSD 2024, held in Brno, Czech Republic, during September 9-13, 2024. The 50 revised full papers presented in these proceedings were carefully reviewed and selected from 103 submissions. The papers are organized in the following topical sections: Part I: Text; Part II: Speech, Dialogue.

Standard No.: 10.1007/978-3-031-70566-3 (doi)

Subjects--Topical Terms: Natural language processing (Computer science)--Congresses.

LC Class. No.: QA76.9.N38 / T73 2024

Dewey Class. No.: 006.35

LDR 04204nmm a22003735a 4500
001 2388029
003 DE-He213
005 20240901130253.0
006 m d
007 cr nn 008maaau
008 250916s2024 sz s 0 eng d
020 $a 9783031705663 $q (electronic bk.)
020 $a 9783031705656 $q (paper)
024 7 $a 10.1007/978-3-031-70566-3 $2 doi
035 $a 978-3-031-70566-3
040 $a GP $c GP
041 0 $a eng
050 4 $a QA76.9.N38 $b T73 2024
072 7 $a UY $2 bicssc
072 7 $a COM000000 $2 bisacsh
072 7 $a UY $2 thema
082 0 4 $a 006.35 $2 23
090 $a QA76.9.N38 $b T881 2024
111 2 $a TSD (Conference) $n (27th : $d 2024 : $c Brno, Czech Republic) $3 3752849
245 1 0 $a Text, speech, and dialogue $h [electronic resource] : $b 27th International Conference, TSD 2024, Brno, Czech Republic, September 9-13, 2024 : proceedings. $n Part II / $c edited by Elmar Nöth, Aleš Horák, Petr Sojka.
246 3 $a TSD 2024
260 $a Cham : $b Springer Nature Switzerland : $b Imprint: Springer, $c 2024.
300 $a xvii, 326 p. : $b ill. (some col.), digital ; $c 24 cm.
347 $a text file $b PDF $2 rda
490 1 $a Lecture notes in computer science, $x 0302-9743 ; $v 15049
490 1 $a Lecture notes in artificial intelligence
505 0 $a Speech. -- Retrieval Augmented Spoken Language Generation for Transport Domain. -- Adapting Audiovisual Speech Synthesis to Estonian. -- Dysphonia Diagnosis Using Self-Supervised Speech Models in Mono- and Cross-Lingual Settings. -- Sentences vs Phrases in Neural Speech Synthesis. -- Zero-Shot vs. Few-Shot Multi-Speaker TTS Using Pre-trained Czech SpeechT5 Model. -- Deep Speaker Embeddings for Speaker Verification of Children. -- Improved Alignment for Score Combination of RNN-T and CTC Decoder for Online Decoding. -- Attention to Phonetics: A Visually Informed Explanation of Speech Transformers. -- Effects of Training Strategies and the Amount of Speech Data on the Quality of Speech Synthesis. -- Stream-Based Active Learning for Speech Emotion Recognition via Hybrid Data Selection and Continuous Learning. -- Data Alignment and Duration Modelling in VITS. -- Multiword Expressions Resources for Italian: Presenting a Manually Annotated Spoken Corpus. -- Generating High-Quality F0 Embeddings Using the Vector-Quantized Variational Autoencoder. -- Anonymizing Dysarthric Speech: Investigating the Effects of Voice Conversion on Pathological Information Preservation. -- X-vector-based Speaker Diarization Using Bi-LSTM and Interim Voting-driven Post-processing. -- A Paradigm for Interpreting Metrics and Measuring Error Severity in Automatic Speech Recognition. -- Enhancing Speech Emotion Recognition Using Transfer Learning From Speaker Embeddings. -- Dialogue. -- Investigating Low-Cost LLM Annotation for Spoken Dialogue Understanding Datasets. -- PiCo-VITS: Leveraging Pitch Contours for Fine-grained Emotional Speech Synthesis. -- Improving and Understanding Clarifying Question Generation in Conversational Search. -- Explainable Multimodal Fusion for Dementia Detection From Text and Speech. -- Robust Classification of Parkinson's Speech: an Approximation to a Scenario With Non-controlled Acoustic Conditions. 
-- Leveraging Conceptual Similarities to Enhance Modeling of Factors Affecting Adolescents' Well-Being. -- Joint-Average Mean and Variance Feature Matching (JAMVFM) Semi-supervised GAN with Additional-Objective Training Function for Intent Detection. -- Capturing Task-Related Information for Text-Based Grasp Classification Using Fine-Tuned Embeddings. -- StepDP: A Step Towards Expressive and Pervasive Dialogue Platforms. -- Automatic Classification of Parkinson's Disease Using Wav2vec Embeddings at Phoneme, Syllable, and Word Levels.
520 $a The two-volume set LNAI 15048 and 15049 constitutes the refereed proceedings of the 27th International Conference on Text, Speech, and Dialogue, TSD 2024, held in Brno, Czech Republic, during September 9-13, 2024. The 50 revised full papers presented in these proceedings were carefully reviewed and selected from 103 submissions. The papers are organized in the following topical sections: Part I: Text; Part II: Speech, Dialogue.
650 0 $a Natural language processing (Computer science) $v Congresses. $3 752585
650 0 $a Speech processing systems $x Congresses. $3 678615
650 0 $a Text processing (Computer science) $v Congresses. $3 880608
650 1 4 $a Computer Science. $3 626642
650 2 4 $a Artificial Intelligence. $3 769149
700 1 $a Nöth, Elmar. $3 3752850
700 1 $a Horák, Aleš. $3 3752851
700 1 $a Sojka, Petr. $3 923664
710 2 $a SpringerLink (Online service) $3 836513
773 0 $t Springer Nature eBook
830 0 $a Lecture notes in computer science ; $v 15049. $3 3752853
830 0 $a Lecture notes in artificial intelligence. $3 3382562
856 4 0 $u https://doi.org/10.1007/978-3-031-70566-3
950 $a Computer Science (SpringerNature-11645)