Investigating Prompt Difficulty in an Automatically Scored Speaking Performance Assessment.
Record type: Bibliographic - language material, print : Monograph/item
Title/Author: Investigating Prompt Difficulty in an Automatically Scored Speaking Performance Assessment.
Author: Cox, Troy L.
Pagination: 114 p.
Note: Source: Dissertation Abstracts International, Volume: 74-09(E), Section: A.
Contained by: Dissertation Abstracts International, 74-09A(E).
Subject: Education, Tests and Measurements.
Electronic resource: http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3562082
ISBN: 9781303094965
LDR    03454nam a2200313 4500
001    1964236
005    20141015113817.5
008    150210s2013 ||||||||||||||||| ||eng d
020    $a 9781303094965
035    $a (MiAaPQ)AAI3562082
035    $a AAI3562082
040    $a MiAaPQ $c MiAaPQ
100 1  $a Cox, Troy L. $3 2100649
245 10 $a Investigating Prompt Difficulty in an Automatically Scored Speaking Performance Assessment.
300    $a 114 p.
500    $a Source: Dissertation Abstracts International, Volume: 74-09(E), Section: A.
500    $a Adviser: Randall Spencer Davies.
502    $a Thesis (Ph.D.)--Brigham Young University, 2013.
520    $a Speaking assessments for second language learners have traditionally been expensive to administer because of the cost of rating the speech samples. To reduce the cost, many researchers are investigating the potential of using automatic speech recognition (ASR) as a means to score examinee responses to open-ended prompts. This study examined the potential of using ASR timing fluency features to predict speech ratings and the effect of prompt difficulty in that process. A speaking test with ten prompts representing five different intended difficulty levels was administered to 201 subjects. The speech samples obtained were then (a) rated by human raters holistically, (b) rated by human raters analytically at the item level, and (c) scored automatically using PRAAT to calculate ten different ASR timing fluency features. The ratings and scores of the speech samples were analyzed with Rasch measurement to evaluate the functionality of the scales and the separation reliability of the examinees, raters, and items.
520    $a There were three ASR timed fluency features that best predicted human speaking ratings: speech rate, mean syllables per run, and number of silent pauses. However, only 31% of the score variance was predicted by these features. The significance in this finding is that those fluency features alone likely provide insufficient information to predict human rated speaking ability accurately. Furthermore, neither the item difficulties calculated by the ASR nor those rated analytically by the human raters aligned with the intended item difficulty levels. The misalignment of the human raters with the intended difficulties led to a further analysis that found that it was problematic for raters to use a holistic scale at the item level. However, modifying the holistic scale to a scale that examined if the response to the prompt was at-level resulted in a significant correlation (r = .98, p < .01) between the item difficulties calculated analytically by the human raters and the intended difficulties. This result supports the hypothesis that item prompts are important when it comes to obtaining quality speech samples. As test developers seek to use ASR to score speaking assessments, caution is warranted to ensure that score differences are due to examinee ability and not the prompt composition of the test.
520    $a Keywords: Automatic Speech Recognition, second language oral proficiency, language testing and assessment, English as a second language tests, speech signal processing.
590    $a School code: 0022.
650  4 $a Education, Tests and Measurements. $3 1017589
650  4 $a Education, English as a Second Language. $3 1030294
650  4 $a Education, Technology of. $3 1018012
690    $a 0288
690    $a 0441
690    $a 0710
710 2  $a Brigham Young University. $b Instructional Psychology and Technology. $3 1682164
773 0  $t Dissertation Abstracts International $g 74-09A(E).
790    $a 0022
791    $a Ph.D.
792    $a 2013
793    $a English
856 40 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3562082
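The abstract above names the three ASR timing fluency features that best predicted human ratings: speech rate, mean syllables per run, and number of silent pauses. As a minimal illustrative sketch only (not the dissertation's actual PRAAT procedure), these can be computed from a list of syllable onset times, treating any inter-syllable gap at or above a chosen threshold as a silent pause that ends the current run; the threshold value here is an assumption.

```python
def fluency_features(syllable_times, total_duration, pause_threshold=0.25):
    """Compute three timing fluency features from sorted syllable onset
    times (seconds). A "run" is a stretch of syllables uninterrupted by
    a silent pause; a pause is any gap >= pause_threshold seconds.
    The 0.25 s threshold is a hypothetical choice for illustration."""
    if not syllable_times or total_duration <= 0:
        return {"speech_rate": 0.0,
                "mean_syllables_per_run": 0.0,
                "num_silent_pauses": 0}
    runs = [1]          # syllable counts per run; first syllable opens run 1
    pauses = 0
    for prev, cur in zip(syllable_times, syllable_times[1:]):
        if cur - prev >= pause_threshold:
            pauses += 1
            runs.append(1)      # a silent pause starts a new run
        else:
            runs[-1] += 1       # same run continues
    return {
        "speech_rate": len(syllable_times) / total_duration,  # syll/sec
        "mean_syllables_per_run": sum(runs) / len(runs),
        "num_silent_pauses": pauses,
    }

# Example: 6 syllables over 4 s with one 0.5 s gap between syllables 3 and 4
# -> speech rate 1.5 syll/s, two runs of 3 syllables, 1 silent pause
feats = fluency_features([0.0, 0.2, 0.4, 0.9, 1.1, 1.3], 4.0)
```

In a real pipeline these onset times would come from a forced aligner or from PRAAT's syllable-nuclei detection rather than being supplied by hand.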
Holdings (1 item):
Barcode: W9259235
Location: Electronic Resources
Circulation category: 11. Online Reading_V
Material type: E-book
Call number: EB
Use type: Normal
Loan status: On shelf
Attachments: 0