Error detection and correction in annotated corpora.
Record Type: Electronic resources : Monograph/item
Title/Author: Error detection and correction in annotated corpora.
Author: Dickinson, Markus.
Description: 285 p.
Notes: Source: Dissertation Abstracts International, Volume: 66-06, Section: A, page: 2191.
Contained By: Dissertation Abstracts International 66-06A.
Subject: Language, Linguistics.
Online resource: http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3180126
ISBN: 0542203502
Thesis (Ph.D.)--The Ohio State University, 2005.
Building on work showing the harmfulness of annotation errors for both the training and evaluation of natural language processing technologies, this thesis develops a method for detecting and correcting errors in corpora with linguistic annotation. The so-called variation n-gram method relies on the recurrence of identical strings with varying annotation to find erroneous mark-up.
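The idea of "identical strings with varying annotation" can be illustrated with a minimal Python sketch for part-of-speech data. This is not the thesis's implementation: the function name `variation_ngrams`, the input layout (sentences as lists of `(token, tag)` pairs), and the toy corpus are assumptions made for illustration, and the sketch omits any heuristics the thesis may use to separate true errors from genuine ambiguity.

```python
from collections import defaultdict


def variation_ngrams(corpus, n):
    """Find n-grams of identical tokens whose tags vary across occurrences.

    corpus: list of sentences, each a list of (token, tag) pairs.
    Returns a dict mapping (token_ngram, position) to the set of tags
    seen at that position, keeping only entries with more than one tag.
    """
    seen = defaultdict(set)
    for sentence in corpus:
        for start in range(len(sentence) - n + 1):
            window = sentence[start:start + n]
            tokens = tuple(token for token, _ in window)
            # Record the tag observed at each position of this token string.
            for position, (_, tag) in enumerate(window):
                seen[(tokens, position)].add(tag)
    return {key: tags for key, tags in seen.items() if len(tags) > 1}


# Hypothetical toy corpus: "run" is tagged differently in the same context.
corpus = [
    [("the", "DT"), ("run", "NN"), ("ended", "VBD")],
    [("the", "DT"), ("run", "VB"), ("ended", "VBD")],
    [("they", "PRP"), ("run", "VBP"), ("fast", "RB")],
]

candidates = variation_ngrams(corpus, 3)
```

In this toy corpus the trigram "the run ended" recurs with two different tags on "run", so it is reported as a candidate for manual inspection. A longer shared context makes it more likely that the variation is an error rather than a true ambiguity.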
LDR :02563nmm 2200301 4500
001 1816409
005 20060717095836.5
008 130610s2005 eng d
020 $a 0542203502
035 $a (UnM)AAI3180126
035 $a AAI3180126
040 $a UnM $c UnM
100 1 $a Dickinson, Markus. $3 1630131
245 1 0 $a Error detection and correction in annotated corpora.
300 $a 285 p.
500 $a Source: Dissertation Abstracts International, Volume: 66-06, Section: A, page: 2191.
500 $a Adviser: Walt Detmar Meurers.
502 $a Thesis (Ph.D.)--The Ohio State University, 2005.
520 $a Building on work showing the harmfulness of annotation errors for both the training and evaluation of natural language processing technologies, this thesis develops a method for detecting and correcting errors in corpora with linguistic annotation. The so-called variation n-gram method relies on the recurrence of identical strings with varying annotation to find erroneous mark-up.
520 $a We show that the method is applicable for varying complexities of annotation. The method is most readily applied to positional annotation, such as part-of-speech annotation, but can be extended to structural annotation, both for tree structures---as with syntactic annotation---and for graph structures---as with syntactic annotation allowing discontinuous constituents, or crossing branches.
520 $a Furthermore, we demonstrate that the notion of variation for detecting errors is a powerful one, by searching for grammar rules in a treebank which have the same daughters but different mothers. We also show that such errors impact the effectiveness of a grammar induction algorithm and subsequent parsing.
520 $a After detecting errors in the different corpora, we turn to correcting such errors, through the use of more general classification techniques. Our results indicate that the particular classification algorithm is less important than understanding the nature of the errors and altering the classifiers to deal with these errors. With such alterations, we can automatically correct errors with 85% accuracy. By sorting the errors, we can relegate over 20% of them into an automatically correctable class and speed up the re-annotation process by effectively categorizing the others.
590 $a School code: 0168.
650 4 $a Language, Linguistics. $3 1018079
690 $a 0290
710 2 0 $a The Ohio State University. $3 718944
773 0 $t Dissertation Abstracts International $g 66-06A.
790 1 0 $a Meurers, Walt Detmar, $e advisor
790 $a 0168
791 $a Ph.D.
792 $a 2005
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3180126
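The abstract's search for "grammar rules in a treebank which have the same daughters but different mothers" can be sketched in a few lines of Python. This is an illustrative sketch only, not the thesis's code: the function name `same_daughters_different_mothers`, the representation of a rule as a `(mother, daughters)` pair, and the sample rules and category labels are all assumptions.

```python
from collections import defaultdict


def same_daughters_different_mothers(rules):
    """Group local-tree rules by daughter sequence and keep the conflicts.

    rules: iterable of (mother, daughters) pairs, where daughters is a
    sequence of category labels read off one local tree in a treebank.
    Returns a dict mapping each daughter tuple to its set of mothers,
    restricted to daughter tuples that occur under more than one mother.
    """
    mothers_by_daughters = defaultdict(set)
    for mother, daughters in rules:
        mothers_by_daughters[tuple(daughters)].add(mother)
    return {
        daughters: mothers
        for daughters, mothers in mothers_by_daughters.items()
        if len(mothers) > 1
    }


# Hypothetical rules, as if extracted from a treebank.
rules = [
    ("NP", ("DT", "NN")),
    ("NP", ("DT", "NN")),
    ("NX", ("DT", "NN")),
    ("VP", ("VBD", "NP")),
]

conflicts = same_daughters_different_mothers(rules)
```

Here the daughter sequence `DT NN` appears under both `NP` and `NX`, so it is flagged. Whether a flagged pair is an annotation error or a legitimate ambiguity still has to be decided by inspection.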
Items
Inventory Number: W9207272
Location Name: Electronic resources
Item Class: 11. Online viewing_V
Material type: E-book
Call number: EB
Usage Class: General use (Normal)
Loan Status: On shelf
No. of reservations: 0