Hybrid models for Chinese unknown word resolution.
Record type: Bibliographic record - language material, printed : Monograph/item
Title proper/Author: Hybrid models for Chinese unknown word resolution.
Author: Lu, Xiaofei.
Extent: 171 p.
Notes: Adviser: W. Detmar Meurers.
Contained by: Dissertation Abstracts International 67-07A.
Subject: Language, Linguistics.
Electronic resource: http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3226424
ISBN: 9780542781551
Thesis (Ph.D.)--The Ohio State University, 2006.
Word segmentation, part-of-speech (POS) tagging, and sense tagging are important steps in various Chinese natural language processing (CNLP) systems. Unknown words, i.e., words that are not in the dictionary or training data used in a CNLP system, constitute a major challenge for each of these steps. This dissertation is concerned with developing hybrid models that effectively combine statistical, knowledge-based, and machine learning approaches for Chinese unknown word resolution, including the identification, part-of-speech (POS) tagging, and sense tagging of Chinese unknown words. What makes Chinese unknown word resolution hard is the limited information available for predicting the properties of unknown words, and for this reason it is crucial to make optimal use of information that is available. To this end, this research explores two central ideas and aims to achieve two major goals.
ISBN: 9780542781551
Subjects--Topical Terms: Language, Linguistics. (1018079)
LDR  03513nam 2200289 a 45
001  967974
005  20110915
008  110915s2006 eng d
020  $a 9780542781551
035  $a (UMI)AAI3226424
035  $a AAI3226424
040  $a UMI $c UMI
100 1  $a Lu, Xiaofei. $3 1291843
245 1 0  $a Hybrid models for Chinese unknown word resolution.
300  $a 171 p.
500  $a Adviser: W. Detmar Meurers.
500  $a Source: Dissertation Abstracts International, Volume: 67-07, Section: A, page: 2554.
502  $a Thesis (Ph.D.)--The Ohio State University, 2006.
520  $a Word segmentation, part-of-speech (POS) tagging, and sense tagging are important steps in various Chinese natural language processing (CNLP) systems. Unknown words, i.e., words that are not in the dictionary or training data used in a CNLP system, constitute a major challenge for each of these steps. This dissertation is concerned with developing hybrid models that effectively combine statistical, knowledge-based, and machine learning approaches for Chinese unknown word resolution, including the identification, part-of-speech (POS) tagging, and sense tagging of Chinese unknown words. What makes Chinese unknown word resolution hard is the limited information available for predicting the properties of unknown words, and for this reason it is crucial to make optimal use of information that is available. To this end, this research explores two central ideas and aims to achieve two major goals.
520  $a First, the morphological, syntactic, and semantic information of the component characters or morphemes of an unknown word provides useful insights into its structural and semantic properties. The first goal of this work is to develop novel algorithms that capture such insights. To integrate unknown word identification with word segmentation, the notion of character-based tagging is adopted to model the tendency of individual characters to combine with adjacent characters to form words in different contexts. To predict the POS categories of unknown words, morphological rules that encode knowledge about the relationship between the POS categories of unknown words and those of their component morphemes are developed. Finally, to classify unknown words into appropriate semantic categories in a Chinese thesaurus, rules that capture the regularities in the relationship between the semantic categories of unknown words and those of their component morphemes are developed; information-theoretical models are used to compute the associations between individual morphemes and semantic categories for the same purpose.
520  $a Second, in addition to information about the component characters of an unknown word, information about its type, length, and internal structure as well as the context in which it occurs provides useful insights into its properties, too. Existing approaches to Chinese unknown word resolution tend to use different, but single sources of information and are often effective in handling different subsets of unknown words. The second goal of this research is to identify the relative strengths of novel and existing models and to combine them to achieve optimal use of information and better performance for the task.
590  $a School code: 0168.
650 4  $a Language, Linguistics. $3 1018079
690  $a 0290
710 2 0  $a The Ohio State University. $3 718944
773 0  $t Dissertation Abstracts International $g 67-07A.
790  $a 0168
790 1 0  $a Meurers, W. Detmar, $e advisor
791  $a Ph.D.
792  $a 2006
856 4 0  $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3226424
Holdings (1 item):
Barcode: W9126628
Location: Electronic resources
Circulation category: 11.線上閱覽_V (online viewing)
Material type: E-book
Call number: EB W9126628
Use type: Normal
Loan status: On shelf
Holds: 0