Using Word and Phrase Abbreviation Patterns to Extract Age From Twitter Microtexts.
Record type:
Bibliographic - language material, printed : Monograph/item
Title proper/Author:
Using Word and Phrase Abbreviation Patterns to Extract Age From Twitter Microtexts.
Author:
Moseley, Nathaniel.
Extent:
66 p.
Notes:
Source: Masters Abstracts International, Volume: 51-06.
Contained By:
Masters Abstracts International 51-06(E).
Subject:
Computer Science.
Electronic resource:
http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=1538881
ISBN:
9781303130625
Thesis (M.S.)--Rochester Institute of Technology, 2013.
The wealth of texts available publicly online for analysis is ever increasing. Much work in computational linguistics focuses on syntactic, contextual, morphological and phonetic analysis on written documents, vocal recordings, or texts on the internet. Twitter messages present a unique challenge for computational linguistic analysis due to their constrained size. The constraint of 140 characters often prompts users to abbreviate words and phrases. Additionally, as an informal writing medium, messages are not expected to adhere to grammatically or orthographically standard English. As such, Twitter messages are noisy and do not necessarily conform to standard writing conventions of linguistic corpora, often requiring special pre-processing before advanced analysis can be done.
LDR 03902nam a2200349 4500
001 1966767
005 20141112075106.5
008 150210s2013 ||||||||||||||||| ||eng d
020 $a 9781303130625
035 $a (MiAaPQ)AAI1538881
035 $a AAI1538881
040 $a MiAaPQ $c MiAaPQ
100 1 $a Moseley, Nathaniel. $3 2103634
245 1 0 $a Using Word and Phrase Abbreviation Patterns to Extract Age From Twitter Microtexts.
300 $a 66 p.
500 $a Source: Masters Abstracts International, Volume: 51-06.
500 $a Includes supplementary digital materials.
500 $a Adviser: Manjeet Rege.
502 $a Thesis (M.S.)--Rochester Institute of Technology, 2013.
520 $a The wealth of texts available publicly online for analysis is ever increasing. Much work in computational linguistics focuses on syntactic, contextual, morphological and phonetic analysis on written documents, vocal recordings, or texts on the internet. Twitter messages present a unique challenge for computational linguistic analysis due to their constrained size. The constraint of 140 characters often prompts users to abbreviate words and phrases. Additionally, as an informal writing medium, messages are not expected to adhere to grammatically or orthographically standard English. As such, Twitter messages are noisy and do not necessarily conform to standard writing conventions of linguistic corpora, often requiring special pre-processing before advanced analysis can be done.
520 $a In the area of computational linguistics, there is an interest in determining latent attributes of an author. Attributes such as author gender can be determined with some amount of success from many sources, using various methods, such as analysis of shallow linguistic patterns or topic. Author age is more difficult to determine, but previous research has been somewhat successful at classifying age as a binary (e.g. over or under 30), ternary, or even as a continuous variable using various techniques.
520 $a Twitter messages present a difficult problem for latent user attribute analysis, due to the pre-processing necessary for many computational linguistics analysis tasks. An added logistical challenge is that very few latent attributes are explicitly defined by users on Twitter. Twitter messages are a part of an enormous data set, but the data set must be independently annotated for latent writer attributes not defined through the Twitter API before any classification on such attributes can be done. The actual classification problem is another particular challenge due to restrictions on tweet length.
520 $a Previous work has shown that word and phrase abbreviation patterns used on Twitter can be indicative of some latent user attributes, such as geographic region or the Twitter client (iPhone, Android, Twitter website, etc.) used to make posts. Language change has generally been posited as being driven by women. This study explores if there are age-related patterns or change in those patterns over time evident in Twitter posts from a variety of English authors.
520 $a This work presents a growable data set annotated by Twitter users themselves for age and other useful attributes. The study also presents an extension of prior work on Twitter abbreviation patterns which shows that word and phrase abbreviation patterns can be used toward determining user age. Notable results include classification accuracy of up to 83%, which was 63% above relative majority class baseline (ZeroR in Weka) when classifying user ages into 6 equally sized age bins using a multilayer perceptron network classifier.
590 $a School code: 0465.
650 4 $a Computer Science. $3 626642
650 4 $a Language, Linguistics. $3 1018079
650 4 $a Multimedia Communications. $3 1057801
690 $a 0984
690 $a 0290
690 $a 0558
710 2 $a Rochester Institute of Technology. $b Computer Science. $3 1044045
773 0 $t Masters Abstracts International $g 51-06(E).
790 $a 0465
791 $a M.S.
792 $a 2013
793 $a English
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=1538881
Holdings:
Barcode:
W9261773
Location:
Electronic resources
Circulation category:
11. Online viewing_V
Material type:
E-book
Call number:
EB
Use type:
Normal
Loan status:
On shelf
Hold status:
0