National Dong Hwa University Library


Using Word and Phrase Abbreviation Patterns to Extract Age From Twitter Microtexts.

Record Type:	Language materials, printed : Monograph/item
Title/Author:	Using Word and Phrase Abbreviation Patterns to Extract Age From Twitter Microtexts.
Author:	Moseley, Nathaniel.
Description:	66 p.
Notes:	Source: Masters Abstracts International, Volume: 51-06.
Contained By:	Masters Abstracts International 51-06(E).
Subject:	Computer Science.
Online resource:	http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=1538881
ISBN:	9781303130625

Thesis (M.S.)--Rochester Institute of Technology, 2013.

The wealth of texts available publicly online for analysis is ever increasing. Much work in computational linguistics focuses on syntactic, contextual, morphological and phonetic analysis on written documents, vocal recordings, or texts on the internet. Twitter messages present a unique challenge for computational linguistic analysis due to their constrained size. The constraint of 140 characters often prompts users to abbreviate words and phrases. Additionally, as an informal writing medium, messages are not expected to adhere to grammatically or orthographically standard English. As such, Twitter messages are noisy and do not necessarily conform to standard writing conventions of linguistic corpora, often requiring special pre-processing before advanced analysis can be done.

ISBN: 9781303130625

Subjects--Topical Terms: 626642 Computer Science.

LDR:03902nam a2200349 4500
001 1966767
005 20141112075106.5
008 150210s2013 ||||||||||||||||| ||eng d
020 $a 9781303130625
035 $a (MiAaPQ)AAI1538881
035 $a AAI1538881
040 $a MiAaPQ $c MiAaPQ
100 1 $a Moseley, Nathaniel. $3 2103634
245 1 0 $a Using Word and Phrase Abbreviation Patterns to Extract Age From Twitter Microtexts.
300 $a 66 p.
500 $a Source: Masters Abstracts International, Volume: 51-06.
500 $a Includes supplementary digital materials.
500 $a Adviser: Manjeet Rege.
502 $a Thesis (M.S.)--Rochester Institute of Technology, 2013.
520 $a The wealth of texts available publicly online for analysis is ever increasing. Much work in computational linguistics focuses on syntactic, contextual, morphological and phonetic analysis on written documents, vocal recordings, or texts on the internet. Twitter messages present a unique challenge for computational linguistic analysis due to their constrained size. The constraint of 140 characters often prompts users to abbreviate words and phrases. Additionally, as an informal writing medium, messages are not expected to adhere to grammatically or orthographically standard English. As such, Twitter messages are noisy and do not necessarily conform to standard writing conventions of linguistic corpora, often requiring special pre-processing before advanced analysis can be done.
520 $a In the area of computational linguistics, there is an interest in determining latent attributes of an author. Attributes such as author gender can be determined with some amount of success from many sources, using various methods, such as analysis of shallow linguistic patterns or topic. Author age is more difficult to determine, but previous research has been somewhat successful at classifying age as a binary (e.g. over or under 30), ternary, or even as a continuous variable using various techniques.
520 $a Twitter messages present a difficult problem for latent user attribute analysis, due to the pre-processing necessary for many computational linguistics analysis tasks. An added logistical challenge is that very few latent attributes are explicitly defined by users on Twitter. Twitter messages are a part of an enormous data set, but the data set must be independently annotated for latent writer attributes not defined through the Twitter API before any classification on such attributes can be done. The actual classification problem is another particular challenge due to restrictions on tweet length.
520 $a Previous work has shown that word and phrase abbreviation patterns used on Twitter can be indicative of some latent user attributes, such as geographic region or the Twitter client (iPhone, Android, Twitter website, etc.) used to make posts. Language change has generally been posited as being driven by women. This study explores whether there are age-related patterns, or changes in those patterns over time, evident in Twitter posts from a variety of English authors.
520 $a This work presents a growable data set annotated by Twitter users themselves for age and other useful attributes. The study also presents an extension of prior work on Twitter abbreviation patterns which shows that word and phrase abbreviation patterns can be used toward determining user age. Notable results include classification accuracy of up to 83%, which was 63% above the relative majority-class baseline (ZeroR in Weka) when classifying user ages into 6 equally sized age bins using a multilayer perceptron network classifier.
590 $a School code: 0465.
650 4 $a Computer Science. $3 626642
650 4 $a Language, Linguistics. $3 1018079
650 4 $a Multimedia Communications. $3 1057801
690 $a 0984
690 $a 0290
690 $a 0558
710 2 $a Rochester Institute of Technology. $b Computer Science. $3 1044045
773 0 $t Masters Abstracts International $g 51-06(E).
790 $a 0465
791 $a M.S.
792 $a 2013
793 $a English
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=1538881
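
The abstract reports accuracy relative to a majority-class baseline (ZeroR in Weka). As a rough illustration of what such a baseline measures, the sketch below reimplements the idea in plain Python. It is not the thesis's code and does not use Weka. The toy labels, the bin sizes, and the function names `zero_r_fit` and `accuracy` are hypothetical and chosen only for this example.

```python
from collections import Counter


def zero_r_fit(labels):
    """Return the most frequent label; this is all a ZeroR-style baseline learns."""
    return Counter(labels).most_common(1)[0][0]


def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


# Hypothetical toy data: 6 age bins (0..5) with 10 users in each bin.
labels = [bin_id for bin_id in range(6) for _ in range(10)]

# The baseline ignores all features and always predicts the majority bin.
majority = zero_r_fit(labels)
baseline = accuracy([majority] * len(labels), labels)

print(f"ZeroR baseline accuracy: {baseline:.3f}")
```

With perfectly balanced bins, as in this toy data, always predicting one bin is right one time in six, so any classifier has to beat roughly 17% accuracy to improve on the baseline. The actual baseline in the thesis depends on its own data and is not reproduced here.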