Essays on the Sociological Analysis of Segregation and Natural Language.
Record type: Bibliographic record, electronic resource : Monograph/item
Title/Author: Essays on the Sociological Analysis of Segregation and Natural Language.
Author: Nanni, Antonio.
Pagination: 1 online resource (281 pages)
Notes: Source: Dissertations Abstracts International, Volume: 83-12, Section: B.
Contained By: Dissertations Abstracts International 83-12B.
Subject: Sociology.
Electronic resource: http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=29169938 (click for full text, PQDT)
ISBN: 9798438794790
Thesis (Ph.D.)--Northwestern University, 2022.
Includes bibliographical references
This dissertation contributes to the theory of segregation and to the methodologies for measuring it. The first two chapters focus on the traditional problem of quantifying segregation in survey data through segregation indices. Segregation indices describe the segregation of an environment with one number, usually from 0 to 1. The last chapter focuses on a new form of data: unstructured textual data. It analyzes the issue of extracting stereotypical cultural schemas from this kind of data using the increasingly popular word embedding models.

In the first chapter, we show that segregation indices calculated from samples are biased and unreliable, especially in small samples. Often, researchers use segregation indices on samples to estimate the segregation in a population. Therefore, statistical inference on segregation indices is necessary, but methods to conduct this kind of inference are scarcely available and not generally used. To obviate the problem, the chapter formulates two new general techniques based on non-parametric Bayesian models. The new techniques are applicable to any segregation index or function of segregation indices. To demonstrate their capability, the chapter tests the Bayesian techniques on the D and Theil indices, and on the decomposition of the Theil index. Extensive Monte Carlo simulations compare the performance of the new Bayesian techniques with the current standard practice and with the best currently available alternative, a bootstrap-based estimator. In all of the simulations, the new techniques provide more reliable inferences than previously achieved. In particular, the Bayesian techniques appear remarkably more accurate on small samples and in the production of confidence intervals. We recommend using the new Bayesian techniques to conduct inference, especially in smaller samples.

The second chapter analyzes the issue of comparing segregation indices. Often, researchers use segregation indices to compare segregation in different environments. However, it is very difficult to interpret the differences in segregation indices between two environments, since traditional indices mix different phenomena. The chapter formalizes the problem of interpreting change in segregation and builds a new family of indices that is interpretable from this perspective. One of its members, Q, is both interpretable and strongly decomposable, as is the Theil index. To formulate Q, the chapter also provides new results about margin-free indices (Charles and Grusky, 1995). It formulates the only way to build margin-free indices and provides a new solution to the zero problem afflicting these indices. As a result, the chapter also formulates the index Q*, which is the first strongly decomposable margin-free index.

The third chapter analyzes the use of word embedding models in the social sciences. Word embedding models represent each word from a textual corpus as a vector in a multi-dimensional space. They are increasingly popular in the social sciences for their ability to capture cultural schemas from readily available textual corpora. Sociologists have used word embedding models to study a variety of issues, from the association of obesity with gender to the evolution of the concept of social class. A growing literature in computer science and linguistics examines how words become vectors, but fewer works analyze how to extract meaning from such vectors in order to draw social scientific conclusions. The chapter focuses on the theoretical and methodological assumptions governing the latter process. It shows that previous social scientific research relies on a simple model of meaning in word vectors. Subsequently, it formulates a more general model linking meaning and vectors: the "simple algebra of meaning". The simple algebra of meaning subsumes previous methodologies and paves the way for methodological innovation in the social scientific use of word embedding models.

Finally, the chapter draws upon the new model to expand the current uses of word embedding models. It shows how to (1) accommodate non-binary oppositions, (2) analyze entire documents (as opposed to single words), (3) consider more than one concept at the same time, and (4) decompose the meaning of documents into a function of the meaning of their words. As an example, the chapter tests the new methodologies on a corpus of 30,228 abstracts about climate change and estimates the Lovecraftian aura of words from publicly available word embeddings.
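
For readers unfamiliar with segregation indices, the following is a minimal illustrative sketch, not the dissertation's Bayesian method. It assumes the D index named in the abstract is the standard two-group dissimilarity index, and it simulates samples from a population with no segregation at all to show how the plain sample estimate drifts upward when units are small. All function names and parameter values are hypothetical and chosen only for illustration.

```python
import random


def dissimilarity_index(counts):
    """Two-group dissimilarity index D for a list of (group_a, group_b) counts per unit.

    Returns a value between 0 (identical distributions) and 1 (complete separation).
    """
    total_a = sum(a for a, _ in counts)
    total_b = sum(b for _, b in counts)
    if total_a == 0 or total_b == 0:
        raise ValueError("both groups need at least one member")
    return 0.5 * sum(abs(a / total_a - b / total_b) for a, b in counts)


def sample_counts(n_units, n_per_unit, share_a, rng):
    """Sample units from an unsegregated population (same expected share of A everywhere)."""
    counts = []
    for _ in range(n_units):
        a = sum(1 for _ in range(n_per_unit) if rng.random() < share_a)
        counts.append((a, n_per_unit - a))
    return counts


def mean_sampled_d(n_units, n_per_unit, share_a=0.5, reps=100, seed=0):
    """Average sample D over repeated draws; the population value is 0 by construction."""
    rng = random.Random(seed)
    values = [
        dissimilarity_index(sample_counts(n_units, n_per_unit, share_a, rng))
        for _ in range(reps)
    ]
    return sum(values) / len(values)


if __name__ == "__main__":
    print("complete separation:", dissimilarity_index([(10, 0), (0, 10)]))
    print("no separation:", dissimilarity_index([(5, 5), (5, 5)]))
    print("mean sample D, 5 people per unit:", mean_sampled_d(20, 5))
    print("mean sample D, 200 people per unit:", mean_sampled_d(20, 200))
```

In this sketch the population is unsegregated, so any positive average returned by `mean_sampled_d` comes from sampling noise alone; the average is noticeably larger with 5 people per unit than with 200.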
Electronic reproduction. Ann Arbor, Mich. : ProQuest, 2023
Mode of access: World Wide Web
ISBN: 9798438794790
Subjects--Topical Terms: 516174 Sociology.
Subjects--Index Terms: Segregation
Index Terms--Genre/Form: 542853 Electronic books.
LDR 05810nmm a2200373K 4500
001 2357556
005 20230725053513.5
006 m o d
007 cr mn ---uuuuu
008 241011s2022 xx obm 000 0 eng d
020 $a 9798438794790
035 $a (MiAaPQ)AAI29169938
035 $a AAI29169938
040 $a MiAaPQ $b eng $c MiAaPQ $d NTU
100 1 $a Nanni, Antonio. $3 672023
245 1 0 $a Essays on the Sociological Analysis of Segregation and Natural Language.
264 0 $c 2022
300 $a 1 online resource (281 pages)
336 $a text $b txt $2 rdacontent
337 $a computer $b c $2 rdamedia
338 $a online resource $b cr $2 rdacarrier
500 $a Source: Dissertations Abstracts International, Volume: 83-12, Section: B.
500 $a Advisor: Quillian, Lincoln.
502 $a Thesis (Ph.D.)--Northwestern University, 2022.
504 $a Includes bibliographical references
533 $a Electronic reproduction. $b Ann Arbor, Mich. : $c ProQuest, $d 2023
538 $a Mode of access: World Wide Web
650 4 $a Sociology. $3 516174
650 4 $a Statistics. $3 517247
653 $a Segregation
653 $a Segregation index
653 $a Statistical inference
653 $a Word embedding
655 7 $a Electronic books. $2 lcsh $3 542853
690 $a 0626
690 $a 0463
710 2 $a ProQuest Information and Learning Co. $3 783688
710 2 $a Northwestern University. $b Sociology. $3 1020890
773 0 $t Dissertations Abstracts International $g 83-12B.
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=29169938 $z click for full text (PQDT)
Holdings (1 item):
Barcode: W9479912
Location: Electronic resources
Circulation category: 11.線上閱覽_V (online viewing)
Material type: E-book
Call number: EB
Use type: Normal
Loan status: On shelf
Hold status: 0