Spectral Methods for Social Media Data Analysis.
Record type:
Bibliographic - Electronic resource : Monograph/item
Title/Author:
Spectral Methods for Social Media Data Analysis.
Author:
Chen, Fan.
Publisher:
Ann Arbor : ProQuest Dissertations & Theses, 2021
Physical description:
288 p.
Notes:
Source: Dissertations Abstracts International, Volume: 82-12, Section: B.
Contained By:
Dissertations Abstracts International 82-12B.
Subject:
Statistics.
Electronic resource:
http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=28540694
ISBN:
9798505542774
Spectral Methods for Social Media Data Analysis.
Chen, Fan.
Spectral Methods for Social Media Data Analysis.
- Ann Arbor : ProQuest Dissertations & Theses, 2021 - 288 p.
Source: Dissertations Abstracts International, Volume: 82-12, Section: B.
Thesis (Ph.D.)--The University of Wisconsin - Madison, 2021.
This item must not be sold to any third party vendors.
Online social media have come to record an ever-increasing share of human communication and social interaction, making available vast quantities of data. This enables studies of individuals and society at large while presenting numerous computational and statistical challenges throughout the process. This dissertation introduces a set of spectral methods for social media data analysis and provides their statistical justifications and theoretical performance guarantees. These methods are motivated by the following questions that emerged in social media data analysis:
Q1: How to obtain a targeted sample of accounts?
Q2: How to select the number of communities?
Q3: How to find the underlying community structure?
Q4: How to collect some topic-specific documents?
For Q1, we study personalized PageRank (PPR), a popular technique that samples a small community from a massive network. Under the degree-corrected stochastic block model, we provide a simple and interpretable form for the PPR vector, highlighting its biases towards high degree nodes outside of the target block. We examine a simple adjustment based on node degrees and establish the consistency results for PPR clustering.
For Q2, we introduce a notion of cross-validated eigenvalues. Under a large class of random graph models, we provide a simple estimation procedure, a central limit theorem that gives a p-value for the statistical significance of each sample eigenvector, and a proof of consistency for estimating the number of communities in a network.
For Q3, we propose a new basis for sparse principal component analysis which can be applied on a graph adjacency matrix to identify the community structure. We provide evidence showing that for the same level of sparsity, the proposed method is more stable and can explain more variance compared to alternative methods.
For Q4, we study a local word embedding technique that measures both the frequency and exclusivity of words to a targeted topic. Under the popular latent Dirichlet allocation, we provide the statistical consistency for this embedding.
Finally, we introduce the "murmuration" framework which integrates the statistical methods and tracks the public political opinion expressions on Twitter, demonstrating a new way of imagining and measuring opinions on social media.
ISBN: 9798505542774
Subjects--Topical Terms:
517247
Statistics.
Subjects--Index Terms:
Community detection
LDR 03521nmm a2200385 4500
001 2342211
005 20211209144711.5
008 241004s2021 ||||||||||||||||| ||eng d
020 $a 9798505542774
035 $a (MiAaPQ)AAI28540694
035 $a AAI28540694
040 $a MiAaPQ $c MiAaPQ
100 1 $a Chen, Fan. $3 3563885
245 1 0 $a Spectral Methods for Social Media Data Analysis.
260 1 $a Ann Arbor : $b ProQuest Dissertations & Theses, $c 2021
300 $a 288 p.
500 $a Source: Dissertations Abstracts International, Volume: 82-12, Section: B.
500 $a Advisor: Rohe, Karl; Keles, Sunduz.
502 $a Thesis (Ph.D.)--The University of Wisconsin - Madison, 2021.
506 $a This item must not be sold to any third party vendors.
520 $a Online social media have come to record an ever-increasing share of human communication and social interaction, making available vast quantities of data. This enables studies of individuals and society at large while presenting numerous computational and statistical challenges throughout the process. This dissertation introduces a set of spectral methods for social media data analysis and provides their statistical justifications and theoretical performance guarantees. These methods are motivated by the following questions that emerged in social media data analysis: Q1: How to obtain a targeted sample of accounts? Q2: How to select the number of communities? Q3: How to find the underlying community structure? Q4: How to collect some topic-specific documents? For Q1, we study personalized PageRank (PPR), a popular technique that samples a small community from a massive network. Under the degree-corrected stochastic block model, we provide a simple and interpretable form for the PPR vector, highlighting its biases towards high degree nodes outside of the target block. We examine a simple adjustment based on node degrees and establish the consistency results for PPR clustering. For Q2, we introduce a notion of cross-validated eigenvalues. Under a large class of random graph models, we provide a simple estimation procedure, a central limit theorem that gives a p-value for the statistical significance of each sample eigenvector, and a proof of consistency for estimating the number of communities in a network. For Q3, we propose a new basis for sparse principal component analysis which can be applied on a graph adjacency matrix to identify the community structure. We provide evidence showing that for the same level of sparsity, the proposed method is more stable and can explain more variance compared to alternative methods. For Q4, we study a local word embedding technique that measures both the frequency and exclusivity of words to a targeted topic. Under the popular latent Dirichlet allocation, we provide the statistical consistency for this embedding. Finally, we introduce the "murmuration" framework which integrates the statistical methods and tracks the public political opinion expressions on Twitter, demonstrating a new way of imagining and measuring opinions on social media.
590 $a School code: 0262.
650 4 $a Statistics. $3 517247
650 4 $a Mathematics. $3 515831
650 4 $a Mass communications. $3 3422380
653 $a Community detection
653 $a Graph dimension
653 $a Network analysis
653 $a Personalized PageRank
653 $a Sparse principal component analysis
653 $a Twitter sampling
690 $a 0463
690 $a 0708
690 $a 0405
710 2 $a The University of Wisconsin - Madison. $b Statistics. $3 2101047
773 0 $t Dissertations Abstracts International $g 82-12B.
790 $a 0262
791 $a Ph.D.
792 $a 2021
793 $a English
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=28540694
Holdings
Barcode:
W9464649
Location:
Electronic resources
Circulation category:
11. Online viewing_V
Material type:
E-book
Call number:
EB
Use type:
Normal
Loan status:
On shelf
Reservation status:
0