Data-Driven Studies on Social Networks: Privacy and Simulation.
Record type:
Bibliographic record, electronic resource : Monograph/item
Title/Author:
Data-Driven Studies on Social Networks: Privacy and Simulation.
Author:
Horawalavithana, Yasanka Sameera.
Publisher:
Ann Arbor : ProQuest Dissertations & Theses, 2021
Extent:
180 p.
Notes:
Source: Dissertations Abstracts International, Volume: 83-03, Section: B.
Contained By:
Dissertations Abstracts International, 83-03B.
Subject:
Computer science.
Electronic resource:
https://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=28643943
ISBN:
9798535533827
Data-Driven Studies on Social Networks: Privacy and Simulation.
Horawalavithana, Yasanka Sameera.
Data-Driven Studies on Social Networks: Privacy and Simulation.
- Ann Arbor : ProQuest Dissertations & Theses, 2021 - 180 p.
Source: Dissertations Abstracts International, Volume: 83-03, Section: B.
Thesis (Ph.D.)--University of South Florida, 2021.
This item must not be sold to any third party vendors.
Social media datasets are fundamental to understanding a variety of phenomena, such as epidemics, adoption of behavior, crowd management, and political uprisings. At the same time, many such datasets capturing computer-mediated social interactions are now recorded by individual researchers or by organizations. However, while the need for real social graphs and the supply of such datasets are well established, the flow of data from data owners to researchers is significantly hampered by privacy risks: even when humans' identities are removed, or data is anonymized to some extent, studies have repeatedly shown that re-identifying anonymized user identities (i.e., de-anonymization) is feasible with a high success rate.
A main research challenge is to develop a principled understanding of how to measure the effectiveness of an anonymization scheme and thus, conversely, the likely success of a de-anonymization attack. This dissertation develops methods to understand what makes some graph datasets more resilient to de-anonymization attacks. We propose a data-driven framework to 1) quantify the vulnerability of a graph to a re-identification attack; 2) quantitatively identify which graph structural properties contribute most to graph vulnerability; and 3) propose guidelines to develop new methodologies related to graph anonymization, de-anonymization and graph vulnerability quantification. We show the usefulness of this framework on a large set of synthetically generated graphs with controlled properties inspired by a set of real social networks. Thus, we provide a unified framework to analyze the privacy/utility trade-off imposed on any family of social graphs.
We extend this data-driven framework to networks with node attributes. Using this improved framework, we quantify how much better a node re-identification attack performs when the node attributes are included in the attack compared to when there is no node attribute information available to the attacker. We quantify the privacy impact of node attributes under an attribute attachment model biased towards homophily, and analyze the interplay between graph structures and attribute information. Our results show that binary node attributes increase the chance of revealing node identity independent of their placement in the network. Further, we show that other network properties, independent of the degree distribution, put node privacy at risk. This improves the current understanding of graph privacy, as it means that protecting graph privacy is much harder than previously considered.
Once privacy is guaranteed to a certain level, social media datasets are useful for various studies. One such important study is to analyze and model the information spreading patterns on social networks. Understanding how information (e.g., opinions and rumours) spreads on social networks has many benefits, such as controlling the spread of harmful rumours, identifying influential spreaders, and reducing the harm of an outbreak. Although a variety of classical diffusion models have been developed for epidemic spreading, they are not representative of how information spreads on social media. This dissertation contributes to the development of data-driven models to predict social media activity.
In this line of work, we first develop methods to forecast how conversations will evolve on a social media platform. Given a set of original posts on a social platform, such as posts on Reddit in a continuous interval of time, we predict the conversation trees rooted in these seeds. For each conversation, we predict the final shape of the message tree, the user who posts each message, and the time (in continuous space) of the posting of each message. Our solution uses a probabilistic generative model with the support of a genetic algorithm and Long Short-Term Memory (LSTM) neural networks. We evaluate the proposed approach on real-world conversations from subreddits related to cryptocurrency and cybersecurity. We show that this technique can generate accurate conversation topological structures over time, and can accurately predict the volume of messages and the engagement of users over time.
We improve this technique to predict Twitter activity per topic of interest during a period of political crisis. By their nature, periods of crisis do not include many repeatable events, so it is difficult to learn and predict how social media users react. We use external event information, as seen through the lens of physical conflict and news, when improving the simulator design. Specifically, we use time-aligned exogenous signals to predict when tweets are posted, on which topic, and by which user. We use the previously developed cascade generation model to predict the resharing activity. We evaluate these finer-granularity simulations by the volume and temporal pattern of Twitter discussions, new user engagement, and the structure of the user interaction network. We show on Twitter data collected during the Venezuela political crisis that our model generates activities that follow the ground truth.
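To make the idea of quantifying a graph's vulnerability to re-identification more concrete, here is a minimal Python sketch. It is not the dissertation's framework: it assumes a much simpler, well-known proxy, namely the fraction of nodes whose degree is unique in the graph, on the reasoning that an attacker who knows only a target's degree can single out such a node. The function name `degree_uniqueness` and the toy edge lists are hypothetical and chosen only for illustration.

```python
from collections import Counter


def degree_uniqueness(edges):
    """Return the fraction of nodes whose degree is unique in the graph.

    Assumption for illustration only: a node is treated as re-identifiable
    when no other node shares its degree. Nodes that appear in no edge are
    not counted, because the graph is given as an edge list.
    """
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    if not degree:
        return 0.0
    # Count how many nodes share each degree value.
    nodes_per_degree = Counter(degree.values())
    unique = sum(1 for d in degree.values() if nodes_per_degree[d] == 1)
    return unique / len(degree)


# Hypothetical toy graphs: a star (one hub, four leaves) and a 4-node ring.
star = [(0, 1), (0, 2), (0, 3), (0, 4)]
ring = [(0, 1), (1, 2), (2, 3), (3, 0)]

print("star:", degree_uniqueness(star))
print("ring:", degree_uniqueness(ring))
```

In the star only the hub has a degree that no other node shares, while in the ring every node has the same degree, so under this proxy the ring exposes fewer nodes. A realistic measure would also need to account for neighborhood structure and node attributes, which the abstract identifies as additional sources of risk.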
ISBN: 9798535533827
Subjects--Topical Terms:
523869
Computer science.
Subjects--Index Terms:
Anonymization
Data-Driven Studies on Social Networks: Privacy and Simulation.
LDR :06286nmm a2200373 4500
001 2285372
005 20211129133351.5
008 220723s2021 ||||||||||||||||| ||eng d
020 $a 9798535533827
035 $a (MiAaPQ)AAI28643943
035 $a AAI28643943
040 $a MiAaPQ $c MiAaPQ
100 1 $a Horawalavithana, Yasanka Sameera. $3 3564685
245 1 0 $a Data-Driven Studies on Social Networks: Privacy and Simulation.
260 1 $a Ann Arbor : $b ProQuest Dissertations & Theses, $c 2021
300 $a 180 p.
500 $a Source: Dissertations Abstracts International, Volume: 83-03, Section: B.
500 $a Advisor: Iamnitchi, Adriana.
502 $a Thesis (Ph.D.)--University of South Florida, 2021.
506 $a This item must not be sold to any third party vendors.
590 $a School code: 0206.
650 4 $a Computer science. $3 523869
650 4 $a Computer engineering. $3 621879
650 4 $a Sociology. $3 516174
650 4 $a Simulation. $3 644748
650 4 $a Causality. $3 770986
650 4 $a Data collection. $3 3561708
650 4 $a Datasets. $3 3541416
650 4 $a Privacy. $3 528582
653 $a Anonymization
653 $a Graphs
653 $a Information Diffusion
653 $a Machine Learning
653 $a Twitter
690 $a 0984
690 $a 0464
690 $a 0626
710 2 $a University of South Florida. $b Computer Science and Engineering. $3 1682850
773 0 $t Dissertations Abstracts International $g 83-03B.
790 $a 0206
791 $a Ph.D.
792 $a 2021
793 $a English
856 4 0 $u https://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=28643943
Holdings:
Barcode:
W9437105
Location:
Electronic resource
Circulation category:
11. Online viewing_V
Material type:
E-book
Call number:
EB
Use type:
Normal
Loan status:
On shelf
Holds:
0