Bootstrapping Web Archive Collections from Micro-Collections in Social Media.
Record type:
Bibliographic record, electronic resource : Monograph/item
Title proper/Author:
Bootstrapping Web Archive Collections from Micro-Collections in Social Media.
Author:
Nwala, Alexander C.
Publisher:
Ann Arbor : ProQuest Dissertations & Theses, 2020
Extent:
287 p.
Notes:
Source: Dissertations Abstracts International, Volume: 82-04, Section: B.
Contained By:
Dissertations Abstracts International 82-04B.
Subject:
Web studies.
Electronic resource:
http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=28090305
ISBN:
9798672170343
Thesis (Ph.D.)--Old Dominion University, 2020.
This item must not be sold to any third party vendors.
In a Web plagued by disappearing resources, Web archive collections provide a valuable means of preserving Web resources important to the study of past events. These archived collections start with seed URIs (Uniform Resource Identifiers) hand-selected by curators. Curators produce high quality seeds by removing non-relevant URIs and adding URIs from credible and authoritative sources, but this ability comes at a cost: it is time consuming to collect these seeds. The result of this is a shortage of curators, a lack of Web archive collections for various important news events, and a need for an automatic system for generating seeds.
We investigate the problem of generating seed URIs automatically, and explore the state of the art in collection building and seed selection. Attempts toward generating seeds automatically have mostly relied on scraping Web or social media Search Engine Result Pages (SERPs). In this work, we introduce a novel source for generating seeds from URIs in the threaded conversations of social media posts created by single or multiple users. Users on social media sites routinely create and share narratives about news events consisting of hand-selected URIs of news stories, tweets, videos, etc. In this work, we call these posts Micro-collections, whether shared on Reddit or Twitter, and we consider them as an important source for seeds. This is because the effort taken to create Micro-collections is an indication of editorial activity and a demonstration of domain expertise. Therefore, we propose a model for generating seeds from Micro-collections. We begin by introducing a simple vocabulary, called post class, for describing social media posts across different platforms, and extract seeds from the Micro-collections post class.
We further propose Quality Proxies for seeds by extending the idea of collection comparison to evaluation, and present our Micro-collection/Quality Proxy (MCQP) framework for bootstrapping Web archive collections from Micro-collections in social media.
Subjects--Topical Terms:
Web studies.
Subjects--Index Terms:
Micro-collection
LDR 03186nmm a2200373 4500
001 2233008
005 20210628081323.5
008 210928s2020 ||||||||||||||||| ||eng d
020 $a 9798672170343
035 $a (MiAaPQ)AAI28090305
035 $a AAI28090305
035 $a 2233008
040 $a MiAaPQ $c MiAaPQ
100 1 $a Nwala, Alexander C. $0 (orcid)0000-0003-3408-791X $3 3480610
245 1 0 $a Bootstrapping Web Archive Collections from Micro-Collections in Social Media.
260 1 $a Ann Arbor : $b ProQuest Dissertations & Theses, $c 2020
300 $a 287 p.
500 $a Source: Dissertations Abstracts International, Volume: 82-04, Section: B.
500 $a Advisor: Nelson, Michael L.; Weigle, Michele C.
502 $a Thesis (Ph.D.)--Old Dominion University, 2020.
506 $a This item must not be sold to any third party vendors.
520 $a In a Web plagued by disappearing resources, Web archive collections provide a valuable means of preserving Web resources important to the study of past events. These archived collections start with seed URIs (Uniform Resource Identifiers) hand-selected by curators. Curators produce high quality seeds by removing non-relevant URIs and adding URIs from credible and authoritative sources, but this ability comes at a cost: it is time consuming to collect these seeds. The result of this is a shortage of curators, a lack of Web archive collections for various important news events, and a need for an automatic system for generating seeds. We investigate the problem of generating seed URIs automatically, and explore the state of the art in collection building and seed selection. Attempts toward generating seeds automatically have mostly relied on scraping Web or social media Search Engine Result Pages (SERPs). In this work, we introduce a novel source for generating seeds from URIs in the threaded conversations of social media posts created by single or multiple users. Users on social media sites routinely create and share narratives about news events consisting of hand-selected URIs of news stories, tweets, videos, etc. In this work, we call these posts Micro-collections, whether shared on Reddit or Twitter, and we consider them as an important source for seeds. This is because the effort taken to create Micro-collections is an indication of editorial activity and a demonstration of domain expertise. Therefore, we propose a model for generating seeds from Micro-collections. We begin by introducing a simple vocabulary, called post class, for describing social media posts across different platforms, and extract seeds from the Micro-collections post class. We further propose Quality Proxies for seeds by extending the idea of collection comparison to evaluation, and present our Micro-collection/Quality Proxy (MCQP) framework for bootstrapping Web archive collections from Micro-collections in social media.
590 $a School code: 0418.
650 4 $a Web studies. $3 2122754
650 4 $a Computer science. $3 523869
650 4 $a Library science. $3 539284
653 $a Micro-collection
653 $a News
653 $a Social Media
653 $a Web Archiving
690 $a 0646
690 $a 0984
690 $a 0399
710 2 0 $a Old Dominion University. $b Computer Science. $3 3280020
773 0 $t Dissertations Abstracts International $g 82-04B.
790 $a 0418
791 $a Ph.D.
792 $a 2020
793 $a English
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=28090305
Holdings
Barcode:
W9396918
Location:
Electronic resource
Circulation category:
11. Online viewing_V
Material type:
E-book
Call number:
EB
Use type:
Normal
Loan status:
On shelf
Hold status:
0