Supervised and Unsupervised Learning for Semantics Distillation in Multimedia Processing.
Record type:
Bibliographic record - electronic resource : Monograph/item
Title/Author:
Supervised and Unsupervised Learning for Semantics Distillation in Multimedia Processing.
Author:
Liu, Yu.
Publisher:
Ann Arbor : ProQuest Dissertations & Theses, 2018
Physical description:
150 p.
Note:
Source: Dissertation Abstracts International, Volume: 80-02(E), Section: B.
Contained by:
Dissertation Abstracts International, 80-02B(E).
Subject:
Computer science.
Electronic resource:
http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=10932367
ISBN:
9780438457089
MARC record:
LDR  06303nmm a2200337 4500
001  2201348
005  20190429062347.5
008  201008s2018 ||||||||||||||||| ||eng d
020    $a 9780438457089
035    $a (MiAaPQ)AAI10932367
035    $a (MiAaPQ)buffalo:16107
035    $a AAI10932367
040    $a MiAaPQ $c MiAaPQ
100 1  $a Liu, Yu. $3 1259717
245 10 $a Supervised and Unsupervised Learning for Semantics Distillation in Multimedia Processing.
260 1  $a Ann Arbor : $b ProQuest Dissertations & Theses, $c 2018
300    $a 150 p.
500    $a Source: Dissertation Abstracts International, Volume: 80-02(E), Section: B.
500    $a Adviser: Chang Wen Chen.
502    $a Thesis (Ph.D.)--State University of New York at Buffalo, 2018.
520    $a In linguistics, "semantics" refers to the intended meaning in natural language, such as in words, phrases, and sentences. In this dissertation, the concept of "semantics" is defined more generally: the intended meaning of information in all multimedia forms. These forms include text in the language domain as well as stationary images and dynamic videos in the vision domain. Specifically, semantics in multimedia is the media content of cognitive information, knowledge, and ideas that can be represented in text, images, and video clips. A narrative story, for example, can be a semantic summary of a novel, or of the movie adapted from that book. Thus, semantics is high-level abstract knowledge that is independent of multimedia form.
520    $a Indeed, the same amount of semantics can be represented either redundantly or concisely, owing to the differing expressive abilities of multimedia forms. The process by which a redundantly represented semantics evolves into a concisely represented one is called "semantic distillation", and this evolution can happen either between different multimedia forms or within the same form.
520    $a The booming growth of unorganized and unfiltered information brings people an unwanted issue, information overload, for which techniques of semantic distillation are in high demand. However, as opportunities always come hand in hand with challenges, machine learning and Artificial Intelligence (AI) today are far more advanced than in the past and provide us with powerful tools and techniques. A large variety of learning methods has made countless previously impossible tasks a reality. Thus, in this dissertation, we take advantage of machine learning techniques, with both supervised learning and unsupervised learning, to solve semantic distillation problems.
520    $a Despite the promising future and powerful machine learning techniques, the heterogeneous forms of multimedia spanning many domains still impose challenges on semantic distillation approaches. A major challenge is that the definition of "semantics" and the related processing techniques can be entirely different from one problem to another. Varying types of multimedia resources introduce varying domain-specific limitations and constraints, so obtaining the semantics also becomes domain-specific. Therefore, in this dissertation, with language and vision as the two major domains, we approach four problems covering all combinations of the two domains:
• Language to Vision Domain: In this study, Presentation Storytelling is formulated as the problem of suggesting the most appropriate images from online sources for storytelling, given a text query. In particular, we approach the problem with a two-step semantics processing method, in which the semantics of a simple query is first expanded into a diverse semantic graph and then distilled from a large number of searched web photos down to a few representative ones. This two-step method is powered by a Conditional Random Field (CRF) model and learned in a supervised manner from human-labeled examples.
• Vision to Language Domain: The second study, Visual Storytelling, formulates the problem of generating a coherent paragraph from a photo stream. Different from presentation storytelling, visual storytelling goes in the opposite direction: the semantics extracted from a handful of photos is distilled into text. In this dissertation, we address this problem by revealing the semantic relationships in the visual domain and distilling them into the language domain with a newly designed Bidirectional Attention Recurrent Neural Network (BARNN) model. In particular, an attention model is embedded in the RNN so that coherence is preserved in the language domain and the output reads like a human-told story. The model is trained with deep learning and supervised learning on public datasets.
• Dedicated Vision Domain: To directly address the information overload issue in the vision domain, Image Semantic Extraction formulates the problem of selecting a subset from a multimedia user's photo albums. In the literature, this problem has mostly been approached with unsupervised learning. In this dissertation, however, we develop a novel supervised learning method to attack the same problem. We treat visual semantics as a quantizable variable that can be measured, and we build an encoding-decoding pipeline with Long Short-Term Memory (LSTM) to model this quantization process. The intuition of the encoding-decoding pipeline is to imitate a human: read, think, and retell. That is, the pipeline first uses an LSTM encoder that scans all photos to "read" the comprised semantics, then concatenates an LSTM decoder that selects the most representative ones to "think" about the gist semantics, and finally adds a dedicated residual layer that revisits the unselected ones to "verify" whether the semantics are complete enough.
• Dedicated Language Domain: Distinct from the above issues, in this part we introduce a different genre of machine learning method, unsupervised learning. We address a semantic distillation problem in the language domain, Text Semantic Extraction, in which the semantics of a letter sequence is extracted from printed images. (Abstract shortened by ProQuest.)
590    $a School code: 0656.
650  4 $a Computer science. $3 523869
650  4 $a Computer engineering. $3 621879
690    $a 0984
690    $a 0464
710 2  $a State University of New York at Buffalo. $b Computer Science and Engineering. $3 1035503
773 0  $t Dissertation Abstracts International $g 80-02B(E).
790    $a 0656
791    $a Ph.D.
792    $a 2018
793    $a English
856 40 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=10932367
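The "Dedicated Vision Domain" study in the abstract above describes an encoding-decoding pipeline that imitates read-think-and-retell: an LSTM encoder scans all photos, an LSTM decoder selects the most representative ones, and a residual layer revisits the unselected photos. The dissertation's actual implementation is not part of this record; the sketch below is only a minimal illustration of that idea, and every name and size in it (AlbumSummarizer, feat_dim=2048, hidden=512, the 20-photo album) is an assumption made for the example.

```python
# Minimal sketch of a "read-think-retell" photo-subset selector, NOT the author's code.
# An LSTM encoder reads the whole album, an LSTM decoder rescans it with that context
# to score each photo, and a residual projection of the raw features lets the scorer
# revisit photos the recurrent pass may have under-weighted.
import torch
import torch.nn as nn

class AlbumSummarizer(nn.Module):
    def __init__(self, feat_dim=2048, hidden=512):
        super().__init__()
        self.encoder = nn.LSTM(feat_dim, hidden, batch_first=True)   # "read" the album
        self.decoder = nn.LSTM(feat_dim, hidden, batch_first=True)   # "think" with the album summary as context
        self.residual = nn.Linear(feat_dim, hidden)                  # "verify" by revisiting raw photo features
        self.score = nn.Linear(hidden, 1)                            # per-photo selection score

    def forward(self, photos):                      # photos: (batch, n_photos, feat_dim) CNN features
        _, state = self.encoder(photos)             # final (h, c) summarizes the whole album
        dec_out, _ = self.decoder(photos, state)    # rescan each photo conditioned on that summary
        logits = self.score(dec_out + self.residual(photos)).squeeze(-1)
        return torch.sigmoid(logits)                # probability that each photo belongs in the subset

album = torch.randn(1, 20, 2048)                    # e.g. 20 photos with pre-extracted features
keep_prob = AlbumSummarizer()(album)                # shape (1, 20)
```

In a supervised setting like the one the abstract describes, these per-photo probabilities would be trained against human-selected subsets, for example with a binary cross-entropy loss.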
Holdings
1 item • Page 1
Barcode: W9377897
Location: Electronic resources
Circulation category: 11.線上閱覽_V (online reading)
Material type: E-book
Call number: EB
Use type: Normal
Loan status: On shelf
Holds: 0