Label Scarcity in Computer Vision: From Long Tail to Zero-Shot
Record type:
Bibliographic, electronic resource: Monograph/item
Title proper:
Label Scarcity in Computer Vision
Other title:
From Long Tail to Zero-Shot.
Author:
Huang, He.
Physical description:
1 online resource (140 pages)
Notes:
Source: Dissertations Abstracts International, Volume: 84-02, Section: B.
Contained By:
Dissertations Abstracts International, 84-02B.
Subject:
Computer science.
Electronic resource:
http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=29330661 (click for full text, PQDT)
ISBN:
9798834055419
Thesis (Ph.D.)--University of Illinois at Chicago, 2022.
Includes bibliographical references
In the era of big data, we have access to many sources of potentially unlimited data, but collecting labels for those data remains very costly in computer vision. For example, object detection requires images to be annotated with labels and bounding boxes for all objects, and instance segmentation requires pixel-level annotation of images. Given limited budgets and the non-uniform distribution of real-world data, the available labels usually follow a long-tailed distribution, in which a few frequent classes have many annotations while rare classes have very few. With the rapid growth of the Internet, people create new content and concepts almost every day, and it is hard for machines to recognize and classify such novel content. This gives rise to a second kind of label scarcity, zero-shot recognition, in which we want to train models to recognize new classes that they never see during training.
In this work, we study these two types of label scarcity (i.e., long-tailed distribution of classes and novel classes without annotations) in different applications. On the one hand, we address the long-tailed distribution in scene graph parsing, which requires the model not only to detect objects in the input images but also to predict the relations between those objects. We propose a general framework that can be applied to, and improves, many existing models by decomposing the problem into classification and ranking sub-problems. On the other hand, to deal with label scarcity caused by novel classes with no annotations, we design generative models and use external knowledge from text to solve different zero-shot recognition problems in image classification. Specifically, we propose a unified framework for single-label zero-shot recognition with generative adversarial networks, and we use graph convolutional networks to bridge the gap between seen and unseen classes in multi-label zero-shot image recognition. Additionally, we propose a translational embedding model that recognizes new attribute-object compositions. All of the work described above uses open-source public datasets such as ImageNet, MS-COCO, NUS-WIDE, and CUB.
Electronic reproduction. Ann Arbor, Mich.: ProQuest, 2023.
Mode of access: World Wide Web
ISBN: 9798834055419
Subjects--Topical Terms: Computer science.
Subjects--Index Terms: Deep Learning
Index Terms--Genre/Form: Electronic books.
LDR 03580nmm a2200409K 4500
001 2362347
005 20231027104009.5
006 m o d
007 cr mn ---uuuuu
008 241011s2022 xx obm 000 0 eng d
020 $a 9798834055419
035 $a (MiAaPQ)AAI29330661
035 $a (MiAaPQ)0799vireo3039Huang
035 $a AAI29330661
040 $a MiAaPQ $b eng $c MiAaPQ $d NTU
100 1 $a Huang, He. $3 1672711
245 1 0 $a Label Scarcity in Computer Vision : $b From Long Tail to Zero-Shot.
264 0 $c 2022
300 $a 1 online resource (140 pages)
336 $a text $b txt $2 rdacontent
337 $a computer $b c $2 rdamedia
338 $a online resource $b cr $2 rdacarrier
500 $a Source: Dissertations Abstracts International, Volume: 84-02, Section: B.
500 $a Advisor: Yu, Philip S.
502 $a Thesis (Ph.D.)--University of Illinois at Chicago, 2022.
504 $a Includes bibliographical references
533 $a Electronic reproduction. $b Ann Arbor, Mich. : $c ProQuest, $d 2023
538 $a Mode of access: World Wide Web
650 4 $a Computer science. $3 523869
650 4 $a Web studies. $3 2122754
650 4 $a Information science. $3 554358
653 $a Deep Learning
653 $a Computer vision
653 $a Zero-shot learning
653 $a Image classification
655 7 $a Electronic books. $2 lcsh $3 542853
690 $a 0984
690 $a 0723
690 $a 0646
690 $a 0800
710 2 $a ProQuest Information and Learning Co. $3 783688
710 2 $a University of Illinois at Chicago. $b Computer Science. $3 2094830
773 0 $t Dissertations Abstracts International $g 84-02B.
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=29330661 $z click for full text (PQDT)
Holdings:
Barcode: W9484703
Location: Electronic resources
Circulation category: 11. Online viewing_V
Material type: E-book
Call number: EB
Use type: Normal
Loan status: On shelf
Hold status: 0