Learning and Composing Primitives for the Visual World.
Record Type:
Electronic resources : Monograph/item
Title/Author:
Learning and Composing Primitives for the Visual World./
Author:
Gupta, Kamal.
Description:
1 online resource (176 pages)
Notes:
Source: Dissertations Abstracts International, Volume: 84-12, Section: B.
Contained By:
Dissertations Abstracts International 84-12B.
Subject:
Computer science.
Online resource:
http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=30421606
click for full text (PQDT)
ISBN:
9798379759858
Thesis (Ph.D.)--University of Maryland, College Park, 2023.
Includes bibliographical references
Compositionality is at the core of how humans understand and recreate the visual world. It is what allows us to express infinitely many concepts using finite primitives. For example, we understand images as combinations of objects and videos as composed of actions, and we generate 3D animations by rendering 3D surfaces with textures, materials, and lighting. It is unsurprising that composition also appears in almost all human-created art forms, such as language, music, design, and even mathematics. Although compositionality seems an obvious and prevalent way in which humans consume and create data, it is often overlooked in computational approaches such as deep learning. Current systems often assume the availability of exhaustively labeled concepts or primitives during training and fail to generalize to new compositions during inference. In this dissertation, we propose to discover compositional primitives from data with little to no supervision and show how these primitives can improve generalization in real-world applications such as classification, correspondence, and 2D/3D synthesis.
In the first half of this dissertation, I propose two complementary approaches to discover compositional discrete primitives from visual data. Given a large collection of images without labels, I propose a generative and a contrastive way of recognizing discriminative parts in the image which are useful for visual recognition. In the generative approach, I take inspiration from Bayesian approaches such as variational autoencoders to develop a system that can express images in the form of a discrete, language-like representation. In the contrastive approach, I play a referential game between two neural network agents to learn meaningful discrete concepts from images. I further show applications of these approaches in image and video editing by learning a dense correspondence of primitives across images.
In the second half, I focus on learning how to compose primitives for both 2D and 3D visual data. By expressing scenes as assemblies of smaller parts, we can easily perform generation from scratch or from partial scenes as input. I present two works: one on composing multiple viewpoints to synthesize 3D objects, and another on composing bounding boxes or cuboids to generate scene layouts. I also review a work on discovering a data-driven ordering for traversing an image or a scene for composition. I show applications of these works in image/video compression, as well as in 2D and 3D content creation.
Electronic reproduction.
Ann Arbor, Mich. :
ProQuest,
2023
Mode of access: World Wide Web
ISBN: 9798379759858
Subjects--Topical Terms:
523869
Computer science.
Subjects--Index Terms:
Computer graphics
Index Terms--Genre/Form:
542853
Electronic books.
Learning and Composing Primitives for the Visual World.
LDR :04012nmm a2200433K 4500
001 2364803
005 20231212064424.5
006 m o d
007 cr mn ---uuuuu
008 241011s2023 xx obm 000 0 eng d
020 $a 9798379759858
035 $a (MiAaPQ)AAI30421606
035 $a AAI30421606
040 $a MiAaPQ $b eng $c MiAaPQ $d NTU
100 1 $a Gupta, Kamal. $3 1558612
245 1 0 $a Learning and Composing Primitives for the Visual World.
264 0 $c 2023
300 $a 1 online resource (176 pages)
336 $a text $b txt $2 rdacontent
337 $a computer $b c $2 rdamedia
338 $a online resource $b cr $2 rdacarrier
500 $a Source: Dissertations Abstracts International, Volume: 84-12, Section: B.
500 $a Advisor: Shrivastava, Abhinav;Davis, Larry S.
502 $a Thesis (Ph.D.)--University of Maryland, College Park, 2023.
504 $a Includes bibliographical references
520 $a Compositionality is at the core of how humans understand and recreate the visual world. It is what allows us to express infinitely many concepts using finite primitives. For example, we understand images as combinations of objects and videos as composed of actions, and we generate 3D animations by rendering 3D surfaces with textures, materials, and lighting. It is unsurprising that composition also appears in almost all human-created art forms, such as language, music, design, and even mathematics. Although compositionality seems an obvious and prevalent way in which humans consume and create data, it is often overlooked in computational approaches such as deep learning. Current systems often assume the availability of exhaustively labeled concepts or primitives during training and fail to generalize to new compositions during inference. In this dissertation, we propose to discover compositional primitives from data with little to no supervision and show how these primitives can improve generalization in real-world applications such as classification, correspondence, and 2D/3D synthesis. In the first half of this dissertation, I propose two complementary approaches to discover compositional discrete primitives from visual data. Given a large collection of images without labels, I propose a generative and a contrastive way of recognizing discriminative parts in the image which are useful for visual recognition. In the generative approach, I take inspiration from Bayesian approaches such as variational autoencoders to develop a system that can express images in the form of a discrete, language-like representation. In the contrastive approach, I play a referential game between two neural network agents to learn meaningful discrete concepts from images. I further show applications of these approaches in image and video editing by learning a dense correspondence of primitives across images. In the second half, I focus on learning how to compose primitives for both 2D and 3D visual data. By expressing scenes as assemblies of smaller parts, we can easily perform generation from scratch or from partial scenes as input. I present two works: one on composing multiple viewpoints to synthesize 3D objects, and another on composing bounding boxes or cuboids to generate scene layouts. I also review a work on discovering a data-driven ordering for traversing an image or a scene for composition. I show applications of these works in image/video compression, as well as in 2D and 3D content creation.
533 $a Electronic reproduction. $b Ann Arbor, Mich. : $c ProQuest, $d 2023
538 $a Mode of access: World Wide Web
650 4 $a Computer science. $3 523869
650 4 $a Computer engineering. $3 621879
650 4 $a Information technology. $3 532993
653 $a Computer graphics
653 $a Computer vision
653 $a Deep learning
653 $a Generative modeling
653 $a Machine learning
653 $a Natural language processing
653 $a Visual data
655 7 $a Electronic books. $2 lcsh $3 542853
690 $a 0984
690 $a 0489
690 $a 0464
690 $a 0800
710 2 $a ProQuest Information and Learning Co. $3 783688
710 2 $a University of Maryland, College Park. $b Computer Science. $3 1018451
773 0 $t Dissertations Abstracts International $g 84-12B.
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=30421606 $z click for full text (PQDT)
Inventory Number: W9487159
Location Name: Electronic resources
Item Class: 11. Online reading_V
Material type: E-book
Call number: EB
Usage Class: Normal use
Loan Status: On shelf
No. of reservations: 0