Dong Hwa University Library

Towards Comprehensive Action Understanding in Videos.

Record type:	Bibliographic - electronic resource : Monograph/item
Title/Author:	Towards Comprehensive Action Understanding in Videos./
Author:	Ji, Jingwei.
Publisher:	Ann Arbor : ProQuest Dissertations & Theses, 2021
Extent:	141 p.
Notes:	Source: Dissertations Abstracts International, Volume: 83-03, Section: B.
Contained By:	Dissertations Abstracts International 83-03B.
Subject:	Cooperative learning.
Electronic resource:	http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=28688315
ISBN:	9798544203698

Towards Comprehensive Action Understanding in Videos. - Ann Arbor : ProQuest Dissertations & Theses, 2021 - 141 p.

Source: Dissertations Abstracts International, Volume: 83-03, Section: B.

Thesis (Ph.D.)--Stanford University, 2021.

This item must not be sold to any third party vendors.

An enormous amount of videos are created, spread, and watched daily. In the ocean of videos, the actions and activities of humans are often the pivots. We desire machines to understand human actions in videos as this is essential to various applications, including but not limited to healthcare, security system, and human-robot interactions. For these applications to be realized, action understanding must go beyond simply answering "what is the action", but more comprehensive. An intelligent agent should be able to know "who/where is the actor", "what/where is the object", "what interaction is happening between the actor and the object", "when does an action start and end", and more. Achieving comprehensive action understanding is non-trivial since the need for data and labels combinatorially increases when trying to solve multiple problems, not to mention that video data and labels are expensive to collect, store, and consume. Therefore, to obtain comprehensive action understanding, we not only need to perform multiple tasks but also have to ensure data efficiency.

In this dissertation, we discuss three questions to realize data-efficient and comprehensive action understanding. How to reduce the need for data and labels? How to perform multiple tasks without combinatorial growth of data? How to solve new problems efficiently with some other problems solved? For the first question, our works on few-shot video classification and semi-supervised temporal action proposals introduce video-specific techniques and strategies for learning with less supervision. For the second question, we demonstrate how to avoid enumerating all combinations of categories from subtasks by knowledge disentanglement in a study on actor-action segmentation. For the third question, we propose constructing compositional representation from human-object relationships in videos, and such representation leads to better generalizability in action recognition models.

ISBN: 9798544203698
Subjects--Topical Terms:

3682760
Cooperative learning.

Towards Comprehensive Action Understanding in Videos.
LDR:02927nmm a2200301 4500
001 2348624
005 20220912135621.5
008 241004s2021 ||||||||||||||||| ||eng d
020 $a 9798544203698
035 $a (MiAaPQ)AAI28688315
035 $a (MiAaPQ)STANFORDwc099nh9969
035 $a AAI28688315
040 $a MiAaPQ $c MiAaPQ
100 1 $a Ji, Jingwei. $3 3687989
245 1 0 $a Towards Comprehensive Action Understanding in Videos.
260 1 $a Ann Arbor : $b ProQuest Dissertations & Theses, $c 2021
300 $a 141 p.
500 $a Source: Dissertations Abstracts International, Volume: 83-03, Section: B.
500 $a Advisor: Li, Fei-Fei.
502 $a Thesis (Ph.D.)--Stanford University, 2021.
506 $a This item must not be sold to any third party vendors.
520 $a An enormous amount of videos are created, spread, and watched daily. In the ocean of videos, the actions and activities of humans are often the pivots. We desire machines to understand human actions in videos as this is essential to various applications, including but not limited to healthcare, security system, and human-robot interactions. For these applications to be realized, action understanding must go beyond simply answering "what is the action", but more comprehensive. An intelligent agent should be able to know "who/where is the actor", "what/where is the object", "what interaction is happening between the actor and the object", "when does an action start and end", and more. Achieving comprehensive action understanding is non-trivial since the need for data and labels combinatorially increases when trying to solve multiple problems, not to mention that video data and labels are expensive to collect, store, and consume. Therefore, to obtain comprehensive action understanding, we not only need to perform multiple tasks but also have to ensure data efficiency. In this dissertation, we discuss three questions to realize data-efficient and comprehensive action understanding. How to reduce the need for data and labels? How to perform multiple tasks without combinatorial growth of data? How to solve new problems efficiently with some other problems solved? For the first question, our works on few-shot video classification and semi-supervised temporal action proposals introduce video-specific techniques and strategies for learning with less supervision. For the second question, we demonstrate how to avoid enumerating all combinations of categories from subtasks by knowledge disentanglement in a study on actor-action segmentation. For the third question, we propose constructing compositional representation from human-object relationships in videos, and such representation leads to better generalizability in action recognition models.
590 $a School code: 0212.
650 4 $a Cooperative learning. $3 3682760
650 4 $a Genomes. $3 592593
650 4 $a Localization. $3 3560711
650 4 $a Ablation. $3 3562462
650 4 $a Graph representations. $3 3560730
650 4 $a Semantics. $3 520060
650 4 $a Genetics. $3 530508
650 4 $a Accuracy. $3 3559958
650 4 $a Datasets. $3 3541416
650 4 $a Methods. $3 3560391
650 4 $a Experiments. $3 525909
650 4 $a Classification. $3 595585
690 $a 0369
710 2 $a Stanford University. $3 754827
773 0 $t Dissertations Abstracts International $g 83-03B.
790 $a 0212
791 $a Ph.D.
792 $a 2021
793 $a English
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=28688315