National Dong Hwa University Library


University of California, San Diego., Electrical Engineering (Signal and Image Proc).

Visual Understanding of Complex Human Behavior via Attribute Dynamics.

Record type:	Bibliographic record - electronic resource : Monograph/item
Title/Author:	Visual Understanding of Complex Human Behavior via Attribute Dynamics./
Author:	Li, Weixin.
Publisher:	Ann Arbor : ProQuest Dissertations & Theses, 2016
Extent:	198 p.
Notes:	Source: Dissertation Abstracts International, Volume: 77-12(E), Section: B.
Contained By:	Dissertation Abstracts International 77-12B(E).
Subject:	Computer science.
Electronic resource:	http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=10128368
ISBN:	9781339863801

Visual Understanding of Complex Human Behavior via Attribute Dynamics.
Li, Weixin.

Visual Understanding of Complex Human Behavior via Attribute Dynamics. - Ann Arbor : ProQuest Dissertations & Theses, 2016 - 198 p.

Source: Dissertation Abstracts International, Volume: 77-12(E), Section: B.

Thesis (Ph.D.)--University of California, San Diego, 2016.

Visual understanding of human behavior in video sequences is one of the fundamental topics in computational vision. Because human activity is a sequential signal by nature, most of its critical insights can only be perceived by modeling the temporal structure. Although this proposition is intuitive, the task is non-trivial to accomplish. One of the most significant obstacles comes from the enormous variability and distinct properties of temporal structure at different levels of the human motion hierarchy, which spans a wide range of collectiveness, time and space, semantic granularity, and so forth. This poses a rigorous challenge for a solution that is supposed to be capable of simultaneously capturing instantaneous movements, encoding mid-level evolution patterns, coping with long-term non-stationarity or content drift, and being invariant to intra-class variation and other visual noise.

ISBN: 9781339863801
Subjects--Topical Terms:

523869
Computer science.

Visual Understanding of Complex Human Behavior via Attribute Dynamics.
LDR:04791nmm a2200325 4500
001 2159121
005 20180622095234.5
008 190424s2016 ||||||||||||||||| ||eng d
020 $a 9781339863801
035 $a (MiAaPQ)AAI10128368
035 $a (MiAaPQ)ucsd:15570
035 $a AAI10128368
040 $a MiAaPQ $c MiAaPQ
100 1 $a Li, Weixin. $3 3341056
245 1 0 $a Visual Understanding of Complex Human Behavior via Attribute Dynamics.
260 1 $a Ann Arbor : $b ProQuest Dissertations & Theses, $c 2016
300 $a 198 p.
500 $a Source: Dissertation Abstracts International, Volume: 77-12(E), Section: B.
500 $a Adviser: Nuno Vasconcelos.
502 $a Thesis (Ph.D.)--University of California, San Diego, 2016.
520 $a Visual understanding of human behavior in video sequences is one of the fundamental topics in computational vision. Because human activity is a sequential signal by nature, most of its critical insights can only be perceived by modeling the temporal structure. Although this proposition is intuitive, the task is non-trivial to accomplish. One of the most significant obstacles comes from the enormous variability and distinct properties of temporal structure at different levels of the human motion hierarchy, which spans a wide range of collectiveness, time and space, semantic granularity, and so forth. This poses a rigorous challenge for a solution that is supposed to be capable of simultaneously capturing instantaneous movements, encoding mid-level evolution patterns, coping with long-term non-stationarity or content drift, and being invariant to intra-class variation and other visual noise.
520 $a While most previous works in the literature address only some aspects of this problem, we aim to develop a unified framework that handles them all for complex human activity analysis. Specifically, we propose to model the temporal structure of human behavior on a robust, stable yet general representation platform that encodes semantically meaningful concepts (or attributes). This platform bridges the gap between low-level visual features and high-level logical reasoning, bringing benefits such as better generalization, knowledge transfer, and so forth. While attributes take care of abstracting semantic information from short-term motion in the low-level visual signal, the dynamic model focuses on characterizing the mid-range evolution patterns in this space. To cope with long-term non-stationarity and intra-class variation for complex events, we derive two encoding schemes that capture the zeroth- and first-order statistics of the attribute dynamics in video snippets, instead of precisely characterizing the whole sequence, which is prone to over-fitting due to the sparse nature of complex event instantiation.
520 $a The proposed framework is implemented via several novel models, together with the corresponding technical tools for statistical inference, parameter estimation, similarity measurement, encoding statistics on the model manifold, and so on. In particular, a dynamic model, denoted the binary dynamic system (BDS), is proposed to capture the evolution pattern in sequential binary data; it consists of a binary principal component analysis for modeling appearance and a Gauss-Markov process for encoding dynamics. A mixture model is further derived from the BDS to characterize multiple types of dynamics in a large data corpus. Based on variational methods, an accurate and efficient approximate inference scheme is developed for the state posterior to handle its intrinsic intractability, and a variational expectation-maximization algorithm is derived for parameter estimation. Through these tools, measures that quantify the similarity or dissimilarity of two binary sequences are devised from the perspectives of control theory, information geometry, and kernel methods. In addition, approaches to encode the statistics of sequential binary data in the manifold of statistical models are proposed, resulting in the bag-of-words for attribute dynamics (BoWAD) and the vector of locally aggregated descriptors for attribute dynamics (VLADAD).
520 $a An empirical study on challenging tasks of complex human activity analysis demonstrates the effectiveness of the proposed framework. Our solution not only produces state-of-the-art results for event detection, but also enables recounting that provides the visual evidence, anchored over time in the video, for the prediction, and facilitates tasks such as semantic video segmentation, content-based video summarization, and so forth.
590 $a School code: 0033.
650 4 $a Computer science. $3 523869
690 $a 0984
710 2 $a University of California, San Diego. $b Electrical Engineering (Signal and Image Proc). $3 1020523
773 0 $t Dissertation Abstracts International $g 77-12B(E).
790 $a 0033
791 $a Ph.D.
792 $a 2016
793 $a English
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=10128368
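
Illustrative note: the abstract in this record describes the binary dynamic system (BDS) as a binary principal component analysis for appearance combined with a Gauss-Markov process for dynamics. As a rough illustration only, and not the dissertation's implementation, the sketch below samples a binary attribute sequence from a toy model of that kind. The logistic link between hidden state and attribute probabilities, the function name sample_bds, and all parameter values (A, C, u, noise_std) are assumptions made for this example.

```python
import math
import random


def sigmoid(z):
    """Logistic function mapping a real value to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def sample_bds(A, C, u, x0, steps, noise_std=0.1, seed=0):
    """Sample from a toy binary dynamic system (hypothetical parameterization).

    Hidden state:  x[t+1] = A x[t] + Gaussian noise  (Gauss-Markov process)
    Observation:   y[t][k] ~ Bernoulli(sigmoid(C[k] . x[t] + u[k]))
    """
    rng = random.Random(seed)
    x = list(x0)
    states, observations = [], []
    for _ in range(steps):
        # Map the hidden state to one logit per binary attribute.
        logits = [c + b for c, b in zip(matvec(C, x), u)]
        # Draw each attribute independently given the hidden state.
        y = [1 if rng.random() < sigmoid(z) else 0 for z in logits]
        states.append(x)
        observations.append(y)
        # Advance the hidden state with linear dynamics plus Gaussian noise.
        x = [a + rng.gauss(0.0, noise_std) for a in matvec(A, x)]
    return states, observations


# Assumed toy parameters: 2-D hidden state, 3 binary attributes.
A = [[0.9, -0.2], [0.2, 0.9]]
C = [[2.0, 0.0], [0.0, 2.0], [-1.5, 1.5]]
u = [0.0, -0.5, 0.5]

states, ys = sample_bds(A, C, u, x0=[1.0, 0.0], steps=20)
for t, y in enumerate(ys[:5]):
    print(t, y)
```

The sketch covers only the generative direction. The variational inference, parameter estimation, and BoWAD/VLADAD encodings mentioned in the abstract are not shown.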