Modeling Social and Temporal Context for Video Analysis.
Record type:
Bibliographic - electronic resource : Monograph/item
Title/Author:
Modeling Social and Temporal Context for Video Analysis.
Author:
Qin, Zhen.
Description:
106 p.
Note:
Source: Dissertation Abstracts International, Volume: 77-01(E), Section: B.
Contained By:
Dissertation Abstracts International, 77-01B(E).
Subject:
Computer science.
Electronic resource:
http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3721547
ISBN:
9781339031651
LDR    04170nmm a2200289 4500
001    2066678
005    20151205153429.5
008    170521s2015 ||||||||||||||||| ||eng d
020    $a 9781339031651
035    $a (MiAaPQ)AAI3721547
035    $a AAI3721547
040    $a MiAaPQ $c MiAaPQ
100 1  $a Qin, Zhen. $3 1685463
245 10 $a Modeling Social and Temporal Context for Video Analysis.
300    $a 106 p.
500    $a Source: Dissertation Abstracts International, Volume: 77-01(E), Section: B.
500    $a Adviser: Christian R. Shelton.
502    $a Thesis (Ph.D.)--University of California, Riverside, 2015.
520    $a The ubiquity of videos requires effective content extraction tools that enable practical applications automatically. Computer vision research focuses on bridging the gap between raw data (pixel values) and video semantics, but information based only on image values is not sufficient, due to visual ambiguities caused by varied camera characteristics, frequent occlusions, low resolution, and large intra-class and small inter-class variation among object/activity/event classes.
520    $a In this dissertation, we develop methodologies with new machine learning and statistical optimization techniques to model high-level context to mitigate visual ambiguity, thus improving performance on several real-world computer vision tasks. We first describe the use of social grouping context, supported by sociology research, to improve intra-camera multi-target tracking, inter-camera multi-target tracking, and head pose estimation in video. For single-camera tracking, social grouping context regularizes existing tracking methods in a principled way, and provides a natural solution to go beyond traditional tracking with Markovian assumptions. For multi-camera tracking, social grouping context effectively mitigates visual ambiguities from cameras with different viewpoints and lighting conditions. Both problems unify under a probabilistic formulation, and we provide a novel, effective routine for the constrained nonlinear optimization problem that jointly conducts tracking and social grouping. We also show that social grouping context helps head pose estimation, which is challenging due to the small head images in typical high-angle surveillance videos. A Conditional Random Field (CRF) framework is used to perform group head pose labeling, in which interactions among group members are encoded. The model generalizes existing methods that only focus on individuals, and allows exact learning and inference.
520    $a We further explore temporal context for an important computer vision task, i.e., video event localization and recognition. We study a new model from machine learning, called the Piecewise-constant Conditional Intensity Model (PCIM), which is able to model complex dependencies in general event streams. We first develop a general-purpose inference algorithm for PCIMs by designing an auxiliary Gibbs sampler. The sampler alternates between sampling a finite set of auxiliary virtual events with adaptive rates and performing an efficient forward-backward pass at discrete times to generate samples. We show that our sampler is the first in the literature to successfully perform inference tasks in both Markovian and non-Markovian PCIM models, and that it can be employed in Expectation-Maximization-based parameter estimation and structure learning for PCIMs with partially observed data. We then show that the problem of video event localization and recognition can be modeled as the inference of high-level events given low-level observations in a PCIM. Our approach provides a principled way to learn an interpretable model that utilizes dependencies among events (both high-level and low-level), while existing methods mainly focus on local information. We observe that temporal context helps to mitigate visual ambiguities, especially between events with similar local appearances.
590    $a School code: 0032.
650  4 $a Computer science. $3 523869
690    $a 0984
710 2  $a University of California, Riverside. $b Computer Science. $3 1680199
773 0  $t Dissertation Abstracts International $g 77-01B(E).
790    $a 0032
791    $a Ph.D.
792    $a 2015
793    $a English
856 40 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3721547
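
The second abstract paragraph (520) describes labeling head poses jointly over a social group with a Conditional Random Field that encodes interactions among group members and allows exact inference. As a rough illustration only, not the dissertation's implementation, the sketch below performs exact MAP inference by brute-force enumeration over a small group's pose labels; the unary appearance scores, the pairwise interaction term, and all names are assumptions.

# Hypothetical sketch: group head-pose labeling as a small pairwise CRF.
# Unary log-scores stand in for a per-person appearance classifier;
# the pairwise term stands in for interactions among group members.
# Exact MAP inference by enumeration is feasible only for small groups.
import itertools
import numpy as np

POSES = range(8)  # 8 discretized head-pose bins (an assumption)

def map_group_poses(unary, pairwise):
    """unary: (n_people, n_poses) log-scores from an appearance model.
    pairwise: (n_poses, n_poses) log-compatibility between two members.
    Returns the jointly most probable pose assignment for the group."""
    n = unary.shape[0]
    best_score, best_assign = -np.inf, None
    for assign in itertools.product(POSES, repeat=n):
        score = sum(unary[i, p] for i, p in enumerate(assign))
        # add the pairwise interaction term for every pair in the group
        for i in range(n):
            for j in range(i + 1, n):
                score += pairwise[assign[i], assign[j]]
        if score > best_score:
            best_score, best_assign = score, assign
    return best_assign

# Toy usage: 3 people, random unary scores, a mild pose-similarity prior.
rng = np.random.default_rng(0)
unary = rng.normal(size=(3, 8))
pairwise = -0.1 * np.abs(np.subtract.outer(np.arange(8), np.arange(8)))
print(map_group_poses(unary, pairwise))

Enumeration is exponential in group size, which is tolerable here only because surveillance groups are small; the abstract's exact-inference claim presumably rests on similarly bounded group structure.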
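The third abstract paragraph (520) describes an auxiliary Gibbs sampler for PCIMs that alternates between sampling virtual events with adaptive rates and a forward-backward pass. The toy sketch below shows only the virtual-event half of that idea, as classic thinning for a single, fully observed event stream with a piecewise-constant, history-dependent intensity; the rates and the history rule are invented for illustration and are not the dissertation's model.

# Hypothetical sketch: sampling an event stream with a piecewise-constant,
# history-dependent intensity by thinning. Candidate (virtual) events are
# drawn from a dominating constant rate, then each is kept with probability
# intensity(t | history) / rate_max.
import random

def intensity(t, history):
    """Toy PCIM-style rule: baseline rate 1.0, boosted to 3.0 for one
    time unit after the most recent event."""
    if history and t - history[-1] < 1.0:
        return 3.0
    return 1.0

def sample_events(t_end, rate_max=3.0, seed=0):
    random.seed(seed)
    t, history = 0.0, []
    while True:
        # propose a virtual event from the dominating Poisson(rate_max)
        t += random.expovariate(rate_max)
        if t >= t_end:
            return history
        # accept it as a real event with probability intensity/rate_max
        if random.random() < intensity(t, history) / rate_max:
            history.append(t)

print(sample_events(10.0))

The dissertation's sampler is more involved: it handles partially observed data and adds the forward-backward pass at discrete times, which this sketch omits.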
Holdings:
Barcode: W9299546
Location: Electronic resources
Circulation category: 11.線上閱覽_V (online access)
Material type: e-book
Call number: EB
Use type: Normal
Loan status: On shelf
Holds: 0