Dong Hwa University Library

Video content analysis and its applications for multimedia authoring of presentations.

Record type:	Bibliographic - language material, printed : Monograph/item
Title/Author:	Video content analysis and its applications for multimedia authoring of presentations./
Author:	Wang, Feng.
Physical description:	153 p.
Notes:	Advisers: Ting-Chuen Pong; Chong-Wah Ngo.
Contained By:	Dissertation Abstracts International 69-01B.
Subject:	Computer Science. -
Electronic resource:	http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3298516
ISBN:	9780549419877

Video content analysis and its applications for multimedia authoring of presentations.
Wang, Feng.

Video content analysis and its applications for multimedia authoring of presentations. - 153 p.

Advisers: Ting-Chuen Pong; Chong-Wah Ngo.

Thesis (Ph.D.)--Hong Kong University of Science and Technology (Hong Kong), 2006.

Video content analysis has attracted growing attention from researchers in recent years, due to the increasing availability of digital video data. In this work, we address the problem of video content analysis by extracting three high-level features from videos, namely text, gesture and head posture, and employ them in several applications for multimedia authoring of presentations based on video understanding.

ISBN: 9780549419877
Subjects--Topical Terms:

626642
Computer Science.

Video content analysis and its applications for multimedia authoring of presentations.
LDR:04311nam 2200325 a 45
001 958744
005 20110704
008 110704s2006 ||||||||||||||||| ||eng d
020 $a 9780549419877
035 $a (UMI)AAI3298516
035 $a AAI3298516
040 $a UMI $c UMI
100 1 $a Wang, Feng. $3 1037905
245 1 0 $a Video content analysis and its applications for multimedia authoring of presentations.
300 $a 153 p.
500 $a Advisers: Ting-Chuen Pong; Chong-Wah Ngo.
500 $a Source: Dissertation Abstracts International, Volume: 69-01, Section: B, page: 0441.
502 $a Thesis (Ph.D.)--Hong Kong University of Science and Technology (Hong Kong), 2006.
520 $a Video content analysis has attracted growing attention from researchers in recent years, due to the increasing availability of digital video data. In this work, we address the problem of video content analysis by extracting three high-level features from videos, namely text, gesture and head posture, and employ them in several applications for multimedia authoring of presentations based on video understanding.
520 $a For text analysis, we address the problem of text recognition in low-resolution videos. A novel algorithm for video text super-resolution is proposed, which reconstructs high-resolution textboxes by integrating multiple frames. Our experiments show that text recognition is significantly improved after super-resolution.
520 $a For gesture detection and recognition, we propose algorithms for both off-line and real-time applications. In the former, to deal with the lack of salient features in gesture detection, different cues including frame difference, skin color and gesture trajectory are combined to detect candidate gestures. HMM (Hidden Markov Model) based gesture recognition is then employed to refine the results of gesture detection and extract intentional gestures. For real-time applications, to meet efficiency requirements as well as accuracy, the HMM models for complete gesture recognition are modified to recognize incomplete gestures, so that a gesture can be identified before the complete trajectory is observed. Speech is combined with visual cues to further improve the accuracy and responsiveness of gesture detection.
520 $a For head posture, two different algorithms are proposed to estimate the face orientation. The first is more appropriate for offline applications and employs visual cues and image processing techniques. In the second algorithm, besides visual cues, we focus on effectively exploiting contextual information, i.e., the temporal smoothness of head movement, to refine the pose estimation. This is especially useful for low-resolution images, where direct estimation from a single image is not reliable enough. We propose an adaptive online learning approach to deal with different presenting styles. The second algorithm is efficient enough for most real-time applications.
520 $a Based on the video content analysis, we employ the extracted features to develop several applications, including the synchronization of video and external documents based on text analysis, offline video enhancement and editing by integrating gesture, posture and text, and a simulated smartboard to show the effectiveness and efficiency of the proposed algorithms. Specifically for video editing, a novel gesture and posture driven editing approach is proposed to trace the flow of lecturing by attending to the focus of lecturing at any moment. Meanwhile, aesthetic elements, which outline the general and basic rules for selecting and adjoining various views of the focuses, are taken into account to generate an appropriate rhythm for showing the dynamic interactions between the presenter and the focuses. To improve the readability of the projected and handwritten words in the edited video, two approaches are also proposed to enhance the visibility of text on the LCD projected screen and on the whiteboard, respectively.
590 $a School code: 1223.
650 4 $a Computer Science. $3 626642
690 $a 0984
710 2 $a Hong Kong University of Science and Technology (Hong Kong). $3 1022235
773 0 $t Dissertation Abstracts International $g 69-01B.
790 $a 1223
790 1 0 $a Ngo, Chong-Wah, $e advisor
790 1 0 $a Pong, Ting-Chuen, $e advisor
791 $a Ph.D.
792 $a 2006
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3298516
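
Illustrative note (not part of the catalog record): the abstract says the second head-posture algorithm exploits the temporal smoothness of head movement to refine per-frame pose estimates. The record gives no details of that algorithm, so the sketch below swaps in plain exponential smoothing as a stand-in. The function name smooth_yaw, the parameter alpha, and the sample values are hypothetical. It also assumes yaw angles stay in a range where wraparound at 360 degrees does not matter.

```python
from typing import Iterable, List


def smooth_yaw(raw_yaw_deg: Iterable[float], alpha: float = 0.5) -> List[float]:
    """Exponentially smooth per-frame yaw estimates given in degrees.

    alpha is a hypothetical blending weight: 1.0 trusts each new frame
    fully, values near 0.0 trust the running estimate instead.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    smoothed: List[float] = []
    for yaw in raw_yaw_deg:
        if not smoothed:
            # The first frame has no history, so it is taken as-is.
            smoothed.append(float(yaw))
        else:
            # Blend the new noisy estimate with the previous smoothed value.
            smoothed.append(alpha * yaw + (1.0 - alpha) * smoothed[-1])
    return smoothed


if __name__ == "__main__":
    # Made-up per-frame estimates; the third frame is an outlier.
    noisy = [0.0, 2.0, 40.0, 3.0, 4.0]
    print(smooth_yaw(noisy, alpha=0.5))
```

With these made-up inputs the outlier at the third frame is pulled from 40.0 down to 20.5, which shows how a single unreliable low-resolution frame is damped by its neighbors. A smaller alpha damps more strongly but reacts more slowly to real head turns.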