A Multitask Learning Encoder-N-Decoder Framework for Movie and Video Descriptions.
Record type: Bibliographic - Electronic resource : Monograph/item
Title: A Multitask Learning Encoder-N-Decoder Framework for Movie and Video Descriptions.
Author: Nina, Oliver.
Publisher: Ann Arbor : ProQuest Dissertations & Theses, 2018
Physical description: 144 p.
Notes: Source: Dissertations Abstracts International, Volume: 80-05, Section: B.
Contained by: Dissertations Abstracts International, 80-05B.
Subject: Computer Engineering.
Electronic resource: http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=11005532
ISBN: 9780438591813
Thesis (Ph.D.)--The Ohio State University, 2018.
This item must not be sold to any third party vendors.
Learning visual feature representations for video analysis is non-trivial and requires a large number of training samples and a proper generalization framework. Many current state-of-the-art methods for video captioning and movie description rely on simple encoding mechanisms through recurrent neural networks to encode temporal visual information extracted from video data. We introduce a novel multitask encoder-n-decoder framework for automatic semantic description and captioning of video sequences. In contrast to current approaches, at training time our method relies on multiple distinct decoders to train a visual encoder in a multitask fashion. Our method shows improved performance over current state-of-the-art methods on several metrics, on both multi-caption and single-caption datasets, and is the first to use a multitask approach for encoding video features. Furthermore, based on human-subject evaluations, our method was ranked the most helpful algorithm for the visually impaired, finishing in first place in the movie captioning task of the Large Scale Movie Description Challenge (LSMDC), held in conjunction with the International Conference on Computer Vision (ICCV) 2017. Our method won the task against other top participating research groups worldwide and is currently the state of the art for automatic description of commercial movies.
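The abstract's central idea — a single shared visual encoder trained jointly against several distinct decoders — can be sketched in a few lines of code. The sketch below is a minimal illustration only, not the dissertation's actual architecture: the GRU encoder, the two decoder heads (a caption decoder plus an auxiliary feature-reconstruction decoder), the layer sizes, and the loss weighting are all assumptions made for demonstration.

# Minimal sketch of a multitask encoder-n-decoder setup (PyTorch).
# Assumptions: frame features are precomputed (e.g., CNN features pooled
# to 2048-d); one decoder generates caption tokens, a second auxiliary
# decoder reconstructs the mean input feature; both losses backpropagate
# into the shared encoder, which is the multitask training idea.
import torch
import torch.nn as nn

class SharedVideoEncoder(nn.Module):
    """Encodes a sequence of frame features into one video vector."""
    def __init__(self, feat_dim=2048, hidden=512):
        super().__init__()
        self.rnn = nn.GRU(feat_dim, hidden, batch_first=True)

    def forward(self, frames):                 # frames: (B, T, feat_dim)
        _, h = self.rnn(frames)                # h: (1, B, hidden)
        return h.squeeze(0)                    # (B, hidden)

class CaptionDecoder(nn.Module):
    """Decodes the video vector into a token sequence (teacher-forced)."""
    def __init__(self, hidden=512, vocab=10000, emb=256):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.rnn = nn.GRU(emb, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab)

    def forward(self, video_vec, tokens):      # tokens: (B, L)
        h0 = video_vec.unsqueeze(0)            # init decoder state from encoder
        y, _ = self.rnn(self.embed(tokens), h0)
        return self.out(y)                     # logits: (B, L, vocab)

class ReconstructionDecoder(nn.Module):
    """Auxiliary task: reconstruct the mean frame feature."""
    def __init__(self, hidden=512, feat_dim=2048):
        super().__init__()
        self.fc = nn.Linear(hidden, feat_dim)

    def forward(self, video_vec):
        return self.fc(video_vec)

# One multitask training step: both decoder losses update the shared encoder.
encoder, cap_dec, rec_dec = SharedVideoEncoder(), CaptionDecoder(), ReconstructionDecoder()
params = list(encoder.parameters()) + list(cap_dec.parameters()) + list(rec_dec.parameters())
opt = torch.optim.Adam(params, lr=1e-4)

frames = torch.randn(4, 30, 2048)              # dummy (batch, time, feature) input
tokens = torch.randint(0, 10000, (4, 12))      # dummy caption token ids

video_vec = encoder(frames)
cap_loss = nn.functional.cross_entropy(
    cap_dec(video_vec, tokens[:, :-1]).reshape(-1, 10000),
    tokens[:, 1:].reshape(-1))
rec_loss = nn.functional.mse_loss(rec_dec(video_vec), frames.mean(dim=1))
loss = cap_loss + 0.5 * rec_loss               # weighted multitask objective (weight assumed)
loss.backward()
opt.step()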
MARC:
LDR  02522nmm a2200337 4500
001  2207846
005  20190923114241.5
008  201008s2018 ||||||||||||||||| ||eng d
020    $a 9780438591813
035    $a (MiAaPQ)AAI11005532
035    $a (MiAaPQ)OhioLINK:osu1531996548147165
035    $a AAI11005532
040    $a MiAaPQ $c MiAaPQ
100 1  $a Nina, Oliver. $3 3434848
245 10 $a A Multitask Learning Encoder-N-Decoder Framework for Movie and Video Descriptions.
260 1  $a Ann Arbor : $b ProQuest Dissertations & Theses, $c 2018
300    $a 144 p.
500    $a Source: Dissertations Abstracts International, Volume: 80-05, Section: B.
500    $a Publisher info.: Dissertation/Thesis.
502    $a Thesis (Ph.D.)--The Ohio State University, 2018.
506    $a This item must not be sold to any third party vendors.
506    $a This item must not be added to any third party search indexes.
520    $a Learning visual feature representations for video analysis is non-trivial and requires a large number of training samples and a proper generalization framework. Many current state-of-the-art methods for video captioning and movie description rely on simple encoding mechanisms through recurrent neural networks to encode temporal visual information extracted from video data. We introduce a novel multitask encoder-n-decoder framework for automatic semantic description and captioning of video sequences. In contrast to current approaches, at training time our method relies on multiple distinct decoders to train a visual encoder in a multitask fashion. Our method shows improved performance over current state-of-the-art methods on several metrics, on both multi-caption and single-caption datasets, and is the first to use a multitask approach for encoding video features. Furthermore, based on human-subject evaluations, our method was ranked the most helpful algorithm for the visually impaired, finishing in first place in the movie captioning task of the Large Scale Movie Description Challenge (LSMDC), held in conjunction with the International Conference on Computer Vision (ICCV) 2017. Our method won the task against other top participating research groups worldwide and is currently the state of the art for automatic description of commercial movies.
590    $a School code: 0168.
650  4 $a Computer Engineering. $3 1567821
650  4 $a Electrical engineering. $3 649834
650  4 $a Computer science. $3 523869
690    $a 0464
690    $a 0544
690    $a 0984
710 2  $a The Ohio State University. $b Electrical and Computer Engineering. $3 1672495
773 0  $t Dissertations Abstracts International $g 80-05B.
790    $a 0168
791    $a Ph.D.
792    $a 2018
793    $a English
856 40 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=11005532
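To work with the MARC record above programmatically, a small parser like the sketch below converts the line-oriented listing into Python dictionaries. It is written only for the display layout used here (tag, optional indicators, then $-prefixed subfields) and is an illustration, not a general MARC21 reader; for production use, an established library such as pymarc against real MARC transmission files would be the safer route.

# Sketch: parse the line-format MARC listing above into dicts.
# Handles only this display layout: "TAG [indicators] $a value $b value ...".
import re

def parse_marc_lines(text):
    fields = []
    for line in text.strip().splitlines():
        tag, _, rest = line.partition(" ")
        rest = rest.strip()
        if tag in ("LDR", "001", "005", "008"):   # control fields: no subfields
            fields.append({"tag": tag, "value": rest})
            continue
        # Anything before the first '$' is the indicator block (may be empty).
        ind, _, subpart = rest.partition("$")
        subfields = re.findall(r"\$(\w)\s*([^$]*)", "$" + subpart)
        fields.append({
            "tag": tag,
            "indicators": ind.strip(),
            "subfields": [(code, val.strip()) for code, val in subfields],
        })
    return fields

sample = """100 1  $a Nina, Oliver. $3 3434848
245 10 $a A Multitask Learning Encoder-N-Decoder Framework for Movie and Video Descriptions."""
for field in parse_marc_lines(sample):
    print(field["tag"], field["subfields"])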
Holdings (1 item):
Barcode: W9384395
Location: Electronic Resources
Circulation category: 11. Online Reading_V
Material type: E-book
Call number: EB
Usage type: Normal
Loan status: On shelf
Holds: 0