National Dong Hwa University Library

Integration of content and context modalities for multimedia big data retrieval.

Record type:	Bibliographic - language material, printed : Monograph/item
Title/Author:	Integration of content and context modalities for multimedia big data retrieval./
Author:	Zhu, Qiusha.
Physical description:	177 p.
Notes:	Source: Dissertation Abstracts International, Volume: 75-10(E), Section: B.
Contained By:	Dissertation Abstracts International 75-10B(E).
Subject:	Engineering, Computer.
Electronic resource:	http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3627341
ISBN:	9781321025095

Thesis (Ph.D.)--University of Miami, 2014.

With the proliferation of digital photo-capture devices and the development of web technologies, the era of big data has arrived, which poses challenges for processing and retrieving vast amounts of data with heterogeneous and diverse dimensionality. In the field of multimedia information retrieval, traditional keyword-based approaches perform well on text data, but they can hardly adapt to images and videos because a large proportion of this data nowadays is unorganized. This means the textual descriptions of images or videos, also known as metadata, could be unavailable, incomplete or even incorrect. Therefore, Content-Based Multimedia Information Retrieval (CBMIR) has emerged, which retrieves relevant images or videos by analyzing their visual content. Various data mining techniques, such as feature selection, classification, clustering and filtering, have been utilized in CBMIR to solve issues involving data imbalance, data quality and size, limited ground truth, user subjectivity, etc. However, as an intrinsic problem of CBMIR, the semantic gap between low-level visual features and high-level semantics is still difficult to conquer. Now, the rapid popularization of social media repositories, which allow users to upload images and videos and assign tags to describe them, has brought new directions as well as new challenges to the area of multimedia information retrieval. As suggested by the name, multimedia is a combination of different content forms that include text, audio, images, videos, etc. A series of research studies have been conducted to take advantage of one modality to compensate for the other in various tasks.

ISBN: 9781321025095

Subjects--Topical Terms:
1669061 Engineering, Computer.

Integration of content and context modalities for multimedia big data retrieval.
LDR:05525nam a2200301 4500
001 1963106
005 20140924121927.5
008 150210s2014 ||||||||||||||||| ||eng d
020 $a 9781321025095
035 $a (MiAaPQ)AAI3627341
035 $a AAI3627341
040 $a MiAaPQ $c MiAaPQ
100 1 $a Zhu, Qiusha. $3 2099282
245 1 0 $a Integration of content and context modalities for multimedia big data retrieval.
300 $a 177 p.
500 $a Source: Dissertation Abstracts International, Volume: 75-10(E), Section: B.
500 $a Adviser: Mei-Ling Shyu.
502 $a Thesis (Ph.D.)--University of Miami, 2014.
520 $a With the proliferation of digital photo-capture devices and the development of web technologies, the era of big data has arrived, which poses challenges for processing and retrieving vast amounts of data with heterogeneous and diverse dimensionality. In the field of multimedia information retrieval, traditional keyword-based approaches perform well on text data, but they can hardly adapt to images and videos because a large proportion of this data nowadays is unorganized. This means the textual descriptions of images or videos, also known as metadata, could be unavailable, incomplete or even incorrect. Therefore, Content-Based Multimedia Information Retrieval (CBMIR) has emerged, which retrieves relevant images or videos by analyzing their visual content. Various data mining techniques, such as feature selection, classification, clustering and filtering, have been utilized in CBMIR to solve issues involving data imbalance, data quality and size, limited ground truth, user subjectivity, etc. However, as an intrinsic problem of CBMIR, the semantic gap between low-level visual features and high-level semantics is still difficult to conquer. Now, the rapid popularization of social media repositories, which allow users to upload images and videos and assign tags to describe them, has brought new directions as well as new challenges to the area of multimedia information retrieval. As suggested by the name, multimedia is a combination of different content forms that include text, audio, images, videos, etc. A series of research studies have been conducted to take advantage of one modality to compensate for the other in various tasks.
520 $a The framework proposed in this dissertation focuses on integrating visual information and text information, which are referred to as the content and the context modalities respectively, for multimedia big data retrieval. The framework contains two components, namely MCA-based feature selection and sparse linear integration. First, a feature selection method based on Multiple Correspondence Analysis (MCA) is proposed to select features having high correlations with a given class, since these features can provide more discriminative information when predicting class labels. This is especially useful for the context modality, since the tags assigned to the images or videos by users are known to be very noisy. Selecting discriminative tags can not only remove noise but also reduce feature dimensions. Because MCA is a technique used to analyze nominal features, a discretization method based on MCA is developed accordingly to handle numeric features. Then the sparse linear integration component takes the selected features from the modalities as inputs and builds a model that learns a pairwise instance similarity matrix. An optimization problem is formulated to minimize the differences between the similarity matrix generated from the context modality and the differences between the similarity matrix generated from the content modality. Coordinate descent and soft-thresholding can be applied to solve the problem. Compared to the existing approaches, the proposed framework is able to handle noisy and high-dimensional features in each of the modalities. Feature correlations are taken into account and no local decision or handcrafted structure is required. The methods presented in this framework can be carried out in parallel, so parallel and distributed programming frameworks, such as MapReduce, can be adopted to improve the computing capacity and scale to very large data sets.
In the experiments, multiple public benchmark data sets, including collections of images and videos, are used to evaluate each of the components. Comparison with some existing popular approaches verifies the effectiveness of the proposed methods for the task of semantic concept retrieval.
520 $a Two applications using the proposed methods for content-based recommender systems are presented. The first one uses the sparse linear integration model to find similar items by considering the information from both images and their metadata. Experiments and a subjective evaluation are conducted on a self-collected bag data set for online shopping recommendations. The second one applies a topic model to the features extracted from videos and their metadata to determine topics in a unified manner. This application recommends movies with similar distributions in textual topics and visual topics to the users. The benchmark MovieLens1M data set is used for evaluation. Several research directions are identified to improve the framework for various practical challenges.
590 $a School code: 0125.
650 4 $a Engineering, Computer. $3 1669061
650 4 $a Information Technology. $3 1030799
690 $a 0464
690 $a 0489
710 2 $a University of Miami. $b Electrical and Computer Engineering (Engineering). $3 2092125
773 0 $t Dissertation Abstracts International $g 75-10B(E).
790 $a 0125
791 $a Ph.D.
792 $a 2014
793 $a English
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3627341
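
Illustrative note: the abstract states that coordinate descent and soft-thresholding can be applied to solve the sparse linear integration problem, but this record does not give the objective function. The sketch below is therefore only an assumption-laden illustration of those two techniques on a generic L1-regularized least-squares (lasso) problem; it is not the dissertation's formulation. The names `soft_threshold`, `lasso_coordinate_descent`, `l1_penalty`, and `n_sweeps` are hypothetical.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (proximal operator of the L1 norm)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, l1_penalty, n_sweeps=100):
    """Minimize 0.5 * ||y - X w||^2 + l1_penalty * ||w||_1 by cyclic coordinate descent.

    X is a list of rows (n_samples x n_features); y is a list of targets.
    This generic objective is an assumption, not the one used in the dissertation.
    """
    n_samples, n_features = len(X), len(X[0])
    w = [0.0] * n_features
    residual = list(y)  # residual = y - X w, and w starts at zero
    col_sq_norm = [
        sum(X[i][j] ** 2 for i in range(n_samples)) for j in range(n_features)
    ]

    for _ in range(n_sweeps):
        for j in range(n_features):
            if col_sq_norm[j] == 0.0:
                continue  # an all-zero column cannot change the fit
            # Correlate column j with the residual that excludes w[j]'s contribution.
            rho = sum(
                X[i][j] * (residual[i] + X[i][j] * w[j]) for i in range(n_samples)
            )
            # Closed-form one-dimensional minimizer: shrink, then rescale.
            new_wj = soft_threshold(rho, l1_penalty) / col_sq_norm[j]
            delta = new_wj - w[j]
            if delta != 0.0:
                for i in range(n_samples):
                    residual[i] -= X[i][j] * delta
                w[j] = new_wj
    return w
```

With an identity design matrix, each coefficient is simply the soft-thresholded target, so small coefficients are set exactly to zero. That sparsity is the property that makes this kind of solver a natural fit for noisy, high-dimensional tag features such as those the abstract describes.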