Dong Hwa University Library

Large-Scale Semi-Supervised Learning in Visual Recognition.

Record type:	Bibliographic - Electronic resource : Monograph/item
Title/Author:	Large-Scale Semi-Supervised Learning in Visual Recognition.
Author:	Yang, Lei.
Publisher:	Ann Arbor : ProQuest Dissertations & Theses, 2020
Extent:	154 p.
Note:	Source: Dissertations Abstracts International, Volume: 83-03, Section: B.
Contained By:	Dissertations Abstracts International, 83-03B.
Subject:	Computer engineering.
Electronic resource:	http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=28736044
ISBN:	9798535514529

Thesis (Ph.D.)--Hong Kong University of Science and Technology (Hong Kong), 2020.

Semi-supervised learning has long been an important research topic in both machine learning and computer vision. Although deep learning has achieved great success in recent years, its performance depends on enormous amounts of labeled data. A large amount of unlabeled data can easily be collected from the Internet, but annotating it is extremely time-consuming and expensive. Semi-supervised learning aims to leverage a small amount of labeled data together with a large amount of unlabeled data. Recent studies have shown that a well-designed semi-supervised learning method can improve the performance of visual recognition while significantly reducing the cost of annotation. Despite these encouraging results, most existing works rely on three important assumptions: (1) there is no modality gap between unlabeled data and labeled data; (2) only a small amount of noise exists in the unlabeled data; (3) every unlabeled sample belongs to one of the categories in the labeled data. Yet these assumptions do not always hold in large-scale real-world settings. First, there is no guarantee that labeled data and unlabeled data come from the same modality. Second, the collection process for unlabeled data inevitably introduces a lot of unknown noise. Third, unlabeled data do not necessarily share the same classes as the labeled data. Violating any of these assumptions may severely degrade the performance of existing approaches. Hence, this thesis attempts to tackle these three challenges in large-scale semi-supervised learning.

This thesis is divided into four parts. The first two parts address the modality gap and unknown noise in closed-set semi-supervised learning, where unlabeled data and labeled data share the same categories. The last two parts propose solutions for open-set semi-supervised learning, where the categories of unlabeled data and labeled data are not necessarily the same.

In the first part of this thesis, to deal with the modality gap between unlabeled data and labeled data, we propose a cross-modality matching algorithm. The algorithm focuses on the problems caused by two imbalanced modalities: the information in one modality is significantly weaker than that in the other, and there is no one-to-one correspondence between the data in the two modalities. Specifically, we propose a new framework for cross-modality recognition, which derives a conditional distribution that bridges both modalities via adversarial learning. Given the features of a sample from the weak modality, the framework generates a conditional distribution over features in the feature space of the strong modality, which effectively turns one-to-one matching into many-to-one matching. Experiments on several datasets demonstrate the effectiveness of the proposed framework.

In the second part of this thesis, to deal with noise in unlabeled data, we propose a framework for reliable label propagation on noisy graphs. This framework incorporates (1) a local graph neural network that predicts accurately on varying local structures while maintaining high scalability and (2) a patch-based path scheduler that advances the propagation frontier in a prudent way. These components are closely coupled and learned end-to-end. Experiments show that our method can significantly improve the reliability and accuracy of label propagation when the unlabeled data contain a lot of noise. This work was published at the European Conference on Computer Vision (ECCV) 2020.

In the third part of this thesis, to deal with the difference in categories between unlabeled data and labeled data, we propose a supervised clustering framework. In contrast to traditional rule-based unsupervised clustering algorithms, our method is a supervised clustering algorithm, which makes it fundamentally different from previous approaches. Our method first formulates clustering within a detection and segmentation paradigm, and leverages graph convolutional neural networks to capture common cluster patterns from data. The pseudo labels of the unlabeled data obtained through clustering can then be used in a supervised manner to further improve the performance of the recognition model. This part of the work was published at the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2019.

In the fourth part of this thesis, we improve the supervised clustering framework in terms of efficiency and accuracy. Although supervised clustering has been shown to improve clustering performance effectively, existing methods usually organize the data as a large number of subgraphs, which leads to two important problems. On the one hand, the generation or aggregation of subgraphs depends on hand-crafted algorithms, which limits the upper bound of the learnable clustering. On the other hand, the subgraphs often overlap heavily, resulting in a lot of computational redundancy. Therefore, these methods still have room for improvement in both accuracy and efficiency. We decompose the clustering problem into the estimation of vertex confidence and edge connectivity, and leverage graph convolutional neural networks to automatically capture common patterns in vertices and edges from the data. This method not only greatly reduces the number of subgraphs, but also removes the limitation of the heuristic algorithm, achieving higher efficiency and accuracy. This part of the work was published at the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2020.

We have designed a large number of experiments to evaluate the effectiveness of the proposed algorithms in challenging large-scale real-world settings, and compare them systematically with other representative methods. Experimental results demonstrate that the proposed algorithms can enhance the reliability, efficiency, and accuracy of both closed-set and open-set semi-supervised learning in large-scale settings.
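
The abstract states that the fourth part decomposes clustering into the estimation of vertex confidence and edge connectivity, but it does not say how clusters are read off from those two quantities. The Python sketch below illustrates one plausible reading, and it is an assumption rather than the thesis's algorithm: each vertex is linked to its most strongly connected neighbour of higher confidence, and the resulting connected groups are treated as clusters. The function name, the `min_connectivity` threshold, and the hand-written scores are all hypothetical; in the thesis the scores are estimated by graph convolutional networks.

```python
from typing import Dict, List, Tuple


def cluster_by_confidence_and_connectivity(
    confidence: List[float],
    edges: Dict[Tuple[int, int], float],
    min_connectivity: float = 0.5,  # hypothetical threshold, not from the thesis
) -> List[int]:
    """Return one cluster id per vertex.

    confidence[i] : illustrative confidence score of vertex i.
    edges[(i, j)] : illustrative connectivity score of the undirected edge i-j.
    """
    n = len(confidence)
    parent = list(range(n))  # union-find forest over the vertices

    def find(x: int) -> int:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # For every vertex, remember its strongest edge to a more confident neighbour.
    best: Dict[int, Tuple[float, int]] = {}
    for (i, j), score in edges.items():
        if score < min_connectivity or confidence[i] == confidence[j]:
            continue  # weak edge, or no clear direction between the endpoints
        low, high = (i, j) if confidence[i] < confidence[j] else (j, i)
        if low not in best or score > best[low][0]:
            best[low] = (score, high)

    # Merge each vertex with the neighbour chosen above.
    for low, (_, high) in best.items():
        parent[find(low)] = find(high)

    # Relabel the union-find roots as consecutive cluster ids.
    roots: Dict[int, int] = {}
    return [roots.setdefault(find(v), len(roots)) for v in range(n)]


# Toy input: five vertices on a chain, with one weak edge between vertices 2 and 3.
confidence = [0.9, 0.6, 0.5, 0.8, 0.4]
edges = {(0, 1): 0.9, (1, 2): 0.8, (2, 3): 0.2, (3, 4): 0.7}
labels = cluster_by_confidence_and_connectivity(confidence, edges)
print(labels)  # [0, 0, 0, 1, 1]
```

In this toy input the weak edge between vertices 2 and 3 falls below the threshold, so the chain splits into two clusters. Each vertex keeps at most one outgoing link, so no overlapping subgraphs have to be generated or aggregated.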

ISBN: 9798535514529
Subjects--Topical Terms:

621879
Computer engineering.
Subjects--Index Terms:

Semi-supervised learning

LDR:07116nmm a2200373 4500
001 2344679
005 20220531064624.5
008 241004s2020 ||||||||||||||||| ||eng d
020 $a 9798535514529
035 $a (MiAaPQ)AAI28736044
035 $a AAI28736044
040 $a MiAaPQ $c MiAaPQ
100 1 $a Yang, Lei. $3 1270011
245 1 0 $a Large-Scale Semi-Supervised Learning in Visual Recognition.
260 1 $a Ann Arbor : $b ProQuest Dissertations & Theses, $c 2020
300 $a 154 p.
500 $a Source: Dissertations Abstracts International, Volume: 83-03, Section: B.
500 $a Advisor: Lin, Dahua.
502 $a Thesis (Ph.D.)--Hong Kong University of Science and Technology (Hong Kong), 2020.
590 $a School code: 1223.
650 4 $a Computer engineering. $3 621879
650 4 $a Computer science. $3 523869
650 4 $a Information technology. $3 532993
650 4 $a Artificial intelligence. $3 516317
650 4 $a Information science. $3 554358
650 4 $a Accuracy. $3 3559958
650 4 $a Active learning. $3 527777
650 4 $a Internet. $3 527226
650 4 $a Experiments. $3 525909
650 4 $a Neural networks. $3 677449
650 4 $a Confidence. $3 682645
650 4 $a Noise. $3 598816
650 4 $a Methods. $3 3560391
650 4 $a Algorithms. $3 536374
650 4 $a Clustering. $3 3559215
650 4 $a Ablation. $3 3562462
653 $a Semi-supervised learning
653 $a Visual recognition
653 $a Computer vision
653 $a Machine learning
690 $a 0489
690 $a 0984
690 $a 0464
690 $a 0723
690 $a 0800
710 2 $a Hong Kong University of Science and Technology (Hong Kong). $3 1022235
773 0 $t Dissertations Abstracts International $g 83-03B.
790 $a 1223
791 $a Ph.D.
792 $a 2020
793 $a English
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=28736044