Dong Hwa University Library

Towards Object Detection in the Real World.

Record type:	Bibliographic, electronic resource : Monograph/item
Title/Author:	Towards Object Detection in the Real World./
Author:	Zhu, Chenchen.
Publisher:	Ann Arbor : ProQuest Dissertations & Theses, 2021
Extent:	103 p.
Note:	Source: Dissertations Abstracts International, Volume: 83-03, Section: B.
Contained By:	Dissertations Abstracts International 83-03B.
Subject:	Computer engineering.
Electronic resource:	http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=28650014
ISBN:	9798538139101

Towards Object Detection in the Real World.
Zhu, Chenchen.

Towards Object Detection in the Real World. - Ann Arbor : ProQuest Dissertations & Theses, 2021 - 103 p.

Source: Dissertations Abstracts International, Volume: 83-03, Section: B.

Thesis (Ph.D.)--Carnegie Mellon University, 2021.

This item must not be sold to any third party vendors.

Object detection is one of the most fundamental tasks in the computer vision field, which aims at localizing and classifying instances of semantic objects of certain classes in digital images. Object detection serves as a crucial step for many downstream vision tasks such as action recognition, face analysis, instance segmentation, object re-identification, retail scene understanding, etc. Therefore, it has been carefully studied by the computer vision community for decades. Thanks to the advance of deep neural networks and well-annotated challenging datasets, object detection algorithms have been greatly improved. However, object detectors are still far from robust when deployed in real-world AI applications. The performance can drop dramatically due to the challenging conditions introduced by the varying nature of the real-world data. We summarize the majority of this varying nature as three aspects, i.e. appearance variation, scale variation, and availability variation. In some extreme cases where multiple variations co-exist, the failure of object detectors may even lead to the crash of the entire AI system.

The focus of this thesis is to construct the solutions addressing the mentioned three types of data variations. For the appearance variation, we study the effect of the context information on the detection of the human face, one of the most common objects. We propose an explicit contextual reasoning module for the detection network to capture the local information surrounding the face. For the scale variation challenge, we start with the anchor-based formulation of object detection where the anchor-object matching mechanism is theoretically investigated. This inspires us to propose several better designs of robust anchors. Then we discover the inherent limitations of anchor-based detection, leading to the reformulation of detection from an anchor-free perspective. Advanced techniques for dynamic feature selection are proposed to achieve the goal that less is more. For the availability variation, we address the inherent long-tail distribution of the real-world data by studying object detection in the few-shot setting in which there are some rare classes with only a few annotated objects available while other common classes dominate the dataset with abundant labeled samples. Given limited visual information of the rare classes, we propose semantic relation reasoning with prior knowledge from natural language to take advantage of the constant relationship between common classes and rare classes regardless of the data availability. We thoroughly analyze the effect of proposed techniques by conducting several experiments on challenging real-world datasets, such as WiderFace, VOC, COCO, etc. Comparisons with the previous state of the arts demonstrate the superiority of our methods.

ISBN: 9798538139101

Subjects--Topical Terms:

621879
Computer engineering.
Subjects--Index Terms:

Object detection

Towards Object Detection in the Real World.
LDR:03900nmm a2200349 4500
001 2349590
005 20230509091121.5
006 m o d
007 cr#unu||||||||
008 241004s2021 ||||||||||||||||| ||eng d
020 $a 9798538139101
035 $a (MiAaPQ)AAI28650014
035 $a AAI28650014
040 $a MiAaPQ $c MiAaPQ
100 1 $a Zhu, Chenchen. $3 3689000
245 1 0 $a Towards Object Detection in the Real World.
260 1 $a Ann Arbor : $b ProQuest Dissertations & Theses, $c 2021
300 $a 103 p.
500 $a Source: Dissertations Abstracts International, Volume: 83-03, Section: B.
500 $a Advisor: Savvides, Marios.
502 $a Thesis (Ph.D.)--Carnegie Mellon University, 2021.
506 $a This item must not be sold to any third party vendors.
520 $a Object detection is one of the most fundamental tasks in the computer vision field, which aims at localizing and classifying instances of semantic objects of certain classes in digital images. Object detection serves as a crucial step for many downstream vision tasks such as action recognition, face analysis, instance segmentation, object re-identification, retail scene understanding, etc. Therefore, it has been carefully studied by the computer vision community for decades. Thanks to the advance of deep neural networks and well-annotated challenging datasets, object detection algorithms have been greatly improved. However, object detectors are still far from robust when deployed in real-world AI applications. The performance can drop dramatically due to the challenging conditions introduced by the varying nature of the real-world data. We summarize the majority of this varying nature as three aspects, i.e. appearance variation, scale variation, and availability variation. In some extreme cases where multiple variations co-exist, the failure of object detectors may even lead to the crash of the entire AI system. The focus of this thesis is to construct the solutions addressing the mentioned three types of data variations. For the appearance variation, we study the effect of the context information on the detection of the human face, one of the most common objects. We propose an explicit contextual reasoning module for the detection network to capture the local information surrounding the face. For the scale variation challenge, we start with the anchor-based formulation of object detection where the anchor-object matching mechanism is theoretically investigated. This inspires us to propose several better designs of robust anchors. Then we discover the inherent limitations of anchor-based detection, leading to the reformulation of detection from an anchor-free perspective. Advanced techniques for dynamic feature selection are proposed to achieve the goal that less is more. For the availability variation, we address the inherent long-tail distribution of the real-world data by studying object detection in the few-shot setting in which there are some rare classes with only a few annotated objects available while other common classes dominate the dataset with abundant labeled samples. Given limited visual information of the rare classes, we propose semantic relation reasoning with prior knowledge from natural language to take advantage of the constant relationship between common classes and rare classes regardless of the data availability. We thoroughly analyze the effect of proposed techniques by conducting several experiments on challenging real-world datasets, such as WiderFace, VOC, COCO, etc. Comparisons with the previous state of the arts demonstrate the superiority of our methods.
590 $a School code: 0041.
650 4 $a Computer engineering. $3 621879
650 4 $a Visualization. $3 586179
650 4 $a Sensors. $3 3549539
650 4 $a Neural networks. $3 677449
650 4 $a Semantics. $3 520060
650 4 $a Artificial intelligence. $3 516317
653 $a Object detection
653 $a Real world
690 $a 0464
690 $a 0800
710 2 $a Carnegie Mellon University. $b Electrical and Computer Engineering. $3 2094139
773 0 $t Dissertations Abstracts International $g 83-03B.
790 $a 0041
791 $a Ph.D.
792 $a 2021
793 $a English
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=28650014