Joint Reasoning for Camera and 3D Human Pose Estimation.
Record type: Bibliographic, electronic resource : Monograph/item
Title: Joint Reasoning for Camera and 3D Human Pose Estimation.
Author: Xu, Yan.
Description: 1 online resource (136 pages)
Notes: Source: Dissertations Abstracts International, Volume: 84-07, Section: B.
Contained By: Dissertations Abstracts International, 84-07B.
Subject: Computer science.
Electronic resource: http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=30000857 (click for full text, PQDT)
ISBN: 9798368420783
Joint Reasoning for Camera and 3D Human Pose Estimation.
Xu, Yan.
Joint Reasoning for Camera and 3D Human Pose Estimation.
- 1 online resource (136 pages)
Source: Dissertations Abstracts International, Volume: 84-07, Section: B.
Thesis (Ph.D.)--Carnegie Mellon University, 2022.
Includes bibliographical references
Estimating the 6-DoF camera pose and the 3D human pose lies at the core of many computer vision tasks, such as virtual reality, augmented reality, and human-robot interaction. Existing efforts either rely on large amounts of 3D training data for each new scene or require strong prior knowledge, e.g., known camera poses, only available in laboratory environments. Despite improved numbers on a few public datasets, the gap between laboratory research and real-world applications remains. The objective of this thesis is to develop camera and human pose estimation methods that can bridge this gap.

This thesis includes two parts. The first part focuses on camera pose estimation using human information. We first introduce a single-view camera pose estimation method that uses a lightweight network trained only on synthetic 2D human trajectory data to directly regress the camera pose at test time using real human trajectories. After that, we present a wide-baseline multi-view camera pose estimation method that treats humans as key points and uses a re-ID network pre-trained on public datasets to embed human features for solving cross-view matching. We show that both methods do not require 3D data collection and annotation and generalize to new scenarios without extra effort.

The second part of this thesis concentrates on multi-view multi-person 3D human pose estimation, targeting the challenging setting where the camera poses are unknown. We present a method that follows the detection-matching-reconstruction process and treats cross-view matching as a clustering problem with the number of humans and cameras as constraints. Compared with existing methods, ours is one of the first that does not require camera poses, 3D data collection, or model training for each specific dataset. Next, we further improve the method by introducing a multi-step clustering mechanism and leveraging short-term single-view tracking to boost cross-view matching performance. Our method shows excellent generalization ability across various in-the-wild settings.
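The cross-view matching step described in the abstract can be pictured, in simplified form, as appearance-based clustering under the constraint that a cluster holds at most one detection per camera. The Python sketch below is only an illustration of that idea, not the author's implementation: the function name cross_view_match, the greedy merge rule, the toy embeddings, and the assumed number of people are all invented for the example.

import numpy as np

def cross_view_match(embeddings, camera_ids, num_people):
    """Greedily merge detections across cameras by embedding similarity,
    never merging two detections that come from the same camera."""
    clusters = [[i] for i in range(len(embeddings))]
    feats = [np.asarray(embeddings[i], dtype=float) for i in range(len(embeddings))]

    def cams(cluster):
        return {camera_ids[i] for i in cluster}

    while len(clusters) > num_people:
        best = None  # (similarity, index_a, index_b)
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                if cams(clusters[a]) & cams(clusters[b]):
                    continue  # constraint: at most one detection per camera
                sim = float(np.dot(feats[a], feats[b])
                            / (np.linalg.norm(feats[a]) * np.linalg.norm(feats[b]) + 1e-8))
                if best is None or sim > best[0]:
                    best = (sim, a, b)
        if best is None:
            break  # no feasible merge remains
        _, a, b = best
        merged = clusters[a] + clusters[b]
        merged_feat = np.mean([np.asarray(embeddings[i], dtype=float) for i in merged], axis=0)
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
        feats = [f for k, f in enumerate(feats) if k not in (a, b)] + [merged_feat]
    return clusters

# Toy data: three cameras, two people, embeddings chosen so the grouping is obvious.
emb = [[1.0, 0.0], [0.0, 1.0],   # camera 0
       [0.9, 0.1], [0.1, 0.9],   # camera 1
       [1.0, 0.1], [0.0, 1.0]]   # camera 2
cams_per_det = [0, 0, 1, 1, 2, 2]
print(cross_view_match(emb, cams_per_det, num_people=2))
# Prints two clusters, each holding one detection index from every camera.

Here the greedy merge merely stands in for the thesis's multi-step clustering; in the described method the similarities come from re-ID embeddings and the camera and person counts act as the clustering constraints.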
Electronic reproduction. Ann Arbor, Mich. : ProQuest, 2023.
Mode of access: World Wide Web
ISBN: 9798368420783
Subjects--Topical Terms: Computer science.
Subjects--Index Terms: 3D human pose estimation
Index Terms--Genre/Form: Electronic books.
Joint Reasoning for Camera and 3D Human Pose Estimation.
LDR  :03360nmm a2200361K 4500
001  2358367
005  20230731112631.5
006  m o d
007  cr mn ---uuuuu
008  241011s2022 xx obm 000 0 eng d
020  $a 9798368420783
035  $a (MiAaPQ)AAI30000857
035  $a AAI30000857
040  $a MiAaPQ $b eng $c MiAaPQ $d NTU
100 1  $a Xu, Yan. $3 1266985
245 10 $a Joint Reasoning for Camera and 3D Human Pose Estimation.
264  0 $c 2022
300  $a 1 online resource (136 pages)
336  $a text $b txt $2 rdacontent
337  $a computer $b c $2 rdamedia
338  $a online resource $b cr $2 rdacarrier
500  $a Source: Dissertations Abstracts International, Volume: 84-07, Section: B.
500  $a Advisor: Kitani, Kris.
502  $a Thesis (Ph.D.)--Carnegie Mellon University, 2022.
504  $a Includes bibliographical references
520  $a Estimating the 6-DoF camera pose and the 3D human pose lies at the core of many computer vision tasks, such as virtual reality, augmented reality, and human-robot interaction. Existing efforts either rely on large amounts of 3D training data for each new scene or require strong prior knowledge, e.g., known camera poses, only available in laboratory environments. Despite improved numbers on a few public datasets, the gap between laboratory research and real-world applications remains. The objective of this thesis is to develop camera and human pose estimation methods that can bridge this gap. This thesis includes two parts. The first part focuses on camera pose estimation using human information. We first introduce a single-view camera pose estimation method that uses a lightweight network trained only on synthetic 2D human trajectory data to directly regress the camera pose at test time using real human trajectories. After that, we present a wide-baseline multi-view camera pose estimation method that treats humans as key points and uses a re-ID network pre-trained on public datasets to embed human features for solving cross-view matching. We show that both methods do not require 3D data collection and annotation and generalize to new scenarios without extra effort. The second part of this thesis concentrates on multi-view multi-person 3D human pose estimation, targeting the challenging setting where the camera poses are unknown. We present a method that follows the detection-matching-reconstruction process and treats cross-view matching as a clustering problem with the number of humans and cameras as constraints. Compared with existing methods, ours is one of the first that does not require camera poses, 3D data collection, or model training for each specific dataset. Next, we further improve the method by introducing a multi-step clustering mechanism and leveraging short-term single-view tracking to boost cross-view matching performance. Our method shows excellent generalization ability across various in-the-wild settings.
533  $a Electronic reproduction. $b Ann Arbor, Mich. : $c ProQuest, $d 2023
538  $a Mode of access: World Wide Web
650  4 $a Computer science. $3 523869
653  $a 3D human pose estimation
653  $a Camera pose estimation
653  $a Multi-view multi-person
655  7 $a Electronic books. $2 lcsh $3 542853
690  $a 0984
690  $a 0800
710 2  $a ProQuest Information and Learning Co. $3 783688
710 2  $a Carnegie Mellon University. $b Electrical and Computer Engineering. $3 2094139
773 0  $t Dissertations Abstracts International $g 84-07B.
856 40 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=30000857 $z click for full text (PQDT)
Holdings
Barcode: W9480723
Location: Electronic resources
Circulation category: 11. Online viewing_V
Material type: E-book
Call number: EB
Usage type: Normal
Loan status: On shelf
Holds: 0