Diffusion-based Approaches to Visualization and Exploration of High-Dimensional Data.
Record type:
Bibliographic record, electronic resource : Monograph/item
Title proper / Author:
Diffusion-based Approaches to Visualization and Exploration of High-Dimensional Data.
Author:
Gigante, Scott Anthony.
Publisher:
Ann Arbor : ProQuest Dissertations & Theses, 2021
Extent:
236 p.
Note:
Source: Dissertations Abstracts International, Volume: 83-02, Section: B.
Contained By:
Dissertations Abstracts International 83-02B.
Subject:
Applied mathematics.
Electronic resource:
https://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=28322353
ISBN:
9798534652307
Gigante, Scott Anthony.
Diffusion-based Approaches to Visualization and Exploration of High-Dimensional Data.
- Ann Arbor : ProQuest Dissertations & Theses, 2021 - 236 p.
Source: Dissertations Abstracts International, Volume: 83-02, Section: B.
Thesis (Ph.D.)--Yale University, 2021.
This item must not be sold to any third party vendors.
In recent years, modern technologies have enabled the collection of exponentially larger quantities of data in the biomedical domain and elsewhere. In particular, the advent of single-cell genomics has allowed for the collection of datasets containing hundreds of thousands of cells measured in tens of thousands of dimensions. This rapid expansion of common datasets beyond the possibility of manual annotation brings forth the need for large-scale exploratory data analysis. In this thesis, we explore the problem of dimensionality reduction for visualization of high-dimensional datasets. Visualization of high-dimensional data is an essential task in exploratory data analysis, as the low-dimensional visualization of the data is used to understand, interrogate and present the results of many other analyses applied to the data. However, the repertoire of existing algorithms used for this task suffers from various algorithmic flaws leading to sub-optimal visualizations, including the trade-off between representing both local and global structure; the inherent sacrifices that must be made to reduce a dataset of intrinsic dimension greater than three to a form which can be interpreted by the human eye; and the computational cost as datasets increase in scale. Here, we use the framework provided by diffusion maps to present a new dimensionality reduction algorithm called PHATE, which seeks to address all three of these issues. In order to make the PHATE algorithm scalable, we present an approximation of the diffusion map through discrete partitions of the data called Compression-based Fast Diffusion Maps. Further, we use the insights gained from visualizing single-cell genomics data to present a manifold alignment algorithm called Harmonic Alignment, which allows for the correction of systemic differences between experiments, or the fusion of datasets collected from the same biological system using different assays. Finally, we present an extension of PHATE to longitudinal data, and demonstrate its utility for the purpose of machine learning interpretability by visualizing the hidden units of a neural network in training. While many open problems remain, the methods presented herein chart a path towards a more systematic understanding of how we visualize high-dimensional data for exploratory data analysis.
ISBN: 9798534652307
Subjects--Topical Terms:
2122814
Applied mathematics.
Subjects--Index Terms:
Big data
LDR 03790nmm a2200457 4500
001 2281932
005 20210927083432.5
008 220723s2021 ||||||||||||||||| ||eng d
020 $a 9798534652307
035 $a (MiAaPQ)AAI28322353
035 $a AAI28322353
040 $a MiAaPQ $c MiAaPQ
100 1 $a Gigante, Scott Anthony. $3 3560638
245 1 0 $a Diffusion-based Approaches to Visualization and Exploration of High-Dimensional Data.
260 1 $a Ann Arbor : $b ProQuest Dissertations & Theses, $c 2021
300 $a 236 p.
500 $a Source: Dissertations Abstracts International, Volume: 83-02, Section: B.
500 $a Advisor: Krishnaswamy, Smita; Coifman, Ronald R.
502 $a Thesis (Ph.D.)--Yale University, 2021.
506 $a This item must not be sold to any third party vendors.
520 $a In recent years, modern technologies have enabled the collection of exponentially larger quantities of data in the biomedical domain and elsewhere. In particular, the advent of single-cell genomics has allowed for the collection of datasets containing hundreds of thousands of cells measured in tens of thousands of dimensions. This rapid expansion of common datasets beyond the possibility of manual annotation brings forth the need for large-scale exploratory data analysis. In this thesis, we explore the problem of dimensionality reduction for visualization of high-dimensional datasets. Visualization of high-dimensional data is an essential task in exploratory data analysis, as the low-dimensional visualization of the data is used to understand, interrogate and present the results of many other analyses applied to the data. However, the repertoire of existing algorithms used for this task suffers from various algorithmic flaws leading to sub-optimal visualizations, including the trade-off between representing both local and global structure; the inherent sacrifices that must be made to reduce a dataset of intrinsic dimension greater than three to a form which can be interpreted by the human eye; and the computational cost as datasets increase in scale. Here, we use the framework provided by diffusion maps to present a new dimensionality reduction algorithm called PHATE, which seeks to address all three of these issues. In order to make the PHATE algorithm scalable, we present an approximation of the diffusion map through discrete partitions of the data called Compression-based Fast Diffusion Maps. Further, we use the insights gained from visualizing single-cell genomics data to present a manifold alignment algorithm called Harmonic Alignment, which allows for the correction of systemic differences between experiments, or the fusion of datasets collected from the same biological system using different assays. Finally, we present an extension of PHATE to longitudinal data, and demonstrate its utility for the purpose of machine learning interpretability by visualizing the hidden units of a neural network in training. While many open problems remain, the methods presented herein chart a path towards a more systematic understanding of how we visualize high-dimensional data for exploratory data analysis.
590 $a School code: 0265.
650 4 $a Applied mathematics. $3 2122814
650 4 $a Computer science. $3 523869
650 4 $a Bioinformatics. $3 553671
650 4 $a Biomedical engineering. $3 535387
650 4 $a Artificial intelligence. $3 516317
650 4 $a Information science. $3 554358
650 4 $a DNA methylation. $3 3560639
650 4 $a Datasets. $3 3541416
650 4 $a Signal processing. $3 533904
650 4 $a Approximation. $3 3560410
650 4 $a Data analysis. $2 bisacsh $3 3515250
650 4 $a Genomes. $3 592593
650 4 $a Biology. $3 522710
650 4 $a Harmonic analysis. $3 555704
650 4 $a Genomics. $3 600531
650 4 $a Bibliographic literature. $3 3560640
650 4 $a Visualization. $3 586179
650 4 $a Gene expression. $3 643979
650 4 $a Experiments. $3 525909
650 4 $a Neural networks. $3 677449
650 4 $a Maps. $3 544078
650 4 $a Design. $3 518875
650 4 $a Diffusion. $3 536283
650 4 $a Algorithms. $3 536374
650 4 $a Biotechnology. $3 571461
653 $a Big data
653 $a Data visualization
653 $a Dimensionality reduction
653 $a Machine learning
653 $a PHATE
653 $a Single-cell genomics
653 $a Potential of Heat-diffusion for Affinity-based Transition Embedding
690 $a 0364
690 $a 0984
690 $a 0715
690 $a 0541
690 $a 0800
690 $a 0723
690 $a 0389
690 $a 0306
710 2 $a Yale University. $b Computational Biology and Bioinformatics. $3 3549849
773 0 $t Dissertations Abstracts International $g 83-02B.
790 $a 0265
791 $a Ph.D.
792 $a 2021
793 $a English
856 4 0 $u https://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=28322353
Holdings:
Barcode: W9433665
Location: Electronic resource
Circulation category: 11.線上閱覽_V (online viewing)
Material type: E-book
Call number: EB
Use type: Normal
Loan status: On shelf
Hold status: 0