Randomized Methods in Optimization: Fast Algorithms and Fundamental Limits.
Record type:
Bibliographic record - electronic resource : Monograph/item
Title/Author:
Randomized Methods in Optimization: Fast Algorithms and Fundamental Limits.
Author:
Lacotte, Jonathan Pierre.
Publisher:
Ann Arbor : ProQuest Dissertations & Theses, 2021.
Description:
218 p.
Notes:
Source: Dissertations Abstracts International, Volume: 83-05, Section: B.
Contained By:
Dissertations Abstracts International, 83-05B.
Subject:
Convex analysis.
Electronic resource:
http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=28827871
ISBN:
9798494461308
Lacotte, Jonathan Pierre. Randomized Methods in Optimization: Fast Algorithms and Fundamental Limits. Ann Arbor : ProQuest Dissertations & Theses, 2021. 218 p.
Thesis (Ph.D.)--Stanford University, 2021.
This item must not be sold to any third party vendors.
High-dimensional machine learning models trained on massive datasets have shown exceptional predictive ability over the last decade. Their adoption has created novel challenges and paradigms for practitioners and researchers. Stochastic first-order methods have become the method of choice for training these models because of their small per-iteration computational cost, yet they are known to be sensitive to hyperparameter choices and come with weak or even no convergence guarantees. Alternative (e.g., second-order) optimization methods have received less attention and have not proved as effective in practice; a key bottleneck usually lies in their computational requirements, which are far more intensive than those of first-order methods. In this thesis, we are interested in random projections, also known as sketching or subsampling methods, in the context of large-scale optimization problems that involve either massive datasets or high-dimensional optimization variables. The purpose of using random projections is to reduce the dimensionality of the data matrix and/or the optimization variable, in order to obtain both faster convergence and smaller memory requirements. Naturally, random projections incur a loss of information, and hence a loss in statistical performance. We consider the fundamental questions that arise when compressing information for faster computation. What is the minimal embedding dimension one can use to project the data while still guaranteeing convergence of an optimization solver? How should random projections of the data be coupled with an optimization solver? And how can one achieve optimal trade-offs between statistical and computational performance through the choice of the random projection and the embedding dimension (a.k.a. sketch size)?
To answer these questions, we first focus on unconstrained convex quadratic objectives and consider a broad class of first-order optimization solvers that use a sketching-based preconditioner. We explore and compare different random projections and classical first-order methods, then gradually extend our analysis and methodology to regularized convex quadratic optimization and to generic convex optimization problems. More concretely, the central contributions of this thesis are as follows:
• carry out a theoretical and numerical comparative study of sketching-based preconditioned first-order methods for quadratic optimization with a dense data matrix, and prescribe a method of choice;
• address the case of regularized convex quadratic objectives and design an adaptive procedure that automatically finds the optimal sketch size while preserving strong convergence guarantees;
• extend our adaptive sketching-based optimization solvers to generic convex optimization problems;
• address the case of high-dimensional optimization variables for generic convex optimization problems, and design a procedure that solves a low-dimensional optimization problem to approximate the high-dimensional optimal solution;
• understand more finely the subspace embedding properties of certain classes of random projections, and develop the technical tools to analyze them.
Finally, we establish precise connections between subsampling methods and neural network training, and propose future research topics along that line.
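The first set of contributions described above builds a preconditioner for a first-order method from a random projection of the data. The following Python sketch is a minimal illustration of that general idea on an unconstrained least-squares (convex quadratic) objective; it is not the thesis's exact algorithm, and the function name, Gaussian sketch, and step size are illustrative assumptions.

```python
# A minimal sketch of a sketching-based preconditioned first-order method for
#   min_x 0.5 * ||A x - b||^2,
# assuming a tall dense data matrix A (n >> d) and a Gaussian random projection.
# Illustrative only; not the thesis's exact algorithm.
import numpy as np

def sketched_precond_gradient(A, b, sketch_size, n_iters=50, step_size=0.5, seed=0):
    n, d = A.shape
    rng = np.random.default_rng(seed)

    # Random projection of the data: S has shape (m, n) with m << n.
    S = rng.standard_normal((sketch_size, n)) / np.sqrt(sketch_size)
    SA = S @ A                                   # sketched data matrix, (m, d)

    # Preconditioner from the sketch: R is the upper Cholesky factor of (SA)^T (SA),
    # so (R^T R)^{-1} approximates (A^T A)^{-1} once the sketch size m is large enough.
    R = np.linalg.cholesky(SA.T @ SA + 1e-12 * np.eye(d)).T

    x = np.zeros(d)
    for _ in range(n_iters):
        grad = A.T @ (A @ x - b)                 # exact gradient of the full objective
        # Preconditioned step: solve (R^T R) s = grad via two triangular systems.
        s = np.linalg.solve(R, np.linalg.solve(R.T, grad))
        # A conservative fixed step size guards against sketching error in the preconditioner.
        x = x - step_size * s
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A = rng.standard_normal((2000, 50))
    b = A @ rng.standard_normal(50) + 0.01 * rng.standard_normal(2000)
    x_hat = sketched_precond_gradient(A, b, sketch_size=400)
    x_ls = np.linalg.lstsq(A, b, rcond=None)[0]
    print("distance to least-squares solution:", np.linalg.norm(x_hat - x_ls))
```

In this construction the sketch size only needs to be a modest multiple of the number of columns for the preconditioned Hessian to be well conditioned, which is one concrete instance of the embedding-dimension question the abstract raises.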
ISBN: 9798494461308
Subjects--Topical Terms: Convex analysis.
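The abstract's other regime, where the optimization variable itself is high-dimensional, can likewise be illustrated with a small hedged example: restrict the variable to a random low-dimensional subspace, solve the resulting low-dimensional problem, and lift the solution back. This is a generic random-subspace (sketch-and-solve) construction under assumed names and parameters, not the thesis's procedure.

```python
# Illustrative sketch: approximate a high-dimensional ridge-regression solution by
# constraining x to a random low-dimensional subspace, x = S^T z, and solving for z.
# Generic construction for illustration only; not the thesis's procedure.
import numpy as np

def random_subspace_ridge(A, b, lam, subspace_dim, seed=0):
    n, d = A.shape
    rng = np.random.default_rng(seed)
    # Random projection of the variable: x is restricted to range(S^T), S: (k, d).
    S = rng.standard_normal((subspace_dim, d)) / np.sqrt(subspace_dim)
    AS = A @ S.T                                  # effective (n, k) design matrix
    # Low-dimensional problem: min_z ||AS z - b||^2 + lam * ||S^T z||^2.
    H = AS.T @ AS + lam * (S @ S.T)               # (k, k) system, cheap for small k
    z = np.linalg.solve(H, AS.T @ b)
    return S.T @ z                                # lift back to d dimensions

def ridge_objective(A, b, lam, x):
    return float(np.linalg.norm(A @ x - b) ** 2 + lam * np.linalg.norm(x) ** 2)

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    n, d = 300, 5000
    A = rng.standard_normal((n, d)) / np.sqrt(d)
    b = rng.standard_normal(n)
    lam = 0.1
    x_sub = random_subspace_ridge(A, b, lam, subspace_dim=600)
    # Exact ridge solution via the dual (n x n) system, for comparison.
    x_exact = A.T @ np.linalg.solve(A @ A.T + lam * np.eye(n), b)
    print("objective in random subspace:", ridge_objective(A, b, lam, x_sub))
    print("objective at exact optimum  :", ridge_objective(A, b, lam, x_exact))
```

Increasing the subspace dimension typically moves the sketched objective toward the exact optimum at higher computational cost, one concrete form of the statistical/computational trade-off discussed in the abstract.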
MARC record:
LDR  04504nmm a2200349 4500
001  2346914
005  20220706051322.5
008  241004s2021 ||||||||||||||||| ||eng d
020  $a 9798494461308
035  $a (MiAaPQ)AAI28827871
035  $a (MiAaPQ)STANFORDnd373pq4764
035  $a AAI28827871
040  $a MiAaPQ $c MiAaPQ
100  1  $a Lacotte, Jonathan Pierre. $3 3686118
245  10 $a Randomized Methods in Optimization: Fast Algorithms and Fundamental Limits.
260  1  $a Ann Arbor : $b ProQuest Dissertations & Theses, $c 2021
300  $a 218 p.
500  $a Source: Dissertations Abstracts International, Volume: 83-05, Section: B.
500  $a Advisor: Pilanci, Mert; Boyd, Stephen P.; Wootters, Mary.
502  $a Thesis (Ph.D.)--Stanford University, 2021.
506  $a This item must not be sold to any third party vendors.
590  $a School code: 0212.
650  4 $a Convex analysis. $3 3681761
650  4 $a Algorithms. $3 536374
650  4 $a Network management systems. $3 3686119
650  4 $a Iterative methods. $3 3686120
650  4 $a Optimization. $3 891104
650  4 $a Neural networks. $3 677449
650  4 $a Artificial intelligence. $3 516317
650  4 $a Computer science. $3 523869
650  4 $a Information science. $3 554358
650  4 $a Mathematics. $3 515831
690  $a 0800
690  $a 0984
690  $a 0723
690  $a 0454
690  $a 0405
710  2 $a Stanford University. $3 754827
773  0 $t Dissertations Abstracts International $g 83-05B.
790  $a 0212
791  $a Ph.D.
792  $a 2021
793  $a English
856  40 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=28827871
Holdings:
Barcode: W9469352
Location: Electronic Resources
Circulation category: 11. Online Reading_V
Material type: E-book
Call number: EB
Use type: Normal
Loan status: On shelf
Hold status: 0