Convex Optimization for Neural Networks.
Record Type:
Electronic resources : Monograph/item
Title/Author:
Convex Optimization for Neural Networks.
Author:
Ergen, Tolga.
Description:
1 online resource (266 pages)
Notes:
Source: Dissertations Abstracts International, Volume: 85-03, Section: B.
Contained By:
Dissertations Abstracts International, 85-03B.
Subject:
Neural networks.
Online resource:
http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=30561729 (click for full text, PQDT)
ISBN:
9798380264563
Thesis (Ph.D.)--Stanford University, 2023.
Includes bibliographical references
Due to the non-convex nature of training Deep Neural Network (DNN) models, their effectiveness relies on non-convex optimization heuristics. Traditional methods for training DNNs often require costly empirical tuning to produce successful models and lack a clear theoretical foundation. In this thesis, we examine the use of convex optimization theory to improve the training of neural networks and to provide a better interpretation of their optimal weights. We focus on two-layer neural networks with piecewise linear activations and show that they can be formulated as finite-dimensional convex programs with a sparsity-promoting regularization term that is a variant of the group Lasso. We first utilize semi-infinite programming theory to prove strong duality for finite-width neural networks and then describe these architectures equivalently as high-dimensional convex models. Remarkably, the worst-case complexity of solving the convex program is polynomial in the number of samples and the number of neurons when the rank of the data matrix is bounded, which is the case in convolutional networks. To extend our method to training data of arbitrary rank, we develop a novel polynomial-time approximation scheme based on zonotope subsampling that comes with a guaranteed approximation ratio. Our convex models can be trained using standard convex solvers without resorting to heuristics or extensive hyperparameter tuning, unlike non-convex methods. Due to convexity, optimizer hyperparameters such as initialization, batch sizes, and step-size schedules have no effect on the final model. Through extensive numerical experiments, we show that convex models can outperform traditional non-convex methods and are not sensitive to optimizer hyperparameters.

In the remaining parts of the thesis, we first extend the analysis to show that certain standard two- and three-layer Convolutional Neural Networks (CNNs) can be globally optimized in fully polynomial time. Unlike the fully connected networks studied in the first part, we prove that these equivalent characterizations of CNNs have fully polynomial complexity in all input dimensions without resorting to any approximation techniques, and therefore enjoy significant computational complexity improvements. We then discuss extensions of our convex analysis to various neural network architectures, including vector-output networks, batch normalization, Generative Adversarial Networks (GANs), deeper architectures, and threshold networks.
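[Editor's note] As a concrete illustration of the reformulation summarized in the abstract, the convex program below follows the formulation published in the author's related work on two-layer ReLU networks (Pilanci and Ergen, ICML 2020); the exact notation and scaling used in the thesis may differ. The non-convex, weight-decay-regularized training problem

\min_{\{(u_j,\alpha_j)\}_{j=1}^{m}} \; \frac{1}{2}\Big\| \sum_{j=1}^{m} (X u_j)_+ \,\alpha_j - y \Big\|_2^2 + \frac{\beta}{2} \sum_{j=1}^{m} \big( \|u_j\|_2^2 + \alpha_j^2 \big)

is equivalent, once the width m exceeds a finite critical value, to the finite-dimensional convex program

\min_{\{(v_i,w_i)\}_{i=1}^{P}} \; \frac{1}{2}\Big\| \sum_{i=1}^{P} D_i X (v_i - w_i) - y \Big\|_2^2 + \beta \sum_{i=1}^{P} \big( \|v_i\|_2 + \|w_i\|_2 \big)
\quad \text{s.t.} \quad (2D_i - I_n) X v_i \ge 0, \; (2D_i - I_n) X w_i \ge 0, \; i = 1,\dots,P,

where the diagonal matrices D_i = \mathrm{diag}(\mathbf{1}[X u \ge 0]) enumerate the distinct ReLU activation patterns of the data X, P is the number of such patterns, and the group-\ell_1 penalty \sum_i (\|v_i\|_2 + \|w_i\|_2) is the group-Lasso-style regularizer mentioned above. When the data matrix has fixed rank r, P grows only polynomially in the number of samples n, which is the source of the polynomial-complexity claim.

A minimal numerical sketch of how such a program can be handed to a standard convex solver is given below, using cvxpy and randomly subsampled activation patterns; the variable names, problem sizes, and subsampling strategy are illustrative assumptions, not code from the thesis.

# Sketch: group-Lasso-type convex program for a two-layer ReLU network,
# with randomly subsampled activation patterns (illustrative, not the thesis's code).
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
n, d, P = 50, 10, 30          # samples, features, sampled activation patterns
beta = 1e-3                   # group-lasso regularization strength

X = rng.standard_normal((n, d))
y = rng.standard_normal(n)

# Sample diagonal arrangement matrices D_i = diag(1[X u_i >= 0]).
U = rng.standard_normal((d, P))
D = (X @ U >= 0).astype(float)    # shape (n, P); column i is the diagonal of D_i

V = cp.Variable((d, P))           # weights entering with positive output coefficient
W = cp.Variable((d, P))           # weights entering with negative output coefficient

fit = 0
constraints = []
for i in range(P):
    di = D[:, i]
    fit = fit + cp.multiply(di, X @ (V[:, i] - W[:, i]))
    # (2 D_i - I) X v >= 0 keeps each block consistent with its sampled ReLU pattern
    constraints += [cp.multiply(2 * di - 1, X @ V[:, i]) >= 0,
                    cp.multiply(2 * di - 1, X @ W[:, i]) >= 0]

objective = 0.5 * cp.sum_squares(fit - y) + beta * (
    cp.sum(cp.norm(V, 2, axis=0)) + cp.sum(cp.norm(W, 2, axis=0)))

prob = cp.Problem(cp.Minimize(objective), constraints)
prob.solve()    # any standard conic solver; no learning rate, batch size, or initialization to tune
print("optimal objective value:", prob.value)

Because the problem is convex, the solver returns a global optimum, which is what removes the dependence on optimizer hyperparameters described in the abstract.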
Electronic reproduction. Ann Arbor, Mich. : ProQuest, 2023.
Mode of access: World Wide Web.
Subjects--Topical Terms: Neural networks.
Index Terms--Genre/Form: Electronic books.
LDR  03721nmm a2200325K 4500
001  2363228
005  20231116093831.5
006  m o d
007  cr mn ---uuuuu
008  241011s2023 xx obm 000 0 eng d
020    $a 9798380264563
035    $a (MiAaPQ)AAI30561729
035    $a (MiAaPQ)STANFORDsv935mh9248
035    $a AAI30561729
040    $a MiAaPQ $b eng $c MiAaPQ $d NTU
100 1  $a Ergen, Tolga. $3 3703982
245 10 $a Convex Optimization for Neural Networks.
264  0 $c 2023
300    $a 1 online resource (266 pages)
336    $a text $b txt $2 rdacontent
337    $a computer $b c $2 rdamedia
338    $a online resource $b cr $2 rdacarrier
500    $a Source: Dissertations Abstracts International, Volume: 85-03, Section: B.
500    $a Advisor: Pilanci, Mert; Weissman, Tsachy; Boyd, Stephen.
502    $a Thesis (Ph.D.)--Stanford University, 2023.
504    $a Includes bibliographical references
533    $a Electronic reproduction. $b Ann Arbor, Mich. : $c ProQuest, $d 2023
538    $a Mode of access: World Wide Web
650  4 $a Neural networks. $3 677449
655  7 $a Electronic books. $2 lcsh $3 542853
690    $a 0800
710 2  $a ProQuest Information and Learning Co. $3 783688
710 2  $a Stanford University. $3 754827
773 0  $t Dissertations Abstracts International $g 85-03B.
856 40 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=30561729 $z click for full text (PQDT)
Items:
Inventory Number: W9485584
Location Name: Electronic resources (電子資源)
Item Class: 11. Online reading (11.線上閱覽_V)
Material Type: E-book (電子書)
Call Number: EB
Usage Class: Normal (一般使用)
Loan Status: On shelf
No. of Reservations: 0