Facets of Neural Network Complexity = Facetten der Komplexitat fur neuronale Netze.

Record type: Bibliographic record, electronic resource : Monograph/item
Title/Author: Facets of Neural Network Complexity = Facetten der Komplexitat fur neuronale Netze. / Hertrich, Christoph.
Other title: Facetten der Komplexitat fur neuronale Netze.
Author: Hertrich, Christoph.
Description: 1 online resource (127 pages)
Notes: Source: Dissertations Abstracts International, Volume: 84-01, Section: B.
Contained by: Dissertations Abstracts International, 84-01B.
Subject: Neurons.
Electronic resource: http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=29176660 (click for full text, PQDT)
ISBN: 9798835549252

Facets of Neural Network Complexity = Facetten der Komplexitat fur neuronale Netze.
Hertrich, Christoph.
Facets of Neural Network Complexity = Facetten der Komplexitat fur neuronale Netze. - 1 online resource (127 pages)
Source: Dissertations Abstracts International, Volume: 84-01, Section: B.
Thesis (Ph.D.)--Technische Universitaet Berlin (Germany), 2022.
Includes bibliographical references
Artificial neural networks are at the heart of some of the greatest advances in modern technology. They enable huge breakthroughs in applications ranging from computer vision via machine translation to speech recognition as well as autonomous driving and many more. However, we are still far away from a more rigorous theoretical explanation of these overwhelming success stories. Consequently, the development of a better mathematical understanding of neural networks is currently one of the hottest research topics in computer science. In this thesis we provide several contributions in that direction for the simple, but practically powerful and widely used model of feedforward neural networks with rectified linear unit (ReLU) activations.

Our focus is on various notions of what we call the complexity of such neural networks: how much computing resources (time, hardware, network size, etc.) are required to achieve a certain goal? Of course, such questions can be asked in various contexts. We identify and study the following three facets of complexity for neural networks with ReLU activations.

The first facet is neural networks' expressivity: What functions can be represented by certain neural network architectures? Even though this is such a fundamental question, very little is known so far. We make progress concerning the question whether the class of exactly representable functions strictly increases by adding more layers (with no restrictions on size). We also provide upper bounds on the number of neurons required to represent arbitrary piecewise linear functions with small-depth ReLU neural networks.

The second facet is neural networks' computational power. Here, we view neural networks as a model of computation, just like Boolean, or even closer, arithmetic circuits. We then investigate which network (or circuit) size is required to solve various problems, with a focus on combinatorial optimization problems. Even though this model is quite restrictive compared to other models of computation, we are able to show that comparably small neural networks can provably solve problems like the efficiently solvable Maximum Flow Problem or the NP-hard Knapsack Problem.

The third facet is neural networks' training complexity: How difficult is it to fit the weights of a neural network to training data? It is widely known that optimal solutions to the training problem are hard to obtain, which is why local optimization techniques like stochastic gradient descent are used in practice. We focus on the question whether the situation improves for fixed input dimension, leading to the paradigm of parameterized complexity analysis. We provide running time lower bounds in terms of W[1]-hardness results, proving that known brute-force strategies are essentially optimal. On the positive side, we extend a known polynomial-time algorithm for constant dimension and convex loss functions to a more general class of loss functions.

The mathematical methods used in this thesis include polyhedral theory, discrete and tropical geometry, mixed-integer and combinatorial optimization, as well as tools from complexity theory.
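As a concrete illustration of the expressivity facet, the maximum of two numbers is a piecewise linear function that a single hidden layer of ReLU neurons already represents exactly, via the identity max(x, y) = ReLU(x - y) + ReLU(y) - ReLU(-y). The short Python sketch below is not taken from the thesis; the weight matrices, the helper name net_max, and the NumPy-based setup are illustrative assumptions used only to check this identity numerically.

import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

# One hidden layer with three ReLU neurons representing max(x, y) exactly:
#   max(x, y) = ReLU(x - y) + ReLU(y) - ReLU(-y)
W1 = np.array([[1.0, -1.0],   # pre-activation x - y
               [0.0,  1.0],   # pre-activation y
               [0.0, -1.0]])  # pre-activation -y
w2 = np.array([1.0, 1.0, -1.0])  # linear output layer, no activation

def net_max(x, y):
    h = relu(W1 @ np.array([x, y]))  # hidden ReLU layer
    return float(w2 @ h)             # affine output

# Check the exact representation on a few random inputs.
rng = np.random.default_rng(0)
for x, y in rng.normal(size=(5, 2)):
    assert np.isclose(net_max(x, y), max(x, y))
print("one hidden ReLU layer reproduces max(x, y) exactly")

Whether exact representations of richer piecewise linear functions, such as the maximum of many numbers, genuinely require additional layers is the kind of expressivity question the abstract refers to.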
Electronic reproduction. Ann Arbor, Mich. : ProQuest, 2023.
Mode of access: World Wide Web
ISBN: 9798835549252
Subjects--Topical Terms: Neurons.
Index Terms--Genre/Form: Electronic books.

Facets of Neural Network Complexity = Facetten der Komplexitat fur neuronale Netze.
LDR  07765nmm a2200385K 4500
001  2354192
005  20230324111223.5
006  m o d
007  cr mn ---uuuuu
008  241011s2022 xx obm 000 0 eng d
020  $a 9798835549252
035  $a (MiAaPQ)AAI29176660
035  $a (MiAaPQ)TechBerlin_1130316494
035  $a AAI29176660
040  $a MiAaPQ $b eng $c MiAaPQ $d NTU
100 1 $a Hertrich, Christoph. $3 3694539
245 1 0 $a Facets of Neural Network Complexity = $b Facetten der Komplexitat fur neuronale Netze.
246 3 1 $a Facetten der Komplexitat fur neuronale Netze.
264 0 $c 2022
300  $a 1 online resource (127 pages)
336  $a text $b txt $2 rdacontent
337  $a computer $b c $2 rdamedia
338  $a online resource $b cr $2 rdacarrier
500  $a Source: Dissertations Abstracts International, Volume: 84-01, Section: B.
500  $a Advisor: Sullivan, John.
502  $a Thesis (Ph.D.)--Technische Universitaet Berlin (Germany), 2022.
504  $a Includes bibliographical references
520  $a Artificial neural networks are at the heart of some of the greatest advances in modern technology. They enable huge breakthroughs in applications ranging from computer vision via machine translation to speech recognition as well as autonomous driving and many more. However, we are still far away from a more rigorous theoretical explanation of these overwhelming success stories. Consequently, the development of a better mathematical understanding of neural networks is currently one of the hottest research topics in computer science. In this thesis we provide several contributions in that direction for the simple, but practically powerful and widely used model of feedforward neural networks with rectified linear unit (ReLU) activations. Our focus is on various notions of what we call the complexity of such neural networks: how much computing resources (time, hardware, network size, etc.) are required to achieve a certain goal? Of course, such questions can be asked in various contexts. We identify and study the following three facets of complexity for neural networks with ReLU activations. The first facet is neural networks' expressivity: What functions can be represented by certain neural network architectures? Even though this is such a fundamental question, very little is known so far. We make progress concerning the question whether the class of exactly representable functions strictly increases by adding more layers (with no restrictions on size). We also provide upper bounds on the number of neurons required to represent arbitrary piecewise linear functions with small-depth ReLU neural networks. The second facet is neural networks' computational power. Here, we view neural networks as a model of computation, just like Boolean, or even closer, arithmetic circuits. We then investigate which network (or circuit) size is required to solve various problems, with a focus on combinatorial optimization problems. Even though this model is quite restrictive compared to other models of computation, we are able to show that comparably small neural networks can provably solve problems like the efficiently solvable Maximum Flow Problem or the NP-hard Knapsack Problem. The third facet is neural networks' training complexity: How difficult is it to fit the weights of a neural network to training data? It is widely known that optimal solutions to the training problem are hard to obtain, which is why local optimization techniques like stochastic gradient descent are used in practice. We focus on the question whether the situation improves for fixed input dimension, leading to the paradigm of parameterized complexity analysis. We provide running time lower bounds in terms of W[1]-hardness results, proving that known brute-force strategies are essentially optimal. On the positive side, we extend a known polynomial-time algorithm for constant dimension and convex loss functions to a more general class of loss functions. The mathematical methods used in this thesis include polyhedral theory, discrete and tropical geometry, mixed-integer and combinatorial optimization, as well as tools from complexity theory.
520  $a Künstliche neuronale Netze sind der Schlüssel zu einigen der größten modernen Technologiefortschritten. Sie ermöglichen Durchbrüche in Anwendungen wie Computer Vision, maschinellem Übersetzen, Spracherkennung, bis hin zu autonomem Fahren und vielem mehr. Allerdings sind wir nach wie vor weit entfernt von rigorosen theoretischen Erklärungen dieser überwältigenden Erfolge. Daher ist die Entwicklung eines besseren mathematischen Verständnisses neuronaler Netze aktuell eines der wichtigsten Forschungsthemen der Informatik. In dieser Arbeit präsentieren wir einige Fortschritte in diese Richtung für das einfache, aber aus praktischer Sicht mächtige und viel verwendete Modell der Feedforward-Netze mit ReLU-Aktivierungsfunktionen. Unser Fokus liegt auf verschiedenen Arten der Komplexität solcher neuronalen Netze: Wie viele Ressourcen (Zeit, Hardware, Netzwerkgröße etc.) werden benötigt, um ein bestimmtes Ziel zu erreichen? Selbstverständlich können solche Fragen in verschiedenen Kontexten gestellt werden. Wir identifizieren und untersuchen die folgenden drei Facetten der Komplexität für neuronale Netze mit ReLU-Aktivierungsfunktionen. Die erste Facette ist Expressivität: Welche Funktionen können von bestimmten Netzwerkarchitekturen dargestellt werden? Obwohl es sich dabei um eine so fundamentale Frage handelt, ist bisher wenig bekannt. Wir präsentieren Fortschritte bezüglich der Frage, ob die Menge der exakt darstellbaren Funktionen echt größer wird, wenn wir mehr Schichten hinzufügen (ohne dabei die Größe zu beschränken). Wir beweisen außerdem obere Schranken an die Anzahl benötigter Neuronen, um beliebige stückweise lineare Funktionen mit neuronalen Netzen mit geringer Tiefe zu berechnen. Die zweite Facette ist Berechnungskomplexität (Computational Power). Dafür betrachten wir neuronale Netze als ein Berechenbarkeitsmodell, ähnlich wie etwa Boolesche oder arithmetische Schaltkreise. Wir untersuchen, welche Netzwerkgröße nötig ist, um verschiedene Probleme, insbesondere aus der kombinatorischen Optimierung, zu lösen. Trotz der starken Eingeschränktheit dieses Modells zeigen wir, dass vergleichsweise kleine Netze beweisbar Lösungen für Probleme wie das Maximalflussproblem oder das NP-schwere Rucksackproblem finden können. Die dritte Facette ist Trainingskomplexität: Wie schwer ist es, die Gewichte eines neuronalen Netzes an Trainingsdaten anzupassen? Bekanntermaßen ist das Finden exakter Lösungen ein schweres Problem, weshalb in der Praxis lokale Optimierungsverfahren wie etwa stochastischer Gradientenabstieg verwendet werden. Wir beschäftigen uns damit, ob sich die Lage für fixe Dimension verbessert, was uns zum Paradigma der parametrisierten Komplexität führt. Wir beweisen untere Laufzeitschranken in Form von W[1]-Härteresultaten, woraus folgt, dass bekannte Brute-Force-Strategien im Wesentlichen optimal sind. Auf der positiven Seite erweitern wir einen bekannten Polynomialzeitalgorithmus für konstante Dimension und konvexe Lossfunktionen auf eine allgemeinere Klasse von Lossfunktionen. Die in dieser Arbeit verwendeten mathematischen Methoden umfassen Polyedertheorie, diskrete und tropische Geometrie, gemischt-ganzzahlige und kombinatorische Optimierung sowie Tools aus der Komplexitätstheorie.
533  $a Electronic reproduction. $b Ann Arbor, Mich. : $c ProQuest, $d 2023
538  $a Mode of access: World Wide Web
650 4 $a Neurons. $3 588699
650 4 $a Integer programming. $3 648270
650 4 $a Success. $3 518195
650 4 $a Knapsack problem. $3 3685875
650 4 $a Power. $3 518736
650 4 $a Optimization. $3 891104
650 4 $a Neural networks. $3 677449
650 4 $a Combinatorics. $3 895617
650 4 $a Algorithms. $3 536374
650 4 $a Geometry. $3 517251
650 4 $a Artificial intelligence. $3 516317
650 4 $a Computer science. $3 523869
650 4 $a Mathematics. $3 515831
650 4 $a Operations research. $3 547123
655 7 $a Electronic books. $2 lcsh $3 542853
690  $a 0800
690  $a 0984
690  $a 0405
690  $a 0796
710 2 $a ProQuest Information and Learning Co. $3 783688
710 2 $a Technische Universitaet Berlin (Germany). $3 3480527
773 0 $t Dissertations Abstracts International $g 84-01B.
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=29176660 $z click for full text (PQDT)

Holdings (1 item)

Barcode: W9476548
Location: Electronic Resources (電子資源)
Circulation category: 11.線上閱覽_V (online access)
Material type: E-book
Call number: EB
Use type: Normal (一般使用)
Loan status: On shelf
Holds: 0