Selected machine learning reductions.
Record type:
Bibliographic - language material, print : Monograph/item
Title / Author:
Selected machine learning reductions.
Author:
Choromanska, Anna.
Description:
192 p.
Notes:
Source: Dissertation Abstracts International, Volume: 75-08(E), Section: B.
Contained by:
Dissertation Abstracts International, 75-08B(E).
Subject:
Computer Science.
Electronic resource:
http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3615068
ISBN:
9781303808517
Thesis (Ph.D.)--Columbia University, 2014.
Machine learning is a field of science that aims to extract knowledge from data. Optimization lies at the core of machine learning, since many learning problems are formulated as optimization problems in which the goal is to minimize or maximize an objective function. More complex machine learning problems are then often solved by reducing them to simpler sub-problems solvable by known optimization techniques. This dissertation addresses two elements of the machine learning 'pipeline': designing efficient basic optimization tools tailored to specific learning problems, that is, to optimizing a specific objective function, and building more elaborate learning tools whose sub-blocks are essentially optimization solvers equipped with such basic tools. In the first part of this thesis we focus on a specific learning problem where the objective function, either convex or non-convex, involves minimizing the partition function, the normalizer of a distribution, as is the case in conditional random fields (CRFs) and log-linear models. Our work proposes a tight quadratic bound on the partition function whose parameters are easily recovered by a simple algorithm that we propose. The bound gives rise to a family of new optimization learning algorithms based on bound majorization (we develop batch, both full-rank and low-rank, and semi-stochastic variants) with a linear convergence rate that compete successfully with state-of-the-art techniques, among them gradient descent, Newton, and quasi-Newton methods such as L-BFGS. The only constraint we introduce is on the number of classes, which is assumed to be finite and enumerable. The bound majorization method we develop is also the first reduction scheme discussed in this thesis, where throughout by 'reduction' we mean a learning approach or algorithmic technique that converts a complex machine learning problem into a set of simpler problems (possibly as small as a single problem).
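To make the bound-majorization idea above concrete, here is a minimal Python sketch of one surrogate-minimization step for a single log-linear example over a finite, enumerable class set. It is an illustration only, with made-up function names (log_partition, majorization_step): the dissertation constructs a specific tight quadratic upper bound on the log-partition function whose parameters are recovered by a simple per-class algorithm, whereas this sketch simply uses the exact curvature of log Z as the quadratic term.

```python
import numpy as np

def log_partition(theta, feats):
    # log Z(theta) = log sum_y exp(theta . f(x, y)) over a finite class set;
    # feats[y] holds the feature vector f(x, y) for class y.
    scores = feats @ theta
    m = scores.max()
    return m + np.log(np.exp(scores - m).sum())

def majorization_step(theta, feats, y_true, damping=1e-2):
    # One surrogate-minimization step on the negative log-likelihood
    # -theta . f(x, y_true) + log Z(theta).  The quadratic term used here is
    # the feature covariance under the current model (the exact curvature of
    # log Z), standing in for the tight quadratic upper bound whose parameters
    # the thesis computes with a simple per-class procedure.
    scores = feats @ theta
    p = np.exp(scores - scores.max())
    p /= p.sum()                                 # model distribution over classes
    mu = p @ feats                               # expected features E_p[f(x, y)]
    grad = mu - feats[y_true]                    # gradient of the negative log-likelihood
    centered = feats - mu
    quad = centered.T @ (centered * p[:, None])  # curvature / quadratic surrogate
    quad += damping * np.eye(theta.size)         # keep the linear system well posed
    return theta - np.linalg.solve(quad, grad)   # minimize the quadratic surrogate
```

Iterating such steps drives the objective down; the thesis's actual bound additionally guarantees monotone improvement and the linear convergence rate mentioned above, which this simplified curvature-based stand-in does not.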
Secondly, we focus on developing two more sophisticated machine learning tools for solving harder learning problems. The tools we develop are built from basic optimization sub-blocks tailored to simpler optimization sub-problems. We first address the multiclass classification problem where the number of classes is very large. We reduce this problem to a set of simpler sub-problems that we solve using basic optimization methods performing an additive update on the parameter vector. We then address the problem of learning a data representation when the data are unlabeled for any classification task. We again reduce this problem to a set of simpler sub-problems solved with basic optimization methods, but this time the parameter vector is updated multiplicatively. In both problems we assume that the data arrive in a stream that may even be infinite. The dissertation then describes each of these problems and our approach to solving them in more detail.
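The two reductions sketched in the paragraph above differ mainly in how the parameter vector is updated as examples stream in. The following Python fragment is a hedged illustration of that contrast on stand-in objectives, not the dissertation's reductions: an additive, perceptron-style correction for a classifier with very many classes, and a multiplicative, exponentiated-gradient-style rescaling for simplex-constrained weights such as those of a learned representation. All names and constants are illustrative.

```python
import numpy as np

def additive_step(W, x, y, lr=0.1):
    # Streaming multiclass step with an additive update: if the top-scoring
    # class is wrong, move the true class's weights toward x and the predicted
    # class's weights away from it (a perceptron-style correction).
    y_hat = int(np.argmax(W @ x))
    if y_hat != y:
        W[y] += lr * x
        W[y_hat] -= lr * x
    return W

def multiplicative_step(w, losses, lr=0.5):
    # Streaming step with a multiplicative (exponentiated-gradient) update:
    # weights are rescaled by exp(-lr * loss) and renormalized, so they stay
    # on the probability simplex.
    w = w * np.exp(-lr * losses)
    return w / w.sum()

# Consuming a (possibly unbounded) stream one example at a time.
rng = np.random.default_rng(0)
W = np.zeros((1000, 64))            # many classes, updated additively
w = np.ones(16) / 16                # simplex weights, updated multiplicatively
for _ in range(100):
    x, y = rng.normal(size=64), int(rng.integers(1000))
    W = additive_step(W, x, y)
    w = multiplicative_step(w, rng.random(16))
```

Additive updates suit unconstrained weight matrices, while multiplicative updates preserve nonnegativity and normalization by construction, which is one common reason they appear in classification and representation-learning settings respectively.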
LDR  03863nam a2200289 4500
001  1966977
005  20141112075559.5
008  150210s2014 ||||||||||||||||| ||eng d
020  $a 9781303808517
035  $a (MiAaPQ)AAI3615068
035  $a AAI3615068
040  $a MiAaPQ $c MiAaPQ
100 1  $a Choromanska, Anna. $3 2103884
245 1 0  $a Selected machine learning reductions.
300  $a 192 p.
500  $a Source: Dissertation Abstracts International, Volume: 75-08(E), Section: B.
500  $a Adviser: Tony Jebara.
502  $a Thesis (Ph.D.)--Columbia University, 2014.
520  $a Machine learning is a field of science aiming to extract knowledge from the data. Optimization lies in the core of machine learning as many learning problems are formulated as optimization problems, where the goal is to minimize/maximize an objective function. More complex machine learning problems are then often solved by reducing them to simpler sub-problems solvable by known optimization techniques. This dissertation addresses two elements of the machine learning system 'pipeline', designing efficient basic optimization tools tailored to solve specific learning problems, or in other words optimize a specific objective function, and creating more elaborate learning tools with sub-blocks being essentially optimization solvers equipped with such basic optimization tools. In the first part of this thesis we focus on a very specific learning problem where the objective function, either convex or non-convex, involves the minimization of the partition function, the normalizer of a distribution, as is the case in conditional random fields (CRFs) or log-linear models. Our work proposes a tight quadratic bound on the partition function which parameters are easily recovered by a simple algorithm that we propose. The bound gives rise to the family of new optimization learning algorithms, based on bound majorization (we developed batch, both full-rank and low-rank, and semi-stochastic variants), with linear convergence rate that successfully compete with state-of-the-art techniques (among them gradient descent methods, Newton and quasi-Newton methods like L-BFGS, etc.). The only constraint we introduce is on the number of classes which is assumed to be finite and enumerable. The bound majorization method we develop is simultaneously the first reduction scheme discussed in this thesis, where throughout this thesis by 'reduction' we understand the learning approach or algorithmic technique converting a complex machine learning problem into a set of simpler problems (that can be as small as a single problem).
520  $a Secondly, we focus on developing two more sophisticated machine learning tools, for solving harder learning problems. The tools that we develop are built from basic optimization sub-blocks tailored to solve simpler optimization sub-problems. We first focus on the multi class classification problem where the number of classes is very large. We reduce this problem to a set of simpler sub-problems that we solve using basic optimization methods performing additive update on the parameter vector. Secondly we address the problem of learning data representation when the data is unlabeled for any classification task. We reduce this problem to a set of simpler sub-problems that we solve using basic optimization methods, however this time the parameter vector is updated multiplicatively. In both problems we assume that the data come in a stream that can even be infinite. We will now provide more specific description of each of these problems and describe our approach for solving them.
590  $a School code: 0054.
650  4  $a Computer Science. $3 626642
650  4  $a Engineering, Computer. $3 1669061
690  $a 0984
690  $a 0464
710 2  $a Columbia University. $b Electrical Engineering. $3 1675652
773 0  $t Dissertation Abstracts International $g 75-08B(E).
790  $a 0054
791  $a Ph.D.
792  $a 2014
793  $a English
856 4 0  $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3615068
Holdings (1 record):
Barcode: W9261983
Location: Electronic resources
Circulation category: 11. Online reading_V
Material type: E-book
Call number: EB
Use type: Normal
Loan status: On shelf
Holds: 0