Optimization-based approximate dynamic programming.

Record type: Bibliographic - language material, print : Monograph/item
Title/Author: Optimization-based approximate dynamic programming / Petrik, Marek.
Author: Petrik, Marek.
Description: 361 p.
Note: Source: Dissertation Abstracts International, Volume: 71-12, Section: B, page: 7513.
Contained by: Dissertation Abstracts International, 71-12B.
Subject: Applied Mathematics.
Electronic resource: http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3427564
ISBN: 9781124320250
MARC record:

LDR  03597nam 2200361 4500
001  1401617
005  20111017084407.5
008  130515s2010 ||||||||||||||||| ||eng d
020     $a 9781124320250
035     $a (UMI)AAI3427564
035     $a AAI3427564
040     $a UMI $c UMI
100  1  $a Petrik, Marek. $3 1680763
245  10 $a Optimization-based approximate dynamic programming.
300     $a 361 p.
500     $a Source: Dissertation Abstracts International, Volume: 71-12, Section: B, page: 7513.
500     $a Adviser: Shlomo Zilberstein.
502     $a Thesis (Ph.D.)--University of Massachusetts Amherst, 2010.
520     $a Reinforcement learning algorithms hold promise in many complex domains, such as resource management and planning under uncertainty. Most reinforcement learning algorithms are iterative---they successively approximate the solution based on a set of samples and features. Although these iterative algorithms can achieve impressive results in some domains, they are not sufficiently reliable for wide applicability; they often require extensive parameter tweaking to work well and provide only weak guarantees of solution quality.
520     $a Some of the most interesting reinforcement learning algorithms are based on approximate dynamic programming (ADP). ADP, also known as value function approximation, approximates the value of being in each state. This thesis presents new reliable algorithms for ADP that use optimization instead of iterative improvement. Because these optimization-based algorithms explicitly seek solutions with favorable properties, they are easy to analyze, offer much stronger guarantees than iterative algorithms, and have few or no parameters to tweak. In particular, we improve on approximate linear programming---an existing method---and derive approximate bilinear programming---a new robust approximate method.
520     $a The strong guarantees of optimization-based algorithms not only increase confidence in the solution quality, but also make it easier to combine the algorithms with other ADP components. The other components of ADP are samples and features used to approximate the value function. Relying on the simplified analysis of optimization-based methods, we derive new bounds on the error due to missing samples. These bounds are simpler, tighter, and more practical than the existing bounds for iterative algorithms and can be used to evaluate solution quality in practical settings. Finally, we propose homotopy methods that use the sampling bounds to automatically select good approximation features for optimization-based algorithms. Automatic feature selection significantly increases the flexibility and applicability of the proposed ADP methods.
520     $a The methods presented in this thesis can potentially be used in many practical applications in artificial intelligence, operations research, and engineering. Our experimental results show that optimization-based methods may perform well on resource-management problems and standard benchmark problems and therefore represent an attractive alternative to traditional iterative methods.
590     $a School code: 0118.
650   4 $a Applied Mathematics. $3 1669109
650   4 $a Artificial Intelligence. $3 769149
690     $a 0364
690     $a 0800
710  2  $a University of Massachusetts Amherst. $b Computer Science. $3 1023848
773  0  $t Dissertation Abstracts International $g 71-12B.
790  10 $a Zilberstein, Shlomo, $e advisor
790  10 $a Barto, Andrew G. $e committee member
790  10 $a Mahadevan, Sridhar $e committee member
790  10 $a Muriel, Ana $e committee member
790  10 $a Parr, Ronald $e committee member
790     $a 0118
791     $a Ph.D.
792     $a 2010
856  40 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3427564
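The abstract mentions approximate linear programming (ALP), which computes a value-function approximation Phi w by solving a linear program rather than iterating. A minimal sketch of the standard ALP formulation is below; the 2-state, 2-action MDP, the identity feature matrix, and the uniform state-relevance weights are illustrative assumptions, not taken from the thesis.

```python
import numpy as np
from scipy.optimize import linprog

# Toy MDP (hypothetical): 2 states, 2 actions, discount gamma.
gamma = 0.9
# P[s, a] = next-state distribution after taking action a in state s.
P = np.array([[[1.0, 0.0],    # s=0, a=0: stay in state 0
               [0.0, 1.0]],   # s=0, a=1: move to state 1
              [[0.0, 1.0],    # s=1, a=0: stay in state 1
               [1.0, 0.0]]])  # s=1, a=1: move to state 0
r = np.array([[0.0, 1.0],     # r[s, a] = immediate reward
              [2.0, 0.0]])

# Identity features make the approximation exact, so the ALP solution
# coincides with the true optimal value function.
Phi = np.eye(2)
rho = np.array([0.5, 0.5])    # state-relevance weights

# ALP: minimize rho^T Phi w
#      s.t. (Phi w)(s) >= r(s, a) + gamma * P[s, a] @ (Phi w)  for all s, a
c = Phi.T @ rho
A_ub, b_ub = [], []
for s in range(2):
    for a in range(2):
        # Rewrite each Bellman constraint in linprog's "A_ub w <= b_ub" form.
        A_ub.append(-(Phi[s] - gamma * P[s, a] @ Phi))
        b_ub.append(-r[s, a])

res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub), bounds=(None, None))
V = Phi @ res.x
print(V)  # optimal values: V(0) = 19, V(1) = 20
```

Because the Bellman constraints force Phi w to upper-bound the optimal value function, minimizing the rho-weighted objective pushes the approximation down onto it; with fewer features than states, the same LP yields the best such upper bound expressible in the feature span.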
Holdings (1 item):
Barcode: W9164756
Location: Electronic resources
Circulation category: 11. Online reading
Material type: E-book
Call number: EB
Use type: Normal
Loan status: On shelf
Holds: 0