Approximate Policy Iteration Algorithms for Continuous, Multidimensional Applications and Convergence Analysis.
Record type:
Bibliographic record - Language material, printed : Monograph/item
Title proper/Author:
Approximate Policy Iteration Algorithms for Continuous, Multidimensional Applications and Convergence Analysis.
Author:
Ma, Jun.
Physical description:
161 p.
Notes:
Source: Dissertation Abstracts International, Volume: 72-06, Section: B, page: .
Contained By:
Dissertation Abstracts International 72-06B.
Subject:
Business Administration, Management.
Electronic resource:
http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3452619
ISBN:
9781124595061
Ma, Jun. Approximate Policy Iteration Algorithms for Continuous, Multidimensional Applications and Convergence Analysis. - 161 p.
Source: Dissertation Abstracts International, Volume: 72-06, Section: B, page: .
Thesis (Ph.D.)--Princeton University, 2011.
The purpose of this dissertation is to present parametric and non-parametric policy iteration algorithms that handle Markov decision process problems with high-dimensional, continuous state and action spaces, and to conduct convergence analysis of these algorithms under a variety of technical conditions. An online, on-policy least-squares policy iteration (LSPI) algorithm is proposed, which can be applied to infinite horizon problems where states and controls are vector-valued and continuous. No special problem structure, such as linear, additive noise, is assumed, and the expectation is assumed to be uncomputable. The concept of the post-decision state variable is used to eliminate the expectation inside the optimization problem, and a formal convergence analysis of the algorithm is provided under the assumption that value functions are spanned by finitely many known basis functions. Furthermore, the convergence result extends to the more general case of unknown value function form using orthogonal polynomial approximation. Using kernel smoothing techniques, this dissertation presents three different online, on-policy approximate policy iteration algorithms that can be applied to infinite horizon problems with continuous and high-dimensional state and action spaces: kernel-based least squares approximate policy iteration, approximate policy iteration with kernel smoothing, and policy iteration with finite horizon approximation and kernel estimators. The use of Monte Carlo sampling to estimate the value function around the post-decision state reduces the problem to a sequence of deterministic, nonlinear programming problems, which allows the algorithms to handle continuous, vector-valued states and actions. Again, a formal convergence analysis of the algorithms under a variety of technical assumptions is presented. The algorithms are applied to numerical applications including linear quadratic regulation, wind energy allocation, and battery storage problems to demonstrate their effectiveness and convergence properties.
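The central device in the first algorithm described above is the post-decision state: because the post-decision state is a deterministic function of the current state and action, a value function fitted around it lets actions be chosen by solving a deterministic problem, with no expectation inside the maximization. The following Python sketch is only a minimal illustration of that idea under simplifying assumptions (a scalar post-decision state, hand-picked polynomial basis functions, and a user-supplied deterministic step(state, action) -> (reward, post_decision_state) mapping); the names basis, lspi_update, and greedy_action are hypothetical and are not taken from the dissertation.

import numpy as np

def basis(s_post):
    # Hypothetical basis functions phi(s) for a scalar post-decision state.
    return np.array([1.0, s_post, s_post ** 2])

def lspi_update(transitions, gamma=0.95):
    # LSTD-style least-squares fit of the weights theta from on-policy samples
    # (s_post, reward, next_s_post): solves the projected Bellman fixed point.
    k = basis(0.0).size
    A = np.zeros((k, k))
    b = np.zeros(k)
    for s_post, reward, next_s_post in transitions:
        phi, phi_next = basis(s_post), basis(next_s_post)
        A += np.outer(phi, phi - gamma * phi_next)
        b += reward * phi
    return np.linalg.solve(A + 1e-8 * np.eye(k), b)

def greedy_action(state, actions, step, theta, gamma=0.95):
    # Because the value function is evaluated at the post-decision state, action
    # selection is a deterministic optimization: no expectation over the noise.
    best_a, best_v = None, -np.inf
    for a in actions:
        reward, s_post = step(state, a)
        v = reward + gamma * basis(s_post) @ theta
        if v > best_v:
            best_a, best_v = a, v
    return best_a

In an online, on-policy scheme of this kind, policy evaluation (lspi_update) and policy improvement (greedy_action) would alternate as new transitions are collected while following the current policy.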
ISBN: 9781124595061
Subjects--Topical Terms: Business Administration, Management.
LDR
:03007nam 2200289 4500
001
1404919
005
20111201133000.5
008
130515s2011 ||||||||||||||||| ||eng d
020
$a
9781124595061
035
$a
(UMI)AAI3452619
035
$a
AAI3452619
040
$a
UMI
$c
UMI
100
1
$a
Ma, Jun.
$3
1265143
245
1 0
$a
Approximate Policy Iteration Algorithms for Continuous, Multidimensional Applications and Convergence Analysis.
300
$a
161 p.
500
$a
Source: Dissertation Abstracts International, Volume: 72-06, Section: B, page: .
500
$a
Adviser: Warren B. Powell.
502
$a
Thesis (Ph.D.)--Princeton University, 2011.
520
$a
The purpose of this dissertation is to present parametric and non-parametric policy iteration algorithms that handle Markov decision process problems with high-dimensional, continuous state and action spaces, and to conduct convergence analysis of these algorithms under a variety of technical conditions. An online, on-policy least-squares policy iteration (LSPI) algorithm is proposed, which can be applied to infinite horizon problems where states and controls are vector-valued and continuous. No special problem structure, such as linear, additive noise, is assumed, and the expectation is assumed to be uncomputable. The concept of the post-decision state variable is used to eliminate the expectation inside the optimization problem, and a formal convergence analysis of the algorithm is provided under the assumption that value functions are spanned by finitely many known basis functions. Furthermore, the convergence result extends to the more general case of unknown value function form using orthogonal polynomial approximation. Using kernel smoothing techniques, this dissertation presents three different online, on-policy approximate policy iteration algorithms that can be applied to infinite horizon problems with continuous and high-dimensional state and action spaces: kernel-based least squares approximate policy iteration, approximate policy iteration with kernel smoothing, and policy iteration with finite horizon approximation and kernel estimators. The use of Monte Carlo sampling to estimate the value function around the post-decision state reduces the problem to a sequence of deterministic, nonlinear programming problems, which allows the algorithms to handle continuous, vector-valued states and actions. Again, a formal convergence analysis of the algorithms under a variety of technical assumptions is presented. The algorithms are applied to numerical applications including linear quadratic regulation, wind energy allocation, and battery storage problems to demonstrate their effectiveness and convergence properties.
590
$a
School code: 0181.
650
4
$a
Business Administration, Management.
$3
626628
650
4
$a
Engineering, System Science.
$3
1018128
650
4
$a
Operations Research.
$3
626629
690
$a
0454
690
$a
0790
690
$a
0796
710
2
$a
Princeton University.
$3
645579
773
0
$t
Dissertation Abstracts International
$g
72-06B.
790
1 0
$a
Powell, Warren B.,
$e
advisor
790
$a
0181
791
$a
Ph.D.
792
$a
2011
856
4 0
$u
http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3452619
Holdings (1 item):
Barcode: W9168058
Location: Electronic Resources
Circulation category: 11.線上閱覽_V
Material type: E-book
Call number: EB
Usage type: Normal
Loan status: On shelf
Hold status: 0