On characteristics of Markov decision processes and reinforcement learning in large domains.
Record Type:
Bibliographic - Electronic resource : Monograph/item
Title/Author:
On characteristics of Markov decision processes and reinforcement learning in large domains.
Author:
Ratitch, Bohdana.
Publisher:
Ann Arbor : ProQuest Dissertations & Theses, 2005.
Pagination:
284 p.
Notes:
Source: Dissertations Abstracts International, Volume: 68-02, Section: B.
Contained By:
Dissertations Abstracts International, 68-02B.
Subject:
Computer science.
Electronic Resource:
http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=NR12934
ISBN:
9780494129340
Dissertation Note:
Thesis (Ph.D.)--McGill University (Canada), 2005.
Restrictions:
This item must not be sold to any third party vendors.
Abstract:
Reinforcement learning is a general computational framework for learning sequential decision strategies from the interaction of an agent with a dynamic environment. In this thesis, we focus on value-based learning methods, which rely on computing utility values for different behavior strategies. Value-based reinforcement learning methods have a solid theoretical foundation and a growing history of successful applications to real-world problems. However, most existing theoretically sound algorithms work for small problems only. For complex real-world decision tasks, approximate methods have to be used; in this case there is a significant gap between the existing theoretical results and the methodologies applied in practice. This thesis is devoted to the analysis of various factors that contribute to the difficulty of learning with popular reinforcement learning algorithms, as well as to developing new methods that facilitate the practical application of reinforcement learning techniques.

In the first part of this thesis, we investigate properties of reinforcement learning tasks that influence the performance of value-based algorithms. We present five domain-independent quantitative attributes that can be used to measure various task characteristics. We study the effect of these characteristics on learning and how they can be used for improving the efficiency of existing algorithms. In particular, we develop one application that uses measurements of the proposed attributes for improving exploration (the process by which the agent gathers experience for learning good behavior strategies).

In large realistic domains, function approximation methods have to be incorporated into reinforcement learning algorithms. The second part of this thesis is devoted to the use of a function approximation model based on Sparse Distributed Memories (SDMs) in approximate value-based methods. As with all other function approximators, the success of using SDMs in reinforcement learning depends, to a large extent, on a good choice of the structure of the approximator. We propose a new technique for automatically selecting certain structural parameters of the SDM model online, based on training data. Our algorithm takes into account the interaction of function approximation with reinforcement learning algorithms and avoids some of the difficulties faced by other methods from the existing literature. In our experiments, this method provides very good performance and is computationally efficient.
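The abstract's opening describes value-based reinforcement learning: the agent learns utility values for state-action pairs, derives its behavior from them, and relies on exploration to gather experience. As a minimal illustrative sketch only, not the algorithms developed in the thesis, the Python snippet below shows standard tabular Q-learning with epsilon-greedy exploration on a hypothetical toy chain environment; the thesis's second part would replace the explicit table with a Sparse Distributed Memory approximator, which is not sketched here. All names, the environment, and the parameter values are assumptions.

```python
# Illustrative only: standard tabular Q-learning with epsilon-greedy
# exploration. This is NOT the thesis's algorithm, just a minimal example
# of the "value-based" methods the abstract describes. The toy chain
# environment and all parameter values are hypothetical.
import random
from collections import defaultdict

N_STATES, ACTIONS = 10, (-1, +1)        # small chain world: move left/right
ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1  # step size, discount, exploration rate

Q = defaultdict(float)                  # Q[(state, action)] -> utility value

def step(state, action):
    """Toy dynamics: reward 1.0 only for reaching the right end of the chain."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    return nxt, (1.0 if nxt == N_STATES - 1 else 0.0)

def choose_action(state):
    """Epsilon-greedy exploration: how the agent gathers experience."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

state = 0
for _ in range(10_000):
    action = choose_action(state)
    nxt, reward = step(state, action)
    # Value-based update: move Q(s,a) toward the one-step bootstrapped target.
    target = reward + GAMMA * max(Q[(nxt, a)] for a in ACTIONS)
    Q[(state, action)] += ALPHA * (target - Q[(state, action)])
    state = 0 if nxt == N_STATES - 1 else nxt   # restart episode at the goal
```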
MARC Record:
LDR    03584nmm a2200301 4500
001    2206923
005    20190906083345.5
008    201008s2005 ||||||||||||||||| ||eng d
020    $a 9780494129340
035    $a (MiAaPQ)AAINR12934
035    $a AAINR12934
040    $a MiAaPQ $c MiAaPQ
100 1  $a Ratitch, Bohdana. $3 3319190
245 10 $a On characteristics of Markov decision processes and reinforcement learning in large domains.
260 1  $a Ann Arbor : $b ProQuest Dissertations & Theses, $c 2005
300    $a 284 p.
500    $a Source: Dissertations Abstracts International, Volume: 68-02, Section: B.
500    $a Publisher info.: Dissertation/Thesis.
502    $a Thesis (Ph.D.)--McGill University (Canada), 2005.
506    $a This item must not be sold to any third party vendors.
506    $a This item must not be added to any third party search indexes.
520    $a Reinforcement learning is a general computational framework for learning sequential decision strategies from the interaction of an agent with a dynamic environment. In this thesis, we focus on value-based learning methods, which rely on computing utility values for different behavior strategies. Value-based reinforcement learning methods have a solid theoretical foundation and a growing history of successful applications to real-world problems. However, most existing theoretically-sound algorithms work for small problems only. For complex real-world decision tasks, approximate methods have to be used; in this case there is a significant gap between the existing theoretical results and the methodologies applied in practice. This thesis is devoted to the analysis of various factors that contribute to the difficulty of learning with popular reinforcement learning algorithms, as well as to developing new methods that facilitate the practical application of reinforcement learning techniques. In the first part of this thesis, we investigate properties of reinforcement learning tasks that influence the performance of value-based algorithms. We present five domain-independent quantitative attributes that can be used to measure various task characteristics. We study the effect of these characteristics on learning and how they can be used for improving the efficiency of existing algorithms. In particular, we develop one application that uses measurements of the proposed attributes for improving exploration (the process by which the agent gathers experience for learning good behavior strategies). In large realistic domains, function approximation methods have to be incorporated into reinforcement learning algorithms. The second part of this thesis is devoted to the use of a function approximation model based on Sparse Distributed Memories (SDMs) in approximate value-based methods. Like for all other function approximators, the success of using SDMs in reinforcement learning depends, to a large extent, on a good choice of the structure of the approximator. We propose a new technique for automatically selecting certain structural parameters of the SDM model on-line based on training data. Our algorithm takes into account the interaction of function approximation with reinforcement learning algorithms and avoids some of the difficulties faced by other methods from the existing literature. In our experiments, this method provides very good performance and is computationally efficient.
590    $a School code: 0781.
650  4 $a Computer science. $3 523869
690    $a 0984
710 2  $a McGill University (Canada). $3 1018122
773 0  $t Dissertations Abstracts International $g 68-02B.
790    $a 0781
791    $a Ph.D.
792    $a 2005
793    $a English
856 40 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=NR12934
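For readers unfamiliar with the MARC view, each data field above consists of a three-character tag, up to two indicator characters, and a series of $-prefixed subfields (e.g. $a for the main value, $u for a URL). The short Python sketch below, a hypothetical illustration rather than anything used by this catalog system, parses a few of the display lines above into that tag/indicators/subfields structure; the parse_marc_line helper is an assumption, not a real library API.

```python
# Hypothetical sketch: decompose MARC display lines like the ones above
# into (tag, indicators, subfields). Illustrative only.
from typing import List, NamedTuple, Tuple

class MarcField(NamedTuple):
    tag: str                          # three-character field tag, e.g. "245"
    indicators: str                   # two indicator characters, blank-padded
    subfields: List[Tuple[str, str]]  # (code, value) pairs, e.g. ("a", "...")

def parse_marc_line(line: str) -> MarcField:
    """Split one display line such as '245 10 $a Title ...' into its parts."""
    tag, rest = line[:3], line[3:]
    if "$" not in rest:
        # Control fields (LDR, 001-008) carry one value and no subfields.
        return MarcField(tag, "  ", [("", rest.strip())])
    head, _, body = rest.partition("$")
    subfields = [(chunk[0], chunk[1:].strip())
                 for chunk in ("$" + body).split("$")[1:]]
    return MarcField(tag, head.strip().ljust(2), subfields)

for line in [
    "100 1  $a Ratitch, Bohdana. $3 3319190",
    "245 10 $a On characteristics of Markov decision processes and reinforcement learning in large domains.",
    "856 40 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=NR12934",
]:
    field = parse_marc_line(line)
    print(field.tag, repr(field.indicators), field.subfields)
```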
Holdings (1 item):
Barcode: W9383472
Location: Electronic resources
Circulation Category: 11. Online reading_V
Material Type: E-book
Call Number: EB
Use Type: Normal
Loan Status: On shelf
Holds: 0