National Dong Hwa University Library

Online Learning and Decision Making with Partial Information, a Feedback Perspective.

Record type:	Bibliographic record, electronic resource : Monograph/item
Title proper/Author:	Online Learning and Decision Making with Partial Information, a Feedback Perspective.
Author:	Rangi, Anshuka.
Publisher:	Ann Arbor : ProQuest Dissertations & Theses, 2021
Extent:	448 p.
Notes:	Source: Dissertations Abstracts International, Volume: 83-02, Section: B.
Contained By:	Dissertations Abstracts International 83-02B.
Subject:	Computer science.
Electronic resource:	http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=28541527
ISBN:	9798534669077

Online Learning and Decision Making with Partial Information, a Feedback Perspective.
Rangi, Anshuka.

Online Learning and Decision Making with Partial Information, a Feedback Perspective. - Ann Arbor : ProQuest Dissertations & Theses, 2021 - 448 p.

Source: Dissertations Abstracts International, Volume: 83-02, Section: B.

Thesis (Ph.D.)--University of California, San Diego, 2021.

This item must not be sold to any third party vendors.

This dissertation considers a problem of online learning and online decision making in which an agent, or a group of agents, aims to learn unknown parameters of interest. There are two key interacting components: the agent and the environment. The agent performs actions on the environment; these actions may or may not change the state of the environment, and the environment generates feedback based on the actions and its underlying state. The agent uses this feedback to learn, to improve its decisions and actions, and to optimize a certain objective.

In the first part of this dissertation, we consider different variants of online learning and decision making systems. We propose optimal (or order-optimal) online learning algorithms for these variants. We characterize the flow of information through feedback, and provide quantitative information measures that are key to optimal learning and decision making in these systems.

In the second part of this dissertation, we focus on the attacks and security of these online learning and decision making systems. Since the distributed nature of these systems is their Achilles' heel, making these systems secure requires an understanding of the regime where the systems can be attacked, as well as designing ways to mitigate these attacks. We study both of these aspects of the problem for stochastic Multi-Armed Bandits (MAB). We also study the former aspect of the problem, namely understanding the regime under which the system can be attacked, for Reinforcement Learning and Cyber-Physical systems.

Finally, we lay the foundations of non-stochastic information theory. Classical information theory plays little role in providing non-stochastic guarantees for online systems such as Cyber-Physical systems, where occasional errors can quickly drive these systems out of control and lead to catastrophic failures. We propose a non-stochastic $\delta$-mutual information to capture the worst-case error guarantees, denoted by $\delta$. We propose a non-stochastic analogue of the capacities studied in classical information theory. We also establish key results, such as a channel coding theorem and a single-letter characterization, for the non-stochastic capacities.

ISBN: 9798534669077

Subjects--Topical Terms:

523869
Computer science.
Subjects--Index Terms:

Channel coding theorem

Online Learning and Decision Making with Partial Information, a Feedback Perspective.
LDR:03553nmm a2200409 4500
001 2349552
005 20230509091111.5
006 m o d
007 cr#unu||||||||
008 241004s2021 ||||||||||||||||| ||eng d
020 $a 9798534669077
035 $a (MiAaPQ)AAI28541527
035 $a AAI28541527
040 $a MiAaPQ $c MiAaPQ
100 1 $a Rangi, Anshuka. $3 3688963
245 1 0 $a Online Learning and Decision Making with Partial Information, a Feedback Perspective.
260 1 $a Ann Arbor : $b ProQuest Dissertations & Theses, $c 2021
300 $a 448 p.
500 $a Source: Dissertations Abstracts International, Volume: 83-02, Section: B.
500 $a Advisor: Franceschetti, Massimo.
502 $a Thesis (Ph.D.)--University of California, San Diego, 2021.
506 $a This item must not be sold to any third party vendors.
520 $a This dissertation considers a problem of online learning and online decision making in which an agent, or a group of agents, aims to learn unknown parameters of interest. There are two key interacting components: the agent and the environment. The agent performs actions on the environment; these actions may or may not change the state of the environment, and the environment generates feedback based on the actions and its underlying state. The agent uses this feedback to learn, to improve its decisions and actions, and to optimize a certain objective. In the first part of this dissertation, we consider different variants of online learning and decision making systems. We propose optimal (or order-optimal) online learning algorithms for these variants. We characterize the flow of information through feedback, and provide quantitative information measures that are key to optimal learning and decision making in these systems. In the second part of this dissertation, we focus on the attacks and security of these online learning and decision making systems. Since the distributed nature of these systems is their Achilles' heel, making these systems secure requires an understanding of the regime where the systems can be attacked, as well as designing ways to mitigate these attacks. We study both of these aspects of the problem for stochastic Multi-Armed Bandits (MAB). We also study the former aspect of the problem, namely understanding the regime under which the system can be attacked, for Reinforcement Learning and Cyber-Physical systems. Finally, we lay the foundations of non-stochastic information theory. Classical information theory plays little role in providing non-stochastic guarantees for online systems such as Cyber-Physical systems, where occasional errors can quickly drive these systems out of control and lead to catastrophic failures. We propose a non-stochastic $\delta$-mutual information to capture the worst-case error guarantees, denoted by $\delta$. We propose a non-stochastic analogue of the capacities studied in classical information theory. We also establish key results, such as a channel coding theorem and a single-letter characterization, for the non-stochastic capacities.
590 $a School code: 0033.
650 4 $a Computer science. $3 523869
650 4 $a Statistics. $3 517247
650 4 $a Electrical engineering. $3 649834
650 4 $a Poisoning. $3 770903
650 4 $a Performance evaluation. $3 3562292
650 4 $a Distance learning. $3 3557921
653 $a Channel coding theorem
653 $a Distribution learning and estimation
653 $a Hypothesis testing
653 $a Multi-armed bandits
653 $a Non-stochastic information theory
653 $a Online learning
690 $a 0984
690 $a 0463
690 $a 0544
710 2 $a University of California, San Diego. $b Electrical and Computer Engineering. $3 3432690
773 0 $t Dissertations Abstracts International $g 83-02B.
790 $a 0033
791 $a Ph.D.
792 $a 2021
793 $a English
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=28541527