Adapting the Exploration Rate for Value-of-Information-Based Reinforcement Learning

Sledge, Isaac J.; Principe, Jose C.

arXiv:2212.11083 (cs)

[Submitted on 20 Dec 2022 (v1), last revised 31 Dec 2022 (this version, v2)]

Abstract: In this paper, we consider the problem of adjusting the exploration rate when using value-of-information-based exploration. We do this by converting the value-of-information optimization into a problem of finding equilibria of a flow for a changing exploration rate. We then develop an efficient path-following scheme for converging to these equilibria and hence uncovering optimal action-selection policies. Under this scheme, the exploration rate is automatically adapted according to the agent's experiences. Global convergence is theoretically assured.
We first evaluate our exploration-rate adaptation on the Nintendo GameBoy games Centipede and Millipede. We demonstrate aspects of the search process, such as the hierarchy of state abstractions that it yields. We also show that our approach returns better policies in fewer episodes than conventional search strategies relying on heuristic, annealing-based exploration-rate adjustments. We then illustrate that these trends hold for deep, value-of-information-based agents that learn to play ten simple games and over forty more complicated games for the Nintendo GameBoy system. Performance either near or well above the level of human play is observed.
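
For readers unfamiliar with the baseline the abstract compares against, the following is a minimal sketch of a heuristic, annealing-based exploration-rate schedule paired with soft-max action selection. It is not the authors' path-following scheme. The function names, the exponential decay form, and the default values (`start`, `end`, `decay`) are all assumptions chosen for illustration, and the rate is treated here simply as a soft-max temperature.

```python
import math
import random


def annealed_rate(episode, start=1.0, end=0.05, decay=0.01):
    """Hypothetical schedule: decay the exploration rate exponentially
    from `start` toward `end` as the episode count grows."""
    return end + (start - end) * math.exp(-decay * episode)


def softmax_policy(q_values, rate):
    """Soft-max action probabilities over action-value estimates.
    A larger rate gives a more uniform (more exploratory) policy."""
    best = max(q_values)  # subtract the max for numerical stability
    weights = [math.exp((q - best) / rate) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def select_action(q_values, rate, rng=random):
    """Sample an action index according to the soft-max policy."""
    probs = softmax_policy(q_values, rate)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]
```

A schedule like this fixes the decay in advance, independent of what the agent has learned, which is the kind of heuristic the abstract contrasts with adapting the rate according to the agent's experiences.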

Comments:	Submitted to the IEEE Transactions on Information Theory
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Information Theory (cs.IT)
Cite as:	arXiv:2212.11083 [cs.LG]
	(or arXiv:2212.11083v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2212.11083

Submission history

From: Isaac Sledge
[v1] Tue, 20 Dec 2022 09:53:22 UTC (43,538 KB)
[v2] Sat, 31 Dec 2022 04:13:31 UTC (42,629 KB)
