"Near Sample-Optimal Reduction-based Policy Learning for Average Reward MDP."

Jinghan Wang, Mengdi Wang, Lin F. Yang (2022)

DOI: 10.48550/ARXIV.2212.00603

access: open

type: Informal or Other Publication

metadata version: 2024-05-07