Absorbing Markov Decision Processes
DUFOUR, François
Méthodes avancées d’apprentissage statistique et de contrôle [ASTRAL]
Institut de Mathématiques de Bordeaux [IMB]
Institut Polytechnique de Bordeaux [Bordeaux INP]
Language
en
Journal article
This document was published in
ESAIM: Control, Optimisation and Calculus of Variations. 2024
EDP Sciences
Date
2024
English abstract
In this paper, we study discrete-time absorbing Markov Decision Processes (MDPs) with a measurable state space, a Borel action space, and a given initial distribution. For such models, solutions to the characteristic equation that are not occupation measures may exist. Several necessary and sufficient conditions are provided to guarantee that any solution to the characteristic equation is an occupation measure. Under the so-called continuity-compactness conditions, we first show that a measure is an occupation measure if and only if it satisfies the characteristic equation and an additional absolute continuity condition. Secondly, it is shown that the set of occupation measures is compact in the weak-strong topology if and only if the model is uniformly absorbing. Several examples are provided to illustrate our results.
Origin
Imported from HAL