DUFOUR, François; PRIETO-RUMEAU, Tomás

doi:10.1080/17442508.2014.939979



DUFOUR, François
Institut de Mathématiques de Bordeaux [IMB]
Quality control and dynamic reliability [CQFD]

PRIETO-RUMEAU, Tomás
Department of Statistics and Operations Research [Madrid]

Journal article

Published in

Stochastics: An International Journal of Probability and Stochastic Processes. 2015, vol. 87, no. 2, pp. 273-307

Taylor & Francis: STM, Behavioural Science and Public Health Titles

Abstract

We consider a discrete-time Markov decision process with Borel state and action spaces, and a possibly unbounded cost function. We assume that the Markov transition kernel is absolutely continuous with respect to some probability measure μ. By replacing this probability measure with its empirical distribution μ_n for a sample of size n, we obtain a finite state space control problem, which is used to provide an approximation of the optimal value and an optimal policy of the original control model. We impose Lipschitz continuity properties on the control model and its associated density functions. We measure the accuracy of the approximation of the optimal value and an optimal policy by means of a non-asymptotic concentration inequality based on the 1-Wasserstein distance between μ and μ_n. Obtaining the solution of the approximating control model numerically is discussed, and an application to an inventory management problem is presented.
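As a rough illustration of the quantity the abstract relies on, the sketch below computes the 1-Wasserstein distance between a reference probability measure and the empirical distribution of a sample drawn from it. It is not taken from the article: the uniform distribution on [0, 1] is an assumed stand-in for the reference measure, and the function name `w1_uniform_vs_empirical` is hypothetical. It uses the standard one-dimensional identity that the 1-Wasserstein distance equals the integral of the absolute difference between the two cumulative distribution functions.

```python
import random


def w1_uniform_vs_empirical(samples):
    """Exact 1-Wasserstein distance between Uniform[0, 1] and the
    empirical distribution of `samples` (all assumed to lie in [0, 1]).

    In one dimension, W1 is the integral of |F_n(x) - F(x)|, where
    F(x) = x is the uniform CDF and F_n is the empirical CDF.
    """
    n = len(samples)
    pts = [0.0] + sorted(samples) + [1.0]
    total = 0.0
    for i in range(n + 1):
        # On [a, b) the empirical CDF is constant and equal to c = i / n.
        a, b, c = pts[i], pts[i + 1], i / n
        if c <= a:        # integrand is x - c on the whole interval
            total += ((b - c) ** 2 - (a - c) ** 2) / 2
        elif c >= b:      # integrand is c - x on the whole interval
            total += ((c - a) ** 2 - (c - b) ** 2) / 2
        else:             # integrand changes sign at x = c
            total += ((c - a) ** 2 + (b - c) ** 2) / 2
    return total


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, chosen arbitrarily
    for n in (10, 100, 1000):
        sample = [rng.random() for _ in range(n)]
        print(n, w1_uniform_vs_empirical(sample))
```

Running the script prints the distance for growing sample sizes; in this toy setting it tends to shrink as n grows, which is the behaviour a concentration inequality of the kind mentioned in the abstract quantifies.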

Keywords

Wasserstein distance

Concentration inequalities

Approximation of the optimal value and an optimal policy

Long-run average cost

Markov decision processes


Approximation of average cost Markov decision processes using empirical distributions and concentration inequalities
