Approximation of average cost Markov decision processes using empirical distributions and concentration inequalities
| hal.structure.identifier | Institut de Mathématiques de Bordeaux [IMB] | |
| hal.structure.identifier | Quality control and dynamic reliability [CQFD] | |
| dc.contributor.author | DUFOUR, François | |
| hal.structure.identifier | Department of Statistics and Operations Research [Madrid] | |
| dc.contributor.author | PRIETO-RUMEAU, Tomás | |
| dc.date.accessioned | 2024-04-04T03:16:40Z | |
| dc.date.available | 2024-04-04T03:16:40Z | |
| dc.date.issued | 2015 | |
| dc.identifier.issn | 1744-2508 | |
| dc.identifier.uri | https://oskar-bordeaux.fr/handle/20.500.12278/194237 | |
| dc.description.abstractEn | We consider a discrete-time Markov decision process with Borel state and action spaces, and possibly unbounded cost function. We assume that the Markov transition kernel is absolutely continuous with respect to some probability measure μ. By replacing this probability measure with its empirical distribution μ_n for a sample of size n, we obtain a finite state space control problem, which is used to provide an approximation of the optimal value and an optimal policy of the original control model. We impose Lipschitz continuity properties on the control model and its associated density functions. We measure the accuracy of the approximation of the optimal value and an optimal policy by means of a non-asymptotic concentration inequality based on the 1-Wasserstein distance between μ and μ_n. Obtaining numerically the solution of the approximating control model is discussed and an application to an inventory management problem is presented. | |
| dc.language.iso | en | |
| dc.publisher | Taylor & Francis: STM, Behavioural Science and Public Health Titles | |
| dc.subject.en | Wasserstein distance | |
| dc.subject.en | Concentration inequalities | |
| dc.subject.en | Approximation of the optimal value and an optimal policy | |
| dc.subject.en | Long-run average cost | |
| dc.subject.en | Markov decision processes | |
| dc.title.en | Approximation of average cost Markov decision processes using empirical distributions and concentration inequalities | |
| dc.type | Journal article | |
| dc.identifier.doi | 10.1080/17442508.2014.939979 | |
| dc.subject.hal | Mathematics [math]/Optimization and control [math.OC] | |
| bordeaux.journal | Stochastics: An International Journal of Probability and Stochastic Processes | |
| bordeaux.page | 273 - 307 | |
| bordeaux.volume | 87 | |
| bordeaux.hal.laboratories | Institut de Mathématiques de Bordeaux (IMB) - UMR 5251 | * |
| bordeaux.issue | 2 | |
| bordeaux.institution | Université de Bordeaux | |
| bordeaux.institution | Bordeaux INP | |
| bordeaux.institution | CNRS | |
| bordeaux.peerReviewed | yes | |
| hal.identifier | hal-01246225 | |
| hal.version | 1 | |
| hal.popular | no | |
| hal.audience | International | |
| hal.origin.link | https://hal.archives-ouvertes.fr//hal-01246225v1 | |