Show simple item record

hal.structure.identifier: Institut de Mathématiques de Bordeaux [IMB]
hal.structure.identifier: Quality control and dynamic reliability [CQFD]
dc.contributor.author: DUFOUR, François
hal.structure.identifier: Department of Mathematics
dc.contributor.author: HORIGUCHI, Masayuki
hal.structure.identifier: Department of Mathematical Sciences [Liverpool]
dc.contributor.author: PIUNOVSKIY, Alexei
dc.date.accessioned: 2024-04-04T02:23:53Z
dc.date.available: 2024-04-04T02:23:53Z
dc.date.issued: 2012
dc.identifier.issn: 0001-8678
dc.identifier.uri: https://oskar-bordeaux.fr/handle/20.500.12278/189777
dc.description.abstractEn: This paper deals with discrete-time Markov Decision Processes (MDPs) under constraints, where all the objectives have the same form of an expected total cost over the infinite time horizon. The existence of an optimal control policy is discussed by using the convex analytic approach. We work under the assumptions that the state and action spaces are general Borel spaces, the model is non-negative and semi-continuous, and there exists an admissible solution with finite cost for the associated linear program. It is worth noting that, in contrast with the classical results of the literature, our hypotheses do not require the MDP to be transient or absorbing. Our first result ensures the existence of an optimal solution to the linear program given by an occupation measure of the process generated by a randomized stationary policy. Moreover, it is shown that this randomized stationary policy provides an optimal solution to this Markov control problem. As a consequence, these results imply that the set of randomized stationary policies is a sufficient set for this optimal control problem. Finally, our last main result states that all optimal solutions of the linear program coincide on a special set with an optimal occupation measure generated by a randomized stationary policy. Several examples are presented to illustrate some theoretical issues and the possible applications of the results developed in the paper.
dc.language.iso: en
dc.publisher: Applied Probability Trust
dc.title.en: The expected total cost criterion for Markov decision processes under constraints: a convex analytic approach
dc.type: Journal article
dc.subject.hal: Mathematics [math] / Optimization and Control [math.OC]
bordeaux.journal: Advances in Applied Probability
bordeaux.page: 774-793
bordeaux.volume: 44
bordeaux.hal.laboratories: Institut de Mathématiques de Bordeaux (IMB) - UMR 5251*
bordeaux.issue: 3
bordeaux.institution: Université de Bordeaux
bordeaux.institution: Bordeaux INP
bordeaux.institution: CNRS
bordeaux.peerReviewed: yes
hal.identifier: hal-00759717
hal.version: 1
hal.popular: no
hal.audience: International
hal.origin.link: https://hal.archives-ouvertes.fr//hal-00759717v1
bordeaux.COinS: ctx_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.jtitle=Advances%20in%20Applied%20Probability&rft.date=2012&rft.volume=44&rft.issue=3&rft.spage=774-793&rft.epage=774-793&rft.eissn=0001-8678&rft.issn=0001-8678&rft.au=DUFOUR,%20Fran%C3%A7ois&HORIGUCHI,%20Masayuki&PIUNOVSKIY,%20Alexei&rft.genre=article
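
For orientation, here is a minimal sketch of the constrained control problem and its convex analytic (linear-programming) reformulation that the abstract describes; the notation (initial distribution \(\nu\), transition kernel \(P\), non-negative costs \(c_0,\dots,c_q\), constraint bounds \(d_1,\dots,d_q\), occupation measure \(\mu\)) is assumed here and not taken from this record:

\[
\text{minimize over policies } \pi:\quad \mathbb{E}^{\pi}_{\nu}\Big[\sum_{t=0}^{\infty} c_0(X_t, A_t)\Big]
\quad\text{subject to}\quad
\mathbb{E}^{\pi}_{\nu}\Big[\sum_{t=0}^{\infty} c_k(X_t, A_t)\Big] \le d_k,\quad k = 1,\dots,q.
\]

The associated linear program is posed over occupation measures \(\mu\) on the state-action space \(X \times A\):

\[
\min_{\mu}\ \int c_0\, d\mu
\quad\text{s.t.}\quad
\int c_k\, d\mu \le d_k,\quad k = 1,\dots,q,
\qquad
\mu(\Gamma \times A) = \nu(\Gamma) + \int P(\Gamma \mid x, a)\, \mu(dx, da)
\quad\text{for all Borel sets } \Gamma \subseteq X.
\]

According to the abstract, this linear program admits an optimal solution given by the occupation measure of a randomized stationary policy, and that same policy is optimal for the original constrained control problem.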


Files in this item


No files are associated with this item.

This item appears in the following collection(s)
