
hal.structure.identifier: Institut Polytechnique de Bordeaux [Bordeaux INP]
hal.structure.identifier: Quality control and dynamic reliability [CQFD]
hal.structure.identifier: Institut de Mathématiques de Bordeaux [IMB]
dc.contributor.author: DUFOUR, François
hal.structure.identifier: Institut de Mathématiques de Bordeaux [IMB]
hal.structure.identifier: Quality control and dynamic reliability [CQFD]
dc.contributor.author: GENADOT, Alexandre
dc.date.accessioned: 2024-04-04T03:03:19Z
dc.date.available: 2024-04-04T03:03:19Z
dc.date.issued: 2020
dc.identifier.issn: 0095-4616
dc.identifier.uri: https://oskar-bordeaux.fr/handle/20.500.12278/193059
dc.description.abstractEn: We consider a discrete-time Markov decision process with Borel state and action spaces. The performance criterion is to maximize a total expected utility determined by an unbounded return function. The existence of optimal strategies is shown under general conditions that allow the reward function to be unbounded both from above and below, and the action sets available to the decision maker at each step to be not necessarily compact. To deal with unbounded reward functions, a new characterization of the weak convergence of probability measures is derived. Our results are illustrated by examples.
dc.language.iso: en
dc.publisher: Springer Verlag (Germany)
dc.subject.en: Markov decision processes
dc.subject.en: Expected total reward
dc.subject.en: Unbounded return
dc.subject.en: Weak convergence of measure
dc.title.en: On the Expected Total Reward with Unbounded Returns for Markov Decision Processes
dc.type: Journal article
dc.identifier.doi: 10.1007/s00245-018-9533-6
dc.subject.hal: Mathematics [math]/Optimization and Control [math.OC]
bordeaux.journal: Applied Mathematics and Optimization
bordeaux.page: 433-450
bordeaux.volume: 82
bordeaux.hal.laboratories: Institut de Mathématiques de Bordeaux (IMB) - UMR 5251*
bordeaux.issue: 2
bordeaux.institution: Université de Bordeaux
bordeaux.institution: Bordeaux INP
bordeaux.institution: CNRS
bordeaux.peerReviewed: yes
hal.identifier: hal-01953985
hal.version: 1
hal.popular: no
hal.audience: International
hal.origin.link: https://hal.archives-ouvertes.fr//hal-01953985v1


Files in this document

There are no files associated with this document.
