Weighted Markov Decision Processes with perturbation
Filar, Jerzy A
In this paper we consider weighted reward MDPs with perturbation. We prove the existence of a delta-optimal simple ultimately deterministic policy under the assumption of “scalar value”. We also prove that there exists a delta-i-optimal simple ultimately deterministic policy in the perturbed weighted MDP for all ε ∈ [0, ε*), even without the assumption of “scalar value”.