Please use this identifier to cite or link to this item:
http://hdl.handle.net/2328/26402
| Title: | A weighted Markov decision process |
| Authors: | Krass, Dmitry; Filar, Jerzy A.; Sinha, Sagnik S. |
| Keywords: | Mathematics; Markov Decision Process |
| Issue Date: | 1992 |
| Publisher: | INFORMS |
| Citation: | Krass, D., Filar, J.A. and Sinha, S.S., 1992. A weighted Markov decision process. Operations Research, 40(6), 1180-1187. |
| Abstract: | The two most commonly considered reward criteria for Markov decision processes are the discounted reward and the long-term average reward. The first tends to "neglect" the future, concentrating on the short-term rewards, while the second one tends to do the opposite. We consider a new reward criterion consisting of the weighted combination of these two criteria, thereby allowing the decision maker to place more or less emphasis on the short-term versus the long-term rewards by varying their weights. The mathematical implications of the new criterion include: the deterministic stationary policies can be outperformed by the randomized stationary policies, which in turn can be outperformed by the nonstationary policies; an optimal policy might not exist. We present an iterative algorithm for computing an ε-optimal nonstationary policy with a very simple structure. |
| URI: | http://hdl.handle.net/2328/26402 |
| ISSN: | 0030-364X |
| Appears in Collections: | Computer Science, Engineering and Mathematics - Collected Works |
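
The weighted criterion described in the abstract can be illustrated with a small sketch. It is not the paper's algorithm: it only evaluates two fixed stationary policies on a hypothetical two-state chain and combines their discounted and average rewards. The function name `weighted_value`, the example chains, the finite-horizon truncation, and the `(1 - beta)` normalization of the discounted term (used here so both terms are on a per-step scale) are all assumptions made for illustration.

```python
def weighted_value(P, r, start, beta, lam, horizon=20_000):
    """Approximate lam*(1-beta)*discounted + (1-lam)*average reward
    for the Markov chain (P, r) induced by a fixed stationary policy.

    Both terms are truncated at `horizon` steps (an approximation).
    """
    n = len(r)
    dist = [1.0 if s == start else 0.0 for s in range(n)]  # state distribution at time t
    discounted = 0.0   # running sum of beta**t * E[reward at t]
    total = 0.0        # running sum of E[reward at t]
    disc = 1.0         # beta**t
    for _ in range(horizon):
        expected = sum(dist[s] * r[s] for s in range(n))
        discounted += disc * expected
        total += expected
        disc *= beta
        # Advance the distribution one step: dist <- dist * P
        dist = [sum(dist[s] * P[s][j] for s in range(n)) for j in range(n)]
    average = total / horizon
    return lam * (1.0 - beta) * discounted + (1.0 - lam) * average


# Hypothetical "greedy" policy: take reward 10 once, then sit in a zero-reward state.
P_greedy, r_greedy = [[0.0, 1.0], [0.0, 1.0]], [10.0, 0.0]
# Hypothetical "patient" policy: stay in state 0 and earn 1 per step forever.
P_patient, r_patient = [[1.0, 0.0], [0.0, 1.0]], [1.0, 0.0]

beta = 0.5
for lam in (0.9, 0.1):
    w_greedy = weighted_value(P_greedy, r_greedy, 0, beta, lam)
    w_patient = weighted_value(P_patient, r_patient, 0, beta, lam)
    print(f"lam={lam}: greedy={w_greedy:.4f}, patient={w_patient:.4f}")
```

With a large weight on the discounted term the greedy policy scores higher, and with a small weight the patient policy does, which is the short-term versus long-term trade-off the weights are meant to control. The sketch says nothing about randomized or nonstationary policies, which the abstract notes can do better still.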