A Computational Note on Markov Decision Processes Without Discounting

dc.contributor.author: Pfeiffer, Paul E. [en_US]
dc.contributor.author: Dennis, J.E. Jr. [en_US]
dc.date.accessioned: 2018-06-18T17:27:38Z [en_US]
dc.date.available: 2018-06-18T17:27:38Z [en_US]
dc.date.issued: 1987-07 [en_US]
dc.date.note: July 1987 [en_US]
dc.description.abstract: The Markov decision process is treated in a variety of forms or cases: finite or infinite horizon, with or without discounting. The finite horizon cases and the case of infinite horizon with discounting have received considerable attention. In the infinite horizon case, with discounting, the problem either receives a linear programming treatment or is treated by the elegant and effective policy-iteration procedure by Ronald Howard. In the undiscounted case, however, a special form of this procedure is required, which detracts from the directness and elegance of the method. The difficulty comes in the step generally called the value-determination procedure. The equations used in this step are linearly dependent, so that the solution of the system of linear equations requires some adjustment. We propose a new computational procedure which avoids this difficulty and works directly with the average next-period gains and powers of the transition probability matrix. The fundamental computational tools are matrix multiplication and addition. [en_US]
dc.format.extent: 10 pp [en_US]
dc.identifier.citation: Pfeiffer, Paul E. and Dennis, J.E. Jr. "A Computational Note on Markov Decision Processes Without Discounting." (1987) https://hdl.handle.net/1911/101629. [en_US]
dc.identifier.digital: TR87-19 [en_US]
dc.identifier.uri: https://hdl.handle.net/1911/101629 [en_US]
dc.language.iso: eng [en_US]
dc.title: A Computational Note on Markov Decision Processes Without Discounting [en_US]
dc.type: Technical report [en_US]
dc.type.dcmi: Text [en_US]
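
The abstract's two technical points can be illustrated with a small sketch: the undiscounted value-determination equations involve the singular matrix I - P, while the long-run gain of a fixed policy can be approximated from powers of the transition matrix and the next-period gains using only matrix multiplication and addition. This is not the report's procedure, which the abstract does not spell out. The chain `P`, the gains `q`, and the helper names are illustrative assumptions, and the shortcut assumes a single recurrent, aperiodic class so that the powers of P converge.

```python
# Illustrative two-state chain under one fixed policy (numbers are assumptions).
P = [[0.5, 0.5],
     [0.4, 0.6]]   # transition probabilities; each row sums to 1
q = [6.0, -3.0]    # expected next-period gain in each state


def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_vec(a, x):
    """Product of a matrix (list of rows) and a vector."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]


def gain_from_powers(p, q, squarings=6):
    """Approximate the long-run gain per period as P^(2**squarings) q.

    Assumes the powers of p converge (one recurrent, aperiodic class).
    """
    power = p
    for _ in range(squarings):
        power = mat_mul(power, power)  # repeated squaring: P, P^2, P^4, ...
    return mat_vec(power, q)


# The rows of P sum to 1, so I - P is singular; this is the linear
# dependence that the usual value-determination step has to work around.
det_i_minus_p = (1 - P[0][0]) * (1 - P[1][1]) - P[0][1] * P[1][0]

gain = gain_from_powers(P, q)
print(det_i_minus_p)
print(gain)
```

For these numbers the determinant is zero up to rounding, and both entries of `gain` are close to 1, the gain per period implied by the stationary distribution (4/9, 5/9). A periodic chain would need averaged powers rather than a single high power.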
Files
Original bundle
Name: TR87-19.pdf
Size: 136.25 KB
Format: Adobe Portable Document Format