A Computational Note on Markov Decision Processes Without Discounting

dc.contributor.author: Pfeiffer, Paul E.
dc.contributor.author: Dennis, J.E. Jr.
dc.date.accessioned: 2018-06-18T17:27:38Z
dc.date.available: 2018-06-18T17:27:38Z
dc.date.issued: 1987-07
dc.date.note: July 1987
dc.description.abstract: The Markov decision process is treated in a variety of forms or cases: finite or infinite horizon, with or without discounting. The finite-horizon cases and the infinite-horizon case with discounting have received considerable attention. In the infinite-horizon case with discounting, the problem is treated either by linear programming or by the elegant and effective policy-iteration procedure due to Ronald Howard. In the undiscounted case, however, a special form of this procedure is required, which detracts from the directness and elegance of the method. The difficulty arises in the step generally called the value-determination procedure: the equations used in this step are linearly dependent, so that solving the system of linear equations requires some adjustment. We propose a new computational procedure that avoids this difficulty and works directly with the average next-period gains and powers of the transition probability matrix. The fundamental computational tools are matrix multiplication and addition.
dc.format.extent: 10 pp
dc.identifier.citation: Pfeiffer, Paul E. and Dennis, J.E. Jr. "A Computational Note on Markov Decision Processes Without Discounting." (1987) https://hdl.handle.net/1911/101629.
dc.identifier.digital: TR87-19
dc.identifier.uri: https://hdl.handle.net/1911/101629
dc.language.iso: eng
dc.title: A Computational Note on Markov Decision Processes Without Discounting
dc.type: Technical report
dc.type.dcmi: Text
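
The abstract above notes that the proposed procedure works directly with the average next-period gains and powers of the transition probability matrix, using only matrix multiplication and addition. The Python sketch below illustrates that kind of computation; it is not the report's exact algorithm. It assumes a fixed policy with transition matrix P and expected one-step gain vector q, and approximates the long-run average gain per state as the Cesaro average of the vectors P^k q. The function name average_gain, the parameter n_terms, and the two-state example are invented for illustration.

import numpy as np

def average_gain(P, q, n_terms=500):
    # Approximate g = P* q, where P* is the Cesaro (averaged) limit of the
    # powers of P, by summing P^k q for k = 0 .. n_terms-1 and dividing by
    # n_terms. Only matrix-vector multiplication and addition are used.
    P = np.asarray(P, dtype=float)
    v = np.asarray(q, dtype=float).copy()   # current term P^k q (k = 0 at the start)
    total = v.copy()                        # running sum of the terms
    for _ in range(n_terms - 1):
        v = P @ v                           # advance to the next power of P
        total += v
    return total / n_terms

# Hypothetical two-state example: expected gains of 3 and 1 per period.
P = np.array([[0.5, 0.5],
              [0.4, 0.6]])
q = np.array([3.0, 1.0])
print(average_gain(P, q))   # both entries approach the common long-run average gain (about 1.89)

Under this averaging scheme no linearly dependent system of equations has to be solved, which is the difficulty the abstract attributes to the usual value-determination step.
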
Files
Original bundle
Name: TR87-19.pdf
Size: 136.25 KB
Format: Adobe Portable Document Format