Examples in Markov Decision Processes.
|Author / Creator:||Piunovskiy, A. B.|
|Imprint:||Singapore : World Scientific Publishing Company, 2012.|
|Description:||1 online resource (308 pages)|
|Series:||Imperial College Press Optimization Series ; v. 2|
Imperial College Press optimization series.
MATHEMATICS -- Probability & Statistics -- Stochastic Processes.
|URL for this record:||http://pi.lib.uchicago.edu/1001/cat/bib/11175931|
Table of Contents:
- Preface; 1. Finite-Horizon Models; 1.1 Preliminaries; 1.2 Model Description; 1.3 Dynamic Programming Approach; 1.4 Examples; 1.4.1 Non-transitivity of the correlation; 1.4.2 The more frequently used control is not better; 1.4.3 Voting; 1.4.4 The secretary problem; 1.4.5 Constrained optimization; 1.4.6 Equivalent Markov selectors in non-atomic MDPs; 1.4.7 Strongly equivalent Markov selectors in nonatomic MDPs; 1.4.8 Stock exchange; 1.4.9 Markov or non-Markov strategy? Randomized or not? When is the Bellman principle violated?; 1.4.10 Uniformly optimal, but not optimal strategy.
- 1.4.11 Martingales and the Bellman principle; 1.4.12 Conventions on expectation and infinities; 1.4.13 Nowhere-differentiable function v_t(x); discontinuous function v_t(x); 1.4.14 The non-measurable Bellman function; 1.4.15 No one strategy is uniformly ε-optimal; 1.4.16 Semi-continuous model; 2. Homogeneous Infinite-Horizon Models: Expected Total Loss; 2.1 Homogeneous Non-discounted Model; 2.2 Examples; 2.2.1 Mixed Strategies; 2.2.2 Multiple solutions to the optimality equation; 2.2.3 Finite model: multiple solutions to the optimality equation; conserving but not equalizing strategy.
- 2.2.4 The single conserving strategy is not equalizing and not optimal; 2.2.5 When strategy iteration is not successful; 2.2.6 When value iteration is not successful; 2.2.7 When value iteration is not successful: positive model I; 2.2.8 When value iteration is not successful: positive model II; 2.2.9 Value iteration and stability in optimal stopping problems; 2.2.10 A non-equalizing strategy is uniformly optimal; 2.2.11 A stationary uniformly ε-optimal selector does not exist (positive model); 2.2.12 A stationary uniformly ε-optimal selector does not exist (negative model).
- 2.2.13 Finite-action negative model where a stationary uniformly ε-optimal selector does not exist; 2.2.14 Nearly uniformly optimal selectors in negative models; 2.2.15 Semi-continuous models and the blackmailer's dilemma; 2.2.16 Not a semi-continuous model; 2.2.17 The Bellman function is non-measurable and no one strategy is uniformly ε-optimal; 2.2.18 A randomized strategy is better than any selector (finite action space); 2.2.19 The fluid approximation does not work; 2.2.20 The fluid approximation: refined model; 2.2.21 Occupation measures: phantom solutions.
- 2.2.22 Occupation measures in transient models; 2.2.23 Occupation measures and duality; 2.2.24 Occupation measures: compactness; 2.2.25 The bold strategy in gambling is not optimal (house limit); 2.2.26 The bold strategy in gambling is not optimal (inflation); 2.2.27 Search strategy for a moving target; 2.2.28 The three-way duel ("Truel"); 3. Homogeneous Infinite-Horizon Models: Discounted Loss; 3.1 Preliminaries; 3.2 Examples; 3.2.1 Phantom solutions of the optimality equation; 3.2.2 When value iteration is not successful: positive model.