Multi-armed restless bandits, index policies, and dynamic priority allocation

José Niño-Mora
Department of Statistics
Carlos III University of Madrid

  • Abstract
    This paper presents a brief introduction to the emerging research field of multi-armed restless bandits (MARBs), which substantially extend the modeling power of classic multi-armed bandits. MARBs are Markov decision process models for optimal dynamic priority allocation to a collection of stochastic binary-action (active/passive) projects evolving over time. Interest in MARBs has grown steadily, spurred by the breadth of their possible applications. Although MARBs are generally intractable, a Lagrangian relaxation and decomposition approach yields a unifying design principle for heuristic priority-index policies, which are often tractable and nearly optimal, along with an upper bound on the optimal reward.
  • Keywords: Restless bandits, index policies, priorities, stochastic scheduling.
  • AMS Subject classifications: 90B36, 90C40, 90B05, 90B22, 90B18.
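To make the priority-index idea concrete, the following is a minimal sketch of how such a policy operates: at each period, the controller activates the projects whose current states carry the highest index and lets the rest stay passive. The two-state project model, the transition probabilities, the rewards, and the index values below are all illustrative assumptions, not taken from the paper; in a Whittle-type policy the indices would be derived from single-project Lagrangian subproblems.

```python
import random

# Hypothetical two-state restless project: state 0 = "bad", 1 = "good".
# Rewards, transition probabilities, and index values are illustrative only.
ACTIVE_REWARD = {0: 0.0, 1: 1.0}
# P(next state = 1 | action, current state)
P_GOOD = {("active", 0): 0.6, ("active", 1): 0.8,
          ("passive", 0): 0.2, ("passive", 1): 0.5}
# Illustrative priority index attached to each state.
INDEX = {0: 0.3, 1: 0.9}

def index_policy_step(states, m, rng):
    """Activate the m projects whose current states have the highest index;
    collect the active rewards and draw each project's next state."""
    ranked = sorted(range(len(states)),
                    key=lambda i: INDEX[states[i]], reverse=True)
    active = set(ranked[:m])
    reward = 0.0
    new_states = []
    for i, s in enumerate(states):
        action = "active" if i in active else "passive"
        if action == "active":
            reward += ACTIVE_REWARD[s]
        new_states.append(1 if rng.random() < P_GOOD[(action, s)] else 0)
    return new_states, reward

def simulate(n_projects=5, m=2, horizon=1000, seed=0):
    """Long-run average reward of the index policy on n_projects projects,
    activating m of them per period."""
    rng = random.Random(seed)
    states = [0] * n_projects
    total = 0.0
    for _ in range(horizon):
        states, r = index_policy_step(states, m, rng)
        total += r
    return total / horizon
```

With `m = 2` active projects and a unit reward per good active project, the average reward per period lies in `(0, 2]`; the Lagrangian relaxation discussed in the abstract would supply an upper bound against which this heuristic's performance can be gauged.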