Learning and Planning in Sequential Decision Problems

Talk
Yasin Abbasi-Yadkori
Queensland University of Technology
Time: 02.17.2016 11:00 to 12:00
Location: AVW 4172

Many decision problems are interactive: the decision maker executes an action, receives feedback from the environment, and then uses that feedback to improve the next decision. Such sequential decision problems are particularly challenging when the decision and state spaces are large, as is often the case in areas such as robotics, healthcare, and finance. In this talk, I will present my research on planning and learning in large sequential decision problems. The problems studied range from the basic linear regression problem, to more complicated problems with limited feedback such as bandit linear optimization and the linear quadratic (LQ) problem, to problems that are both computationally and statistically challenging such as reinforcement learning and Markov decision processes. I will present algorithms that are provably data-efficient and can be executed in real time. I will start by discussing the challenges of analyzing the least-squares method when the input depends on past observations. In particular, I will show a tight data-dependent confidence set and the first sparsity confidence set for the linear regression problem. Equipped with these new confidence sets, I will demonstrate a data-efficient adaptive controller and show the first finite-time performance guarantee for the LQ problem. Finally, I will focus on the computational aspects of sequential decision problems and discuss convex optimization reductions for Markov decision problems.