Expert course

Curiosity Driven Deep Reinforcement Learning

How Agents Can Learn In Environments With No Rewards

Rating: 4.4 • 429 ratings • 2,451 students • 4 total hours • 27 lectures

Course facts

Last updated 07/2023
Instructor: Phil Tabor
practical AI capability and workflow improvement

What you'll learn

Practical outcomes

How to Code A3C Agents
How to Do Parallel Processing in Python
How to Implement Deep Reinforcement Learning Papers
How to Code the Intrinsic Curiosity Module

Curriculum

5 sections • 27 lectures • 4h 9m total length

Introduction • 3 lectures • 12min

What You Will Learn in this Course • 05:25
How to Succeed in this Course • 03:44
Required Background, Software, and Hardware • 03:01

Fundamental Concepts • 4 lectures • 26min

A Brief Review of Deep Reinforcement Learning and Actor Critic Methods • 09:45
Code Review of Basic Actor Critic Agent • 09:17
A Crash Course in Asynchronous Advantage Actor Critic Methods • 05:41
Our Code Structure • 01:42

Paper Analysis: Asynchronous Methods for Deep Reinforcement Learning • 12 lectures • 2hr 5min

How to Read and Implement Research Papers • 06:54
A3C Paper: Abstract and Introduction • 04:15
Crash Course in Parallel Processing in Python • 11:39
A3C Paper: Related Work, Reinforcement Learning Background • 05:09
A3C Paper: The Asynchronous Reinforcement Learning Framework • 07:59
Coding our Actor Critic Network • 11:50
Learning with Generalized Advantage Estimation • 13:59
Coding a Minimalist Replay Memory • 03:27
Coding the Shared Adam Optimizer • 03:28
A3C Paper: Experiments and Discussion • 10:00
How to Modify the Open AI Gym Atari Environments • 19:27
Coding Our Main Loop and Evaluating Our Agent • 27:10

Paper Analysis: Curiosity Driven Exploration by Self Supervised Prediction • 6 lectures • 1hr 2min

Paper Overview • 01:17
ICM Paper: Abstract and Introduction • 06:45
ICM Paper: Curiosity Driven Exploration • 07:38
Experimental Setup and Coding Our ICM Module • 27:54
ICM Paper: Experiments, Related Work, and Discussion • 11:17
Setting Up the Mini World and Training Our ICM Agent • 06:51

Appendix • 2 lectures • 24min

Setting Up Our Virtual Environment for the New Open AI Gym • 09:09
Making Our Agents Compliant with the New Gym Interface • 14:31

Who it is for

This course is for advanced students of deep reinforcement learning.

Course description

Overview

If reinforcement learning is to serve as a viable path to artificial general intelligence, it must learn to cope with environments with sparse or totally absent rewards. Most real-life systems provide rewards only after many time steps, leaving the agent with little information on which to build a successful policy. Curiosity-based reinforcement learning addresses this problem by giving the agent an innate sense of curiosity about its world, enabling it to explore and learn successful policies for navigating that world.

In this advanced course on deep reinforcement learning, motivated students will learn how to implement cutting-edge artificial intelligence research papers from scratch. This is a fast-paced course for those who are experienced in coding up actor critic agents on their own. We'll code up two papers in this course, using the popular PyTorch framework.

The first paper covers asynchronous methods for deep reinforcement learning, better known as the asynchronous advantage actor critic algorithm (A3C). Here students will discover a framework for learning that doesn't require a GPU. We will learn how to implement multithreading in Python and use it to train multiple actor critic agents in parallel. We will go beyond the basic implementation from the paper and implement a recent improvement to reinforcement learning known as generalized advantage estimation. We will test our agents in the Pong environment from the Open AI Gym's Atari library, and achieve nearly world-class performance in just a few hours.

From there, we move on to the heart of the course: learning in environments with sparse or totally absent rewards. This paradigm uses the agent's curiosity about the environment as an intrinsic reward that motivates the agent to explore and learn generalizable skills. We'll implement the intrinsic curiosity module (ICM), which is a bolt-on module for any deep reinforcement learning algorithm.
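To make the idea of curiosity as an intrinsic reward concrete, here is a minimal sketch of the reward signal described in the ICM paper: the agent is rewarded in proportion to how badly its forward model predicted the features of the next state. This is not the course's code. The function name `intrinsic_reward`, the scaling parameter `eta`, and the use of plain Python lists in place of PyTorch tensors are all assumptions made for illustration.

```python
from typing import Sequence


def intrinsic_reward(predicted_next_features: Sequence[float],
                     actual_next_features: Sequence[float],
                     eta: float = 1.0) -> float:
    """Curiosity bonus: (eta / 2) * squared error of the forward model.

    Both arguments are feature vectors for the next state; `eta` is a
    hypothetical scaling hyperparameter chosen for this sketch.
    """
    if len(predicted_next_features) != len(actual_next_features):
        raise ValueError("feature vectors must have the same length")
    squared_error = sum(
        (p - a) ** 2
        for p, a in zip(predicted_next_features, actual_next_features)
    )
    return 0.5 * eta * squared_error


# A perfectly predicted transition is "boring" and earns no bonus.
print(intrinsic_reward([1.0, 2.0], [1.0, 2.0]))  # 0.0
# A badly predicted transition is "surprising" and earns a large bonus.
print(intrinsic_reward([3.0, 4.0], [0.0, 0.0]))  # 12.5
```

Because the bonus depends only on prediction error, it can be added to whatever extrinsic reward the environment provides, including none at all.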
We will train and test our agent in a maze-like environment that yields a reward only when the agent reaches the objective. A clear performance gain over the vanilla A3C algorithm will be demonstrated, showing the power of curiosity driven deep reinforcement learning.

Please keep in mind this is a fast-paced course for motivated and advanced students. There will be only a very brief review of the fundamental concepts of reinforcement learning and actor critic methods, and from there we will jump right into reading and implementing papers.

The beauty of both the ICM and asynchronous methods is that these paradigms can be applied to nearly any other reinforcement learning algorithm. Both are highly adaptable and can be plugged in with little modification to algorithms like proximal policy optimization, soft actor critic, or deep Q learning.

Students will learn how to:

Implement deep reinforcement learning papers
Leverage multi-core CPUs with parallel processing in Python
Code the A3C algorithm from scratch
Code the ICM from first principles
Code generalized advantage estimation
Modify the Open AI Gym Atari Library
Write extensible modular code

This course is launching with the PyTorch implementation, with a TensorFlow 2 version coming.

I'll see you on the inside.
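For readers who want a preview of generalized advantage estimation, named in the description above, the following is a minimal sketch of the standard backward recursion over one rollout. It is not the course's implementation. The function name `compute_gae`, the default values of `gamma` and `lam`, and the use of plain Python lists are assumptions made for illustration.

```python
from typing import List, Sequence


def compute_gae(rewards: Sequence[float],
                values: Sequence[float],
                last_value: float,
                gamma: float = 0.99,
                lam: float = 0.95) -> List[float]:
    """Return one advantage estimate per time step of a rollout.

    `values[t]` is the critic's estimate for the state at step t, and
    `last_value` is its estimate for the state after the final step
    (use 0.0 if the episode terminated there).
    """
    advantages = [0.0] * len(rewards)
    gae = 0.0
    next_value = last_value
    # Walk backwards so each step can reuse the accumulated future term.
    for t in reversed(range(len(rewards))):
        # One-step temporal difference error.
        delta = rewards[t] + gamma * next_value - values[t]
        # Exponentially weighted sum of this and all later TD errors.
        gae = delta + gamma * lam * gae
        advantages[t] = gae
        next_value = values[t]
    return advantages


# With gamma = lam = 1 and a zero critic, advantages are plain returns.
print(compute_gae([1.0, 1.0], [0.0, 0.0], 0.0, gamma=1.0, lam=1.0))  # [2.0, 1.0]
```

Setting `lam` to 0 reduces the estimate to the one-step temporal difference error, while `lam` equal to 1 recovers the full discounted return minus the critic's baseline; values in between trade bias against variance.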

Instructor

Phil Tabor

Machine Learning Engineer

In 2012 I received my PhD in experimental condensed matter physics from West Virginia University. Following that I was a dry etch process engineer for Intel Corporation, where I leveraged big data to make essential process improvements for mission critical products. Since leaving Intel in 2015, I have worked as a contract and freelance deep learning and artificial intelligence engineer.