This blog records some of my thoughts while studying deep reinforcement learning and related fields.
Posts
Tiny Project 2: Snake Game AI
Tutorial 3: RLlib (4) — Reinforcement Learning with RLlib in the Unity Game Engine
Tutorial 3: RLlib (3) — Scaling Multi-Agent Reinforcement Learning
Tutorial 3: RLlib (2) — A Gentle RLlib Tutorial
Tutorial 3: RLlib (1) — RLlib in 60 seconds
Paper 56: Unity: A General Platform for Intelligent Agents
Paper 55: AndroidEnv: A Reinforcement Learning Platform for Android
Paper 54: FinRL: A Deep Reinforcement Learning Library for Automated Stock Trading in Quantitative Finance
Tiny Project 1: Using Reinforcement Learning to Trade Stocks
Tutorial 2: Stable Baselines
Tutorial 1: Creating a Custom OpenAI Gym Environment for Stock Trading
Concept 9: Deep Reinforcement Learning for Trading (2)
Concept 9: Deep Reinforcement Learning for Trading (1)
Paper 53: Large-Scale Study of Curiosity-Driven Learning
Paper 52: Sample Factory: Egocentric 3D Control from Pixels at 100000 FPS with Asynchronous Reinforcement Learning
Concept 8: Dual Gradient Descent
Paper 51: Learning to Walk in the Real World with Minimal Human Effort
Paper 50: Soft Q-Network (SQN)
Paper 49: Reinforcement Learning with Deep Energy-Based Policies (Soft Q Learning)
Paper 48: Soft Actor-Critic Algorithms and Applications (SAC)
Speech 13: QuantCon 2018-Tom Starke: Reinforcement Learning for Trading Practical Examples and Lessons Learned
Paper 47: Rainbow: Combining Improvements in Deep Reinforcement Learning
Paper 46: A Distributional Perspective on Reinforcement Learning (Categorical DQN)
Paper 45: Parameter Space Noise for Exploration
Paper 44: Noisy Networks for Exploration (Noisy DQN)
Paper 43: Dueling Network Architectures for Deep Reinforcement Learning (Dueling DQN)
Paper 42: A Solution to China Competitive Poker Using Deep Learning
Speech 12: Unity-Jeffrey Shih: Successfully Use Deep Reinforcement Learning in Testing and NPC Development
Paper 41: Attention Is All You Need
Paper 40: Show, Attend and Tell: Neural Image Caption Generation with Visual Attention
Paper 39: Deep Attention Recurrent Q-Network (DARQN)
Paper 38: Recurrent Models of Visual Attention (RAM)
Concept 7: Embedding
Paper 37: Dota 2 with Large Scale Deep Reinforcement Learning (OpenAI Five) — Appendix
Paper 37: Dota 2 with Large Scale Deep Reinforcement Learning (OpenAI Five) — Main Text
Paper 36: Fiber: A Platform for Efficient Development and Distributed Training for Reinforcement Learning and Population-Based Methods
Paper 35: Human-level Control through Deep Reinforcement Learning (DQN2015)
Paper 34: Playing Atari with Deep Reinforcement Learning (DQN2013)
Speech 11: Webinar: Defeating Bots with Machine Learning
Speech 10: Webinar: Optimize Your Game Architecture for AI
Speech 9: Webinar: Game Playing Bots for Game Development
Paper 33: Implementation Matters in Deep Policy Gradients: A Case Study on PPO and TRPO
Paper 32: A Closer Look at Deep Policy Gradients
Paper 31: Deep Reinforcement Learning that Matters
Paper 30: High-dimensional Continuous Control using Generalized Advantage Estimation (GAE)
Paper 29: Prioritized Experience Replay (PER)
Speech 8: OpenAI-Ilya Sutskever: Meta-Learning and Self-Play
Speech 7: DeepMind-Demis Hassabis: The Power of Self-Learning Systems
Paper 28: Distributed Distributional Deterministic Policy Gradients (D4PG)
Paper 27: Emergence of Locomotion Behaviours in Rich Environments (DPPO)
Paper 26: GA3C: GPU-based A3C for Deep Reinforcement Learning
Paper 25: Acme: A Research Framework for Distributed Reinforcement Learning
Paper 24: RLlib: Abstractions for Distributed Reinforcement Learning
Speech 6: NetEase-Tangjie Lv: Using Reinforcement Learning to Develop Game AI
Paper 23: Ray: A Distributed Framework for Emerging AI Applications
Speech 5: ScaledML 2020-Andrej Karpathy: AI for Full-Self Driving
Paper 22: Google Research Football: A Novel Reinforcement Learning Environment
Paper 21: SEED RL: Scalable and Efficient Deep-RL with Accelerated Central Inference
Paper 20: Making Efficient Use of Demonstrations to Solve Hard Exploration Problems (R2D3)
Paper 19: Deep Recurrent Q-Learning for Partially Observable MDPs (DRQN)
Paper 18: Recurrent Experience Replay in Distributed Reinforcement Learning (R2D2)
Concept 6: RNN, LSTM and GRU
Speech 4: NeurIPS 2018-Joelle Pineau: Reproducible, Reusable, and Robust Reinforcement Learning
Speech 3: NeurIPS 2019-Katja Hofmann: Reinforcement Learning Past, Present, and Future Perspective
Speech 2: KHIPU 2019-Nando de Freitas: Reinforcement Learning
Paper 17: Distributed Prioritized Experience Replay (Ape-X)
Paper 16: IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures
Paper 15: Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor (SAC)
Paper 14: Addressing Function Approximation Error in Actor-Critic Methods (TD3)
Paper 13: Deep Reinforcement Learning with Double Q-learning (Double DQN)
Paper 12: Continuous Control with Deep Reinforcement Learning (DDPG)
Paper 11: Deterministic Policy Gradient Algorithms (DPG)
Paper 10: Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments (MADDPG)
Paper 9: Hierarchical Reinforcement Learning for Multi-agent MOBA Game (Honour of Kings)
Paper 8: Playing FPS Games with Deep Reinforcement Learning (ViZDoom)
Paper 7: Mastering Complex Control in MOBA Games with Deep Reinforcement Learning (Honour of Kings)
Paper 6: Hierarchical Macro Strategy Model for MOBA Game AI (Honour of Kings)
Paper 5: Curiosity-driven Exploration by Self-supervised Prediction (ICM)
Paper 4: Asynchronous Methods for Deep Reinforcement Learning (A3C)
Speech 1: DLRLSS 2019-James Wright: Multi-Agent Systems
Paper 3: Scalable Trust-Region Method for Deep Reinforcement Learning using Kronecker-Factored Approximation (ACKTR)
Paper 2: Proximal Policy Optimization Algorithms (PPO)
Paper 1: Trust Region Policy Optimization (TRPO)
Concept 5: Kullback–Leibler Divergence
Concept 4: Monte Carlo Tree Search
Concept 3: Artificial General Intelligence
Concept 2: Multimodal Distribution
Concept 1: Cross Entropy
Hello World