OpenAI Research Overview
Category: Technical
By Jeremy Nixon [[email protected]]. Nov 2017.
Categories indicate the domain in which each paper's innovation is novel.
- Reinforcement Learning
  - Multi-Agent
  - Exploration
  - Imitation Learning
- Deep Learning
- Memory
- Program Learning
- Representation Learning
- Variational Inference
- Generative Models
- Evolution
- Applications
  - Security / Safety
  - Robotics
- Environments
Reinforcement Learning
- Multi-Agent
- Learning with Opponent-Learning Awareness
- https://arxiv.org/abs/1709.04326
- Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments
- https://arxiv.org/abs/1706.02275
- Emergence of Grounded Compositional Language in Multi-Agent Populations
- https://arxiv.org/abs/1703.04908
- Exploration
- Parameter Space Noise for Exploration
- https://arxiv.org/abs/1706.01905
- UCB and InfoGain Exploration via Q-Ensembles
- https://arxiv.org/abs/1706.01502
- #Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning
- https://arxiv.org/abs/1611.04717
- VIME: Variational Information Maximizing Exploration
- https://arxiv.org/abs/1605.09674
- Imitation Learning
- Third-Person Imitation Learning
- https://arxiv.org/abs/1703.01703
- One-Shot Imitation Learning
- https://arxiv.org/abs/1703.07326
- RL2: Fast Reinforcement Learning via Slow Reinforcement Learning
- https://arxiv.org/abs/1611.02779
- Teacher-Student Curriculum Learning
- https://arxiv.org/abs/1707.00183
- Equivalence Between Policy Gradients and Soft Q-Learning
- https://arxiv.org/abs/1704.06440
- Prediction and Control with Temporal Segment Models
- https://arxiv.org/abs/1703.04070
Deep Learning
- Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks
- https://arxiv.org/abs/1602.07868
Memory
- Hindsight Experience Replay [Also, Reinforcement Learning]
- https://arxiv.org/pdf/1707.01495.pdf
Program Learning
- Extensions and Limitations of the Neural GPU
- https://arxiv.org/abs/1611.00736
Representation Learning
- Variational Lossy Autoencoder
- https://arxiv.org/abs/1611.02731
Variational Inference
- Improving Variational Inference with Inverse Autoregressive Flow
- https://arxiv.org/abs/1606.04934
Generative Models
- Generative Adversarial Networks
- InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets [Also, Representation Learning]
- https://arxiv.org/abs/1606.03657
- Improved Techniques for Training GANs
- https://arxiv.org/abs/1606.03498
- On the Quantitative Analysis of Decoder-Based Generative Models
- https://arxiv.org/abs/1611.04273
- A Connection between Generative Adversarial Networks, Inverse Reinforcement Learning, and Energy Based Models [Also Reinforcement Learning]
- https://arxiv.org/pdf/1611.03852.pdf
- PixelCNN++: Improving the PixelCNN with Discretized Logistic Mixture Likelihood and Other Modifications
- https://arxiv.org/abs/1701.05517
- Learning to Generate Reviews and Discovering Sentiment
- https://arxiv.org/abs/1704.01444
Evolution
- Evolution Strategies as a Scalable Alternative to Reinforcement Learning
- https://arxiv.org/abs/1703.03864
Applications
- Security / Safety
- Deep Reinforcement Learning from Human Preferences
- https://arxiv.org/abs/1706.03741
- Concrete Problems in AI Safety
- https://arxiv.org/abs/1606.06565
- Adversarial Attacks on Neural Network Policies
- https://arxiv.org/abs/1702.02284
- Adversarial Training Methods for Semi-Supervised Text Classification
- https://arxiv.org/abs/1605.07725
- Semi-Supervised Knowledge Transfer for Deep Learning from Private Training Data
- https://arxiv.org/abs/1610.05755
- AI Safety via Debate
- https://arxiv.org/pdf/1805.00899.pdf
- Robotics
- Domain Randomization for Transferring Deep Neural Networks from Simulation to the Real World
- https://arxiv.org/abs/1703.06907
- Transfer from Simulation to Real World through Learning Deep Inverse Dynamics Model
- https://arxiv.org/abs/1610.03518
Environments
- Infrastructure for Deep Learning
- https://blog.openai.com/infrastructure-for-deep-learning/
- Universe
- https://blog.openai.com/universe/
- OpenAI Gym
- https://arxiv.org/abs/1606.01540
OpenAI Researchers
- Paul Christiano
- Ryan Lowe
- Jean Harb
- Pieter Abbeel
- Igor Mordatch
- Matthias Plappert
- Rein Houthooft
- Prafulla Dhariwal
- Szymon Sidor
- Richard Y. Chen
- Xi Chen
- Marcin Andrychowicz
- John Schulman
- Alec Radford
- Rafal Jozefowicz
- Yan Duan
- Bradly C. Stadie
- Jonathan Ho
- Jonas Schneider
- Ilya Sutskever
- Wojciech Zaremba
- Rachel Fong
- Josh Tobin
- Alex Ray
- Nikhil Mishra
- Ian Goodfellow
- Tim Salimans
- Diederik P. Kingma
- Andrej Karpathy
- Yuri Burda
- Zain Shah
- Trevor Blackwell
- Vicki Cheung
Salaries of top employees [Pg. 28]; hours and salaries of top employees [Pg. 7]. OpenAI spent $11 million in 2016, $7 million of it on salaries. For comparison, DeepMind spent $138 million in 2016.
Source: Original Google Doc