OpenAI Research Overview

Category: Technical

By Jeremy Nixon [[email protected]]. Nov 2017.

Categories: each paper is filed under the domain in which its innovation is novel.

  1. Reinforcement Learning

    1. Multi-Agent
    2. Exploration
    3. Imitation Learning
  2. Deep Learning

  3. Memory

  4. Program Learning

  5. Representation Learning

  6. Variational Inference

  7. Generative Models

  8. Evolution

  9. Applications

    1. Security / Safety
    2. Robotics
  10. Environments

  11. Reinforcement Learning

    1. Multi-Agent
      1. Learning with Opponent-Learning Awareness
        1. https://arxiv.org/abs/1709.04326
      2. Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments
        1. https://arxiv.org/abs/1706.02275
      3. Emergence of Grounded Compositional Language in Multi-Agent Populations
        1. https://arxiv.org/abs/1703.04908
    2. Exploration
      1. Parameter Space Noise for Exploration
        1. https://arxiv.org/abs/1706.01905
      2. UCB and InfoGain Exploration via Q-Ensembles
        1. https://arxiv.org/abs/1706.01502
      3. #Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning
        1. https://arxiv.org/abs/1611.04717
      4. VIME: Variational Information Maximizing Exploration
        1. https://arxiv.org/abs/1605.09674
    3. Imitation Learning
      1. Third-Person Imitation Learning
        1. https://arxiv.org/abs/1703.01703
      2. One-Shot Imitation Learning
        1. https://arxiv.org/abs/1703.07326
    4. RL²: Fast Reinforcement Learning via Slow Reinforcement Learning
      1. https://arxiv.org/abs/1611.02779
    5. Teacher-Student Curriculum Learning
      1. https://arxiv.org/abs/1707.00183
    6. Equivalence Between Policy Gradients and Soft Q-Learning
      1. https://arxiv.org/abs/1704.06440
    7. Prediction and Control with Temporal Segment Models
      1. https://arxiv.org/abs/1703.04070
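
Several of the exploration papers above add an intrinsic bonus to the environment reward. A minimal sketch of the count-based bonus from the #Exploration paper (the β value and plain table-lookup counts are illustrative; the paper hashes states so counting stays tractable in large state spaces):

```python
import math
from collections import defaultdict

def exploration_bonus(counts, state, beta=0.1):
    """Count-based bonus beta / sqrt(N(s)) (arXiv:1611.04717): rarely
    visited states earn a larger intrinsic reward, nudging the agent
    toward novelty. beta is an illustrative hyperparameter."""
    return beta / math.sqrt(counts[state])

# Toy usage: a frequently visited state earns a smaller bonus than a
# state seen only once.
counts = defaultdict(int)
for s in ["a"] * 100 + ["b"]:
    counts[s] += 1

bonus_a = exploration_bonus(counts, "a")  # 0.1 / sqrt(100) = 0.01
bonus_b = exploration_bonus(counts, "b")  # 0.1 / sqrt(1)   = 0.1
```

In practice the bonus is added to the environment reward at each step, so the shaped return trades off exploitation against visiting under-counted states.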
  12. Deep Learning

    1. Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks
      1. https://arxiv.org/abs/1602.07868
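
The weight-normalization paper above reparameterizes each weight vector as w = g · v / ‖v‖, decoupling direction from scale. A minimal NumPy sketch (function and variable names are illustrative, not the paper's API):

```python
import numpy as np

def weightnorm_linear(x, v, g, b):
    """Weight-normalized linear unit (arXiv:1602.07868): the effective
    weight is w = g * v / ||v||, so the direction comes from v and the
    scale from the scalar g, which the paper reports speeds up training."""
    w = g * v / np.linalg.norm(v)
    return x @ w + b

# Whatever scale v has, the effective weight's norm is exactly g.
rng = np.random.default_rng(0)
v = 100.0 * rng.normal(size=3)   # deliberately badly scaled
g, b = 2.0, 0.0
w = g * v / np.linalg.norm(v)
y = weightnorm_linear(np.ones(3), v, g, b)
```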
  13. Memory

    1. Hindsight Experience Replay [Also, Reinforcement Learning]
      1. https://arxiv.org/pdf/1707.01495.pdf
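
Hindsight Experience Replay turns failed episodes into useful training signal by replaying them with substituted goals. A minimal sketch of the "final" relabeling strategy (the dict-based transitions and `reward_fn` signature are illustrative, not the paper's API):

```python
def her_relabel(episode, reward_fn):
    """HER 'final' strategy (arXiv:1707.01495): replay each transition
    with the goal replaced by the goal actually achieved at the end of
    the episode, recomputing the (sparse) reward under that new goal."""
    achieved = episode[-1]["achieved_goal"]
    relabeled = []
    for t in episode:
        relabeled.append({
            **t,
            "goal": achieved,
            "reward": reward_fn(t["achieved_goal"], achieved),
        })
    return relabeled

# Toy usage: sparse reward = 0 on success, -1 otherwise. The original
# episode never reached its goal (5), so every reward was -1.
reward_fn = lambda ag, g: 0.0 if ag == g else -1.0
episode = [
    {"obs": 0, "action": 1, "achieved_goal": 1, "goal": 5, "reward": -1.0},
    {"obs": 1, "action": 1, "achieved_goal": 2, "goal": 5, "reward": -1.0},
]
hindsight = her_relabel(episode, reward_fn)
# The final transition now succeeds under the substituted goal.
```

Both the original and relabeled transitions go into the replay buffer, so an off-policy learner sees successes even when the policy never reaches the real goal.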
  14. Program Learning

    1. Extensions and Limitations of the Neural GPU
      1. https://arxiv.org/abs/1611.00736
  15. Representation Learning

    1. Variational Lossy Autoencoder
      1. https://arxiv.org/abs/1611.02731
  16. Variational Inference

    1. Improving Variational Inference with Inverse Autoregressive Flow
      1. https://arxiv.org/abs/1606.04934
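
The inverse autoregressive flow paper above improves the variational posterior by chaining invertible transformations with cheap log-determinants. A minimal sketch of one flow step (in the paper m and s come from an autoregressive network; here they are fixed toy vectors):

```python
import numpy as np

def iaf_step(z, m, s):
    """One inverse autoregressive flow step (arXiv:1606.04934):
    z' = s * z + m elementwise. With autoregressive m and s the
    Jacobian is triangular, so log|det J| = sum(log s) -- the term
    that corrects the density after the transformation."""
    return s * z + m, np.sum(np.log(s))

# Toy usage: stretching one dimension by 2 and shrinking the other by
# 0.5 gives a log-det correction of log 2 + log 0.5 = 0.
z = np.array([1.0, 2.0])
z_new, log_det = iaf_step(z, m=np.zeros(2), s=np.array([2.0, 0.5]))
```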
  17. Generative Models

    1. Generative Adversarial Networks
      1. InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets [Also, Representation Learning]
        1. https://arxiv.org/abs/1606.03657
      2. Improved Techniques for Training GANs
        1. https://arxiv.org/abs/1606.03498
    2. On the Quantitative Analysis of Decoder-Based Generative Models
      1. https://arxiv.org/abs/1611.04273
    3. A Connection between Generative Adversarial Networks, Inverse Reinforcement Learning, and Energy-Based Models [Also, Reinforcement Learning]
      1. https://arxiv.org/pdf/1611.03852.pdf
    4. PixelCNN++: Improving the PixelCNN with Discretized Logistic Mixture Likelihood and Other Modifications
      1. https://arxiv.org/abs/1701.05517
    5. Learning to Generate Reviews and Discovering Sentiment
      1. https://arxiv.org/abs/1704.01444
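
The GAN entries above train a generator and discriminator against each other on a minimax objective. A minimal sketch of the two losses (scalar probabilities stand in for discriminator outputs; the "non-saturating" generator loss is the variant used in practice):

```python
import numpy as np

def gan_losses(d_real, d_fake):
    """Original GAN objectives: the discriminator maximizes
    log D(x) + log(1 - D(G(z))); the generator, in its non-saturating
    form, maximizes log D(G(z)). Inputs here are the discriminator's
    probabilities on a real and a generated sample."""
    d_loss = -(np.log(d_real) + np.log(1.0 - d_fake))
    g_loss = -np.log(d_fake)
    return d_loss, g_loss

# At the theoretical equilibrium the discriminator outputs 0.5
# everywhere, and its loss settles at 2*log 2.
d_loss, g_loss = gan_losses(0.5, 0.5)
```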
  18. Evolution

    1. Evolution Strategies as a Scalable Alternative to Reinforcement Learning
      1. https://arxiv.org/abs/1703.03864
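
The evolution-strategies paper above estimates a policy gradient from black-box fitness evaluations alone. A minimal sketch of one update (the toy quadratic objective and hyperparameters are illustrative, not the paper's setup):

```python
import numpy as np

def es_step(theta, fitness, rng, npop=50, sigma=0.1, alpha=0.01):
    """One ES update (arXiv:1703.03864): perturb theta with Gaussian
    noise, weight each noise vector by the (standardized) fitness it
    obtained, and step in that direction -- no backpropagation needed."""
    eps = rng.normal(size=(npop, theta.size))
    returns = np.array([fitness(theta + sigma * e) for e in eps])
    returns = (returns - returns.mean()) / (returns.std() + 1e-8)
    grad = eps.T @ returns / (npop * sigma)
    return theta + alpha * grad

# Toy usage: maximize f(x) = -||x - target||^2, whose optimum is the
# target itself; theta drifts toward it over repeated updates.
target = np.array([0.5, -0.3])
fitness = lambda x: -np.sum((x - target) ** 2)
rng = np.random.default_rng(0)
theta = np.zeros(2)
for _ in range(300):
    theta = es_step(theta, fitness, rng)
```

Because each perturbed evaluation is independent, the paper parallelizes this across hundreds of workers that exchange only scalar returns and shared random seeds.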
  19. Applications

    1. Security / Safety
      1. Deep Reinforcement Learning from Human Preferences
        1. https://arxiv.org/abs/1706.03741
      2. Concrete Problems in AI Safety
        1. https://arxiv.org/abs/1606.06565
      3. Adversarial Attacks on Neural Network Policies
        1. https://arxiv.org/abs/1702.02284
      4. Adversarial Training Methods for Semi-Supervised Text Classification
        1. https://arxiv.org/abs/1605.07725
      5. Semi-Supervised Knowledge Transfer for Deep Learning from Private Training Data
        1. https://arxiv.org/abs/1610.05755
      6. AI Safety via Debate
        1. https://arxiv.org/pdf/1805.00899.pdf
    2. Robotics
      1. Domain Randomization for Transferring Deep Neural Networks from Simulation to the Real World
        1. https://arxiv.org/abs/1703.06907
      2. Transfer from Simulation to Real World through Learning Deep Inverse Dynamics Model
        1. https://arxiv.org/abs/1610.03518
  20. Environments

  21. Infrastructure for Deep Learning

    1. https://blog.openai.com/infrastructure-for-deep-learning/
  22. Universe

    1. https://blog.openai.com/universe/
  23. OpenAI Gym

    1. https://arxiv.org/abs/1606.01540
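
OpenAI Gym standardizes environments behind a small reset/step interface so any agent can be benchmarked on any task. A minimal self-contained sketch of that interface (the countdown dynamics are a made-up toy, not a real Gym environment):

```python
class CountdownEnv:
    """Toy environment following the Gym interface (arXiv:1606.01540):
    reset() -> observation, step(action) -> (observation, reward, done,
    info). The state counts down to zero; action 1 decrements it."""

    def reset(self):
        self.state = 3
        return self.state

    def step(self, action):
        if action == 1:
            self.state -= 1
        done = self.state == 0
        reward = 1.0 if done else 0.0
        return self.state, reward, done, {}

# Usage mirrors the standard Gym rollout loop.
env = CountdownEnv()
obs, done, total = env.reset(), False, 0.0
while not done:
    obs, reward, done, info = env.step(1)
    total += reward
```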

OpenAI Researchers

  1. Paul Christiano
  2. Ryan Lowe
  3. Jean Harb
  4. Pieter Abbeel
  5. Igor Mordatch
  6. Matthias Plappert
  7. Rein Houthooft
  8. Prafulla Dhariwal
  9. Szymon Sidor
  10. Richard Y. Chen
  11. Xi Chen
  12. Marcin Andrychowicz
  13. John Schulman
  14. Alec Radford
  15. Rafal Jozefowicz
  16. Yan Duan
  17. Bradly C. Stadie
  18. Jonathan Ho
  19. Jonas Schneider
  20. Ilya Sutskever
  21. Wojciech Zaremba
  22. Rachel Fong
  23. Josh Tobin
  24. Alex Ray
  25. Nikhil Mishra
  26. Ian Goodfellow
  27. Tim Salimans
  28. Diederik P. Kingma
  29. Andrej Karpathy
  30. Yuri Burda
  31. Zain Shah
  32. Trevor Blackwell
  33. Vicki Cheung

Salaries of top employees [Pg. 28]; hours and salaries of top employees [Pg. 7]. OpenAI spent $11 million in 2016, of which $7 million went to salaries. For comparison, DeepMind spent $138 million in 2016.


Source: Original Google Doc
