NeuroInspiration / Deep RL

Category: Machine Intelligence


Schedule / Agenda

  • Questions
  • Which neuroinspired ideas are underserved?
  • Which are misleading?
  • Relative safety?

Individual Notes

Todor Questions:

  • How should we be evaluating the value of marginal time spent engaging with the neuroscience literature? Something like new research ideas generated per hour of engagement?

Creative Thoughts:

Notes:

Jeremy Questions:

  • Which neuroinspired ideas are underserved?
  • Which are misleading?
  • Relative safety?
  • How can we tell when improvements are actually due to neuroinspiration, or due to technical insights that are re-branded as neuroinspiration?
    • Should the marginal research lab be founded around this idea?
  • Why is the function of actual neurons so underserved?
  • What parts of Neuroscience-Inspired AI do we disagree with?
    • / the summary.

Creative Thoughts:

Notes:

Conversation Notes

Marr’s levels of analysis.

Todor: The original convolution paper from the 80s has a funny property: it describes the convolution, but they also tried to make it biologically plausible, and many of those additions harmed performance. That hasn’t just happened with convolutions. People build in a bunch of things that are ‘how the brain does it’, and then somebody comes in and removes them.

Jeremy: The task has to contain the data structure that each part of the brain is using if you want the algorithm to be a good test of the value of that neural inspiration.

Jeremy: It’s also common that the way that something is implemented doesn’t capture what is happening in the brain.

Todor: I’m not any more bullish on neuroscience-inspired AI than on any other formal-science-inspired AI. I can list more neuroscience-inspired things than things inspired by any other field. My broad point is that the useful part of neuroscience here is probably as an idea generator. It’s probably not valuable to go in and try to make things as close to the brain as possible. There’s probably value in reading a bunch of neuroscience papers and getting a bunch of ideas; going beyond that level of detail probably isn’t super valuable. You’d get similar benefit if you chose to investigate most other hard technical fields: anything from theoretical math, real analysis, and probability to physics and electrical engineering. Actually, neuroscience is probably somewhat more useful than something like chemical engineering or bioengineering.

Jeremy: Marr’s levels would sharpen Todor’s point. Topics raised: neuroscience’s relative value vs. objective value; omnipresence in our representation; pattern discovery as fully general; comparisons with other technical fields; the efficient researcher hypothesis.

Todor: I don’t buy the efficient researcher hypothesis idea. But I do buy that there’s more overlap between DL and computational neuroscience than between DL and computational chemistry. Certain other subfields, like probability theory and optimization theory, have a comparable amount of overlap. In terms of public discussion, there’s more discussion of neuro-inspired AI.

Jeremy: There’s a bias toward looking for neuro-inspiration.

Todor: If you look at aero-astro engineering in the 1900s, there wasn’t very good theory underlying it. It seems totally possible that this will happen in 10 or 20 years, where people figure out the theory.

Todor: My guess about the paper is: here are some examples of neuro-inspired AI; it would be great if someone went and checked them.

Jeremy: Checking the neuroinspiration of Dropout?

Todor: You could just ask the author about what they were thinking of. You care about cases where a person had a useful idea because of neuroscience. You don’t care about the case where someone comes up with an idea through other means.

There’s a big difference between the initial generative process being neuro-inspired and going back after the fact.

Jeremy: Would you be happy with an ‘ideas per hour’ measurement of the value of a research literature? You could imagine an exercise where you spend one hour on each of seven fields over a week, counting the number and quality (for some subjective measure of quality) of the ideas.

Todor: There are bad effects from doing just one hour; you’d want to go up to 20 hours for each field. Over the course of a year you could explore 3-6 fields with initial projects. You could log the number of hours you spend and all of the ideas that you got. There’s a tradeoff between how many people you can get to do it and how much time you make them spend.

Todor: Researchers can’t introspect on their own thinking with enough fidelity to give you a good answer. I’ve asked senior researchers how they think about their research at the meta level, and I tend to get shit answers.

Jeremy: Meta-cognition isn’t developed in researchers, even over the course of years.

Jeremy: Fully general arguments around attention and memory.

Todor: It’s possible that you don’t have to think about memory explicitly, and that current models can do it well. But there’s no rigorous way of answering these questions. This feels like a kind of availability bias.

Todor: Did a survey of 40 researchers in multi-agent research.

Jeremy: I’ve been meaning to run a superforecasting tournament with researchers. It would be very interesting to see if any researchers have expert research judgement.

Todor: I expect that people will be reasonably okay at making predictions of the type ‘will the change be over or under x%?’. But if you ask them to forecast the percentage improvement or difference directly, they will be bad at it.

Jeremy: You could test this.

Todor: When you aggregate or do over/under, you probably do better than predicting a value.

Todor: You can discretize these.
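
The over/under idea above can be sketched as a scoring rule. This is a minimal illustration with made-up numbers: a binary ‘over x%’ forecast scored with the Brier rule, next to a point estimate scored by absolute error.

```python
# Sketch (hypothetical numbers): scoring a discretized over/under forecast
# versus a point estimate of a benchmark improvement.

def brier(prob_yes: float, outcome: bool) -> float:
    """Brier score for a binary forecast; 0 is perfect, 1 is worst."""
    return (prob_yes - (1.0 if outcome else 0.0)) ** 2

# Question: "will the improvement be over 5%?" (true improvement: 7%)
true_improvement = 7.0
threshold = 5.0
outcome = true_improvement > threshold

# A forecaster assigns 0.8 to "over 5%": scored with the Brier rule (~0.04).
over_under_score = brier(0.8, outcome)

# The same forecaster's point estimate (say 12%) is scored by absolute error.
point_error = abs(12.0 - true_improvement)

print(over_under_score, point_error)
```

The discretized question is easier to score and aggregate across forecasters than raw point estimates, which matches the over/under intuition in the notes.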

Jeremy: Prediction guided decision making has probably never been done.

Jeremy: Researchers stop generating ideas because they don’t have time.

Todor: They do still generate a lot more ideas than they can actually work on, and try to farm other ideas out.

Jeremy: What if their full time job was hypothesis generation.

Todor: I’d expect 5x to 100x in increased generation with full time.

Jeremy: I’m pretty confident that nobody has done this, but I don’t know why.

Todor: The cheap evaluation piece seems really important. I know you agree.

Todor: From all the things looking at the field as a whole, interpretability is the thing that people are not doing but should be doing. People don’t have the massive competitive disadvantage that they do in the other things.

Jeremy: Virtual brain analysis?

Todor: Yeah, that mapped to interpretability for me.

Todor: What kinds of images maximally activate things? (The Clarity team.)

Jeremy: How much does the interpretability work affect researcher decision making?

Todor: I don’t think it has changed anything super big so far. In safety and applications, one approach to making a language model not spit out racist things or Mein Kampf is to apply some of the Clarity team’s approaches, which tell you which parts of the activation space are doing what, and then mask out the things that you don’t want.
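
One simple version of the masking idea described above: if interpretability work identifies a direction in activation space associated with unwanted content, you can project activations onto the orthogonal complement of that direction. This is a sketch; the “unwanted direction” here is a random made-up vector, not anything identified by real interpretability work.

```python
import numpy as np

def project_out(activations: np.ndarray, direction: np.ndarray) -> np.ndarray:
    """Remove the component of each activation vector along `direction`."""
    d = direction / np.linalg.norm(direction)
    return activations - np.outer(activations @ d, d)

rng = np.random.default_rng(0)
acts = rng.normal(size=(4, 8))   # batch of activation vectors
bad_dir = rng.normal(size=8)     # hypothetical unwanted direction

masked = project_out(acts, bad_dir)
# After masking, activations have (numerically) zero component along bad_dir.
print(np.allclose(masked @ (bad_dir / np.linalg.norm(bad_dir)), 0.0))  # -> True
```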

Jeremy: Future prediction leads to counterfactual causal inference, as well as a more robust model.

Todor: Humans try to do this, and can do it in an intuitive way very quickly. In the limit, current RL systems can’t do it. In general, modeling causality is just a hard problem.

Jeremy: Training in sim?

Todor: This section makes the assumption that the AGI looks like a human brain to begin with. The framing of the question itself doesn’t make that much sense, given the way that neuro-inspiration works in the real world. In the real world, people implement ideas from neuroscience, then improve them in ways that may or may not be neuro-inspired. The end system kind of looks like a brain, but looks different at the lower level. If you condition on this picture, it doesn’t make sense to talk about its relative safety. It’s not as though the techniques won’t have points of contrast between them.

Todor: The relevant part for safety isn’t the neuro-inspired part; it is whether we hand-engineer things or have the system bootstrap itself.

From skimming this, it seems that a bunch of the things written up assume that this thing looks like a brain. That looks implausible. The things that don’t assume that would be true of all systems assembled from many different pieces.

Todor: The entire picture of safety with engineered stuff vs. bootstrapped stuff… The main question is whether you get to the bootstrapping system before or after general intelligence, and whether you want to get there before or after. You probably have to solve a lot of the same problems, but the timing ends up looking very different, which changes tactical decisions about what to spend resources on.

Jeremy: You may be able to avoid bootstrapping entirely if you don’t set it up by default.

Todor: Do you think that humans are recursively self improving?

Jeremy: They update their training data, not their architecture or optimizer.

Todor: It’s not clear to me that humans won’t recursively self improve as effectively as AGIs.

Jeremy: I think that there are big differences between biological intelligence and software.

Todor: The model that I find most plausible is that you spend a lot of time in a world with general AIs that are at the level of random humans, which can do a bunch of things okay but are not world class. And then you have specialized AIs that are good at specific things but aren’t good at anything else.

Jeremy’s Eval

Examples of previous success of neuro-inspiration:

  • Reinforcement Learning
    • Inspired by animal learning
    • TD Learning came out of animal behavior research.
    • Second-order conditioning (Conditional Stimulus) (Sutton and Barto, 1981)
  • Deep Learning.
    • Convolutional Neural Networks. Visual Cortex (V1)
      • Uses hierarchical structure (successive processing layers)
      • Neurons in the early visual system respond strongly to specific patterns of light (say, precisely oriented bars) but hardly respond to many other patterns.
      • Gabor functions describe the weights in V1 cells.
      • Nonlinear Transduction
      • Divisive Normalization
    • Word / Sentence Vectors - Distributed Embeddings
      • Parallel Distributed Processing in the brain for representation and computation
    • Dropout
      • Stochasticity in neurons that fire with Poisson-like statistics (Hinton, 2012)
  • Attention
    • Applying attention to memory
    • Thought - it doesn’t make much sense to train an attention model over a static image, rather than over a time series. With a time series, bringing attention to changing aspects of the input makes sense.
  • Multiple Memory Systems
    • Episodic Memory
      • Experience Replay
      • Especially for one shot experiences
    • Working Memory
      • LSTM - gating allows for conditioning on current state
    • Long-term Memory
      • External Memory
      • Gating in LSTM
  • Continual Learning
    • Elastic weight consolidation for slowing down learning on weights that are important for previous tasks.
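
The TD learning item above can be made concrete with a tiny sketch: TD(0) value prediction on a toy three-state chain, in the spirit of the prediction-learning ideas that came out of the animal-conditioning literature. All numbers are illustrative.

```python
# TD(0) on a 3-state chain: s0 -> s1 -> s2, reward 1.0 on the final step.

alpha, gamma = 0.1, 0.9
values = [0.0, 0.0, 0.0]               # V(s0), V(s1), V(s2)
episode = [(0, 0.0, 1), (1, 1.0, 2)]   # (state, reward, next_state)

for _ in range(200):
    for s, r, s_next in episode:
        td_error = r + gamma * values[s_next] - values[s]
        values[s] += alpha * td_error

# Predictions propagate backward: V(s1) approaches 1.0, V(s0) approaches 0.9.
print(values)
```

The backward propagation of the reward prediction is the same mechanism used to explain second-order conditioning: the earlier stimulus (state) inherits value from the later one.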

Examples of future success:

  • Intuitive Understanding of Physics
    • Need to understand space, number, objectness
    • Need to disentangle representations for transfer. (Dude, I feel so stolen from)
  • Efficient Learning (Learning from few examples)
  • Transfer Learning
    • Transferring generalized knowledge gained in one context to novel domains
    • Concept representations for transfer
      • No direct evidence of concept representations in brains
  • Imagination and Planning
    • Toward model-based RL
    • Internal model of the environment
      • Model needs to include compositional / disentangled representations for flexibility
    • Implementing a forecast-based method of action selection
    • Monte Carlo Tree Search as simulation-based planning
    • In rat brains, we observe ‘preplay’ where rats imagine the likely future experience - measured by comparing neural activations at preplay to activations during the activity
    • Generalization + Transfer in human planning
    • Hierarchical Planning
  • Virtual Brain Analytics
    • “However, by applying tools from neuroscience to AI systems, synthetic equivalents of single-cell recording, neuroimaging, and lesion techniques, we can gain insights into the key drivers of successful learning in AI research and increase the interpretability of these systems.”
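
The simulation-based planning items above can be sketched minimally. This toy version enumerates rollouts exhaustively in a perfect internal model rather than using Monte Carlo Tree Search; the environment (a 1-D line with the goal at position 5, actions -1 or +1) is made up.

```python
from itertools import product

def model(state, action):
    """A perfect internal model of the toy environment."""
    return state + action

def rollout_return(state, actions):
    """Simulate an action sequence and score it by closeness to the goal."""
    for a in actions:
        state = model(state, a)
    return -abs(state - 5)

def plan(state, horizon=5):
    """Pick the first action of the best simulated action sequence."""
    best_seq = max(product([-1, 1], repeat=horizon),
                   key=lambda seq: rollout_return(state, seq))
    return best_seq[0]

print(plan(0))  # -> 1: only five +1 steps reach the goal exactly
```

Replacing the exhaustive enumeration with sampled rollouts and a search tree recovers the MCTS-style planner the list refers to.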

Jeremy’s Adds:

  • Sparsity
  • Temporality

Todor’s Eval

Examples of previous success of neuro-inspiration:

  • Reinforcement Learning
    • Inspired by animal learning
    • TD Learning came out of animal behavior research.
    • Second-order conditioning (Conditional Stimulus) (Sutton and Barto, 1981)
  • Deep Learning.
    • Convolutional Neural Networks. Visual Cortex (V1)
      • Uses hierarchical structure (successive processing layers)
      • Neurons in the early visual system respond strongly to specific patterns of light (say, precisely oriented bars) but hardly respond to many other patterns.
      • Gabor functions describe the weights in V1 cells.
      • Nonlinear Transduction
      • Divisive Normalization
    • Word / Sentence Vectors - Distributed Embeddings
      • Parallel Distributed Processing in the brain for representation and computation
    • Dropout
      • Stochasticity in neurons that fire with Poisson-like statistics (Hinton, 2012)
  • Attention
    • Applying attention to memory
    • Thought - it doesn’t make much sense to train an attention model over a static image, rather than over a time series. With a time series, bringing attention to changing aspects of the input makes sense.
  • Multiple Memory Systems
    • Episodic Memory
      • Experience Replay
      • Especially for one shot experiences
    • Working Memory
      • LSTM - gating allows for conditioning on current state
    • Long-term Memory
      • External Memory
      • Gating in LSTM
  • Continual Learning
    • Elastic weight consolidation for slowing down learning on weights that are important for previous tasks.
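
The Dropout item above is easy to sketch. Note the hedge: the cited motivation is the stochasticity of real neurons, but the standard implementation uses a Bernoulli keep/drop mask (shown here, in inverted-dropout form) rather than anything literally Poisson.

```python
import numpy as np

def dropout(x: np.ndarray, p_drop: float, rng) -> np.ndarray:
    """Zero each unit with probability p_drop, rescaling survivors so the
    expected activation is unchanged (inverted dropout)."""
    keep = rng.random(x.shape) >= p_drop
    return x * keep / (1.0 - p_drop)

rng = np.random.default_rng(0)
x = np.ones(100_000)
y = dropout(x, p_drop=0.5, rng=rng)

# Roughly half the units are zeroed; the mean stays near 1 in expectation.
print(float(y.mean()))
```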

Examples of future success:

  • Intuitive Understanding of Physics
    • Need to understand space, number, objectness
    • Need to disentangle representations for transfer. (Dude, I feel so stolen from)
  • Efficient Learning (Learning from few examples)
  • Transfer Learning
    • Transferring generalized knowledge gained in one context to novel domains
    • Concept representations for transfer
      • No direct evidence of concept representations in brains
  • Imagination and Planning
    • Toward model-based RL
    • Internal model of the environment
      • Model needs to include compositional / disentangled representations for flexibility
    • Implementing a forecast-based method of action selection
    • Monte Carlo Tree Search as simulation-based planning
    • In rat brains, we observe ‘preplay’ where rats imagine the likely future experience - measured by comparing neural activations at preplay to activations during the activity
    • Generalization + Transfer in human planning
    • Hierarchical Planning
  • Virtual Brain Analytics
    • “However, by applying tools from neuroscience to AI systems, synthetic equivalents of single-cell recording, neuroimaging, and lesion techniques, we can gain insights into the key drivers of successful learning in AI research and increase the interpretability of these systems.”
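
A “synthetic lesion” in the sense of the quote above can be sketched in a few lines: silence one unit of a network and measure how much the output changes, as a crude importance estimate. The tiny network and its random weights are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
W1 = rng.normal(size=(4, 8))   # input -> hidden
W2 = rng.normal(size=(8, 1))   # hidden -> output

def forward(x, lesion_unit=None):
    h = np.maximum(x @ W1, 0.0)        # ReLU hidden layer
    if lesion_unit is not None:
        h = h.copy()
        h[:, lesion_unit] = 0.0        # "lesion": silence one hidden unit
    return h @ W2

x = rng.normal(size=(16, 4))
baseline = forward(x)
# Importance of each hidden unit = mean output change when it is silenced.
importance = [float(np.abs(forward(x, u) - baseline).mean()) for u in range(8)]
print(importance)
```

This is the synthetic analogue of a lesion study: units whose removal barely moves the output are, by this measure, less critical to the computation.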

NeuroInspiration Overview

Deep RL / Neuro-inspiration

  • Neuroscience Inspired Artificial Intelligence
  • DeepMind’s Path to Neuro-Inspired General Intelligence. At a high level, the idea is to understand the brain at a level of detail (e.g., Marr’s algorithmic level) that allows researchers to implement the brain’s functions as computational algorithms. Convolution is a front-and-center example of an algorithm with a neuroscientific basis. If each major module of the brain can be understood and satisfactorily implemented, their interaction will be sufficient for general problem solving.
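
Convolution, named above as the front-and-center example, can be sketched minimally: a 2-D convolution with a vertical-edge filter standing in for the oriented receptive fields found in V1. Purely illustrative.

```python
import numpy as np

def conv2d(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Valid (no-padding) 2-D cross-correlation, the CNN 'convolution'."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# An image that is dark on the left half and bright on the right.
image = np.zeros((5, 6))
image[:, 3:] = 1.0

# A 3x3 vertical-edge filter: responds strongly only at the boundary,
# like a V1 neuron tuned to one orientation.
kernel = np.array([[-1.0, 0.0, 1.0]] * 3)
response = conv2d(image, kernel)
print(response)
```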

Examples of previous success of neuro-inspiration:

  • Reinforcement Learning
    • Inspired by animal learning
    • TD Learning came out of animal behavior research.
    • Second-order conditioning (Conditional Stimulus) (Sutton and Barto, 1981)
  • Deep Learning.
    • Convolutional Neural Networks. Visual Cortex (V1)
      • Uses hierarchical structure (successive processing layers)
      • Neurons in the early visual system respond strongly to specific patterns of light (say, precisely oriented bars) but hardly respond to many other patterns.
      • Gabor functions describe the weights in V1 cells.
      • Nonlinear Transduction
      • Divisive Normalization
    • Word / Sentence Vectors - Distributed Embeddings
      • Parallel Distributed Processing in the brain for representation and computation
    • Dropout
      • Stochasticity in neurons that fire with Poisson-like statistics (Hinton, 2012)
  • Attention
    • Applying attention to memory
    • Thought - it doesn’t make much sense to train an attention model over a static image, rather than over a time series. With a time series, bringing attention to changing aspects of the input makes sense.
  • Multiple Memory Systems
    • Episodic Memory
      • Experience Replay
      • Especially for one shot experiences
    • Working Memory
      • LSTM - gating allows for conditioning on current state
    • Long-term Memory
      • External Memory
      • Gating in LSTM
  • Continual Learning
    • Elastic weight consolidation for slowing down learning on weights that are important for previous tasks.
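
The elastic weight consolidation item above amounts to a quadratic penalty that makes weights important to an old task stiff. A minimal sketch of that penalty, with made-up weights and made-up Fisher-information values:

```python
import numpy as np

def ewc_penalty(weights, old_weights, fisher, lam=1.0):
    """(lam / 2) * sum_i F_i * (w_i - w*_i)^2, the EWC-style penalty."""
    return 0.5 * lam * float(np.sum(fisher * (weights - old_weights) ** 2))

old_w = np.array([1.0, -2.0, 0.5])
fisher = np.array([10.0, 0.1, 0.0])   # first weight mattered on the old task

# Moving the important weight is penalized far more than moving the others.
cost_move_important = ewc_penalty(np.array([1.5, -2.0, 0.5]), old_w, fisher)
cost_move_unimportant = ewc_penalty(np.array([1.0, -1.5, 0.5]), old_w, fisher)
print(cost_move_important, cost_move_unimportant)
```

Adding this term to the new task’s loss is what “slowing down learning on weights that are important for previous tasks” means operationally.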

Examples of future success:

  • Intuitive Understanding of Physics
    • Need to understand space, number, objectness
    • Need to disentangle representations for transfer. (Dude, I feel so stolen from)
  • Efficient Learning (Learning from few examples)
  • Transfer Learning
    • Transferring generalized knowledge gained in one context to novel domains
    • Concept representations for transfer
      • No direct evidence of concept representations in brains
  • Imagination and Planning
    • Toward model-based RL
    • Internal model of the environment
      • Model needs to include compositional / disentangled representations for flexibility
    • Implementing a forecast-based method of action selection
    • Monte Carlo Tree Search as simulation-based planning
    • In rat brains, we observe ‘preplay’ where rats imagine the likely future experience - measured by comparing neural activations at preplay to activations during the activity
    • Generalization + Transfer in human planning
    • Hierarchical Planning
  • Virtual Brain Analytics

Relative Safety

Neuro-inspiration

This approach will likely involve the incremental combination of many valuable additions to a body of implemented functions. For example, combining an attention network with an episodic memory system with some world model.

  • Speed of Takeoff: Likely a medium-to-slow takeoff, as the parts have to be developed independently of one another and can be tested. Each part of this system comes into use slowly. There is some risk in a context where many modules are combined with each other, where a new interaction may lead to a faster takeoff. (Rainbow, Human-Level Performance in First-Person Multiplayer Games, Max)
  • Interpretability & Controllability: While the interactions between neural sub-systems may be complicated and have implications for the model’s interpretability and controllability, the fundamental building blocks are constructed by engineers and scientists who can reason about the components and their expected behavior.
  • Ease of Verification: Easier due to the slower takeoff; harder due to the brain’s parallel processing, complex temporal dynamics, and fast, uninterpretable processes.
  • Ease of Validation: Same as verification: easier due to the slower takeoff; harder due to the brain’s parallel processing, complex temporal dynamics, and fast, uninterpretable processes.

  • Likelihood of Reward Function Hacking: Capabilities may grow slowly enough to detect and eliminate this. There are plenty of examples of reward function hacking in existing systems built in this paradigm.
  • Likelihood of Treacherous Turn: Relatively high; learning strategic behavior / theory of mind will be on the neuro-inspired roadmap. Because there is time for this kind of agent to have continuous interactions with programmers, there is much more surface area for opportunities to learn deception, and for deception to have value.
  • Interaction with Competition: Allows for possible collaboration / coordination between competitors. Also allows competitors to react to visible progress on a time scale that makes hacking, military threats, economic embargoes, etc. relevant.
  • Power of System at Sub-General-Intelligence Level: Could be quite high, solving perception and language tasks that allow for the automation and creation of important intellectual work.
  • Difficulty of Value Alignment: Relatively easy, given similarities to human cognition. The potential to intentionally slow or halt progress while alignment is made more certain improves the chances of value alignment.
  • Robustness to Distributional Shift, Small Alterations, Hacking, Hardware Faults, Software Bugs, Changes in Scale, and Adversaries: Adversarial examples exist. Bugs are likely. Plenty of opportunities for failure on all fronts, with greater time for hacking.
  • Probability of Creating a General Intelligence: High to medium-high that an approach like this eventually succeeds.


Source: Original Google Doc
