17-08-07 Deep Problems with Machine Learning
Category: Idea Lists (Upon Request)
- Overemphasis on unsupervised learning, due to conflating "humans manually label the data" with the mathematics of supervised learning (a prediction target can come from the data itself, with no human labeler)
- Prediction of future input states, hierarchically across time, allows for model-based, common-sense-aware counterfactual reasoning for planning and causal learning
- The fact that gradient boosting over decision trees, which merely captures discontinuities without capturing continuous structure or generalizing non-locally, is state of the art on nearly all datasets that don't have compositional structure is a strong indictment of the state of machine learning.
- Representations in NLP are the heuristic result of thinking about context / co-occurrence
- Transfer learning is stuck on homogeneous data (pixels, frequencies, words)
- Absence of metadata-informed transfer learning over heterogeneous data
- Representations are undissected, and so transfer is crude
- Lack of emphasis on causal reasoning: anticausal problems (predicting causes from their effects) are everywhere
- RL doesn't model the world (model-free)
- Model-free natural language processing: it doesn't model the underlying causes of language, just the language itself
- Data is framed as 2D, in the one-vector-per-datapoint style
- Inability to do hierarchical learning
- Algorithms' input paradigm is fixed
- The data we collect is a function of the data we anticipate we can use, which depends on our tools
- Lack of quality benchmarks outside of vision (say, NLP and the bAbI tasks)
- Time-series data is pigeonholed into the 2D representation because of the way we have structured our algorithms
- The fact that evolutionary methods are close in performance to reinforcement learning is a strong indictment of our ability to learn in those environments.
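On the gradient-boosting point: a toy sketch (my own illustration, not from the doc) of boosted regression stumps fit to a perfectly linear target. In-range predictions are accurate, but because trees are piecewise-constant, the model extrapolates as a flat line outside the training range:

```python
# Hand-rolled gradient boosting with depth-1 trees (stumps) on y = 2x.
# Trees only capture discontinuities, so nothing generalizes past the data.

def fit_stump(xs, residuals):
    """Find the split threshold minimizing squared error; return (t, left_mean, right_mean)."""
    best = None
    for t in xs:
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        if not left or not right:
            continue
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    return best[1:]

def boost(xs, ys, rounds=200, lr=0.5):
    """Standard L2 boosting: each stump fits the current residuals, scaled by lr."""
    stumps, preds = [], [0.0] * len(xs)
    for _ in range(rounds):
        t, lm, rm = fit_stump(xs, [y - p for y, p in zip(ys, preds)])
        stumps.append((t, lr * lm, lr * rm))
        preds = [p + (lr * lm if x <= t else lr * rm) for x, p in zip(xs, preds)]
    return stumps

def predict(stumps, x):
    return sum(lm if x <= t else rm for t, lm, rm in stumps)

xs = [i / 2 for i in range(21)]        # training inputs 0.0 .. 10.0
ys = [2 * x for x in xs]               # perfectly linear, continuous target
model = boost(xs, ys)

in_range = predict(model, 7.3)         # accurate: close to 2 * 7.3
out_of_range = predict(model, 20.0)    # flat: stuck near the max training y, not 40
```

Off-the-shelf GBDT libraries behave the same way outside the training range; the hand-rolled stumps just make the mechanism visible.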
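On the model-free items: a minimal tabular Q-learning sketch (my own illustration; the corridor task and all constants are invented for the example). The update touches only sampled (s, a, r, s') tuples; nothing in the code represents transition probabilities or could answer a counterfactual about the environment:

```python
import random

# Tabular Q-learning on a 5-state corridor: start at state 0, reward 1 at state 4.
# Action 0 moves left, action 1 moves right; the agent never builds a world model.

N_STATES, GOAL = 5, 4
Q = [[0.0, 0.0] for _ in range(N_STATES)]     # Q[state][action]
rng = random.Random(0)
alpha, gamma, eps = 0.5, 0.9, 0.2

for _ in range(500):                           # episodes
    s = 0
    while s != GOAL:
        # epsilon-greedy action selection
        a = rng.randrange(2) if rng.random() < eps else max((0, 1), key=lambda a: Q[s][a])
        s2 = max(0, s - 1) if a == 0 else min(GOAL, s + 1)
        r = 1.0 if s2 == GOAL else 0.0
        target = r if s2 == GOAL else r + gamma * max(Q[s2])
        Q[s][a] += alpha * (target - Q[s][a])  # uses only (s, a, r, s'), no model of P(s'|s,a)
        s = s2

greedy = [max((0, 1), key=lambda a: Q[s][a]) for s in range(N_STATES)]  # learned policy
```

The learned greedy policy moves right from every non-goal state, and the values converge to the discounted returns, all without the agent ever being able to predict what a given action does.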
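And on the last point: a bare-bones evolution strategy (illustrative code; the toy task, `evolve`, and every constant are my own) that treats the policy as a black box scored only by whole-episode return. No value function, no gradient through the environment, no world model:

```python
import random

def episode_return(w, b, target=5.0, steps=20):
    """Deterministic 1-D 'reach the target' task; return is negative total distance."""
    p, total = 0.0, 0.0
    for _ in range(steps):
        a = 1.0 if w * (target - p) + b > 0 else -1.0   # linear threshold policy
        p += a
        total -= abs(target - p)
    return total

def evolve(generations=40, pop=20, sigma=0.5, seed=0):
    rng = random.Random(seed)
    # coarse init: best of 50 random parameter draws, scored by return alone
    w, b = max(((rng.gauss(0, 2), rng.gauss(0, 2)) for _ in range(50)),
               key=lambda c: episode_return(*c))
    for _ in range(generations):
        # Gaussian perturbations of the incumbent, plus the incumbent itself (elitism)
        cands = [(w + rng.gauss(0, sigma), b + rng.gauss(0, sigma)) for _ in range(pop)]
        cands.append((w, b))
        w, b = max(cands, key=lambda c: episode_return(*c))
    return w, b

w, b = evolve()
final_return = episode_return(w, b)    # far better than any blind policy
```

That something this crude is competitive with gradient-based RL in many environments is exactly the indictment the item above makes.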
Source: Original Google Doc