17-08-07 Deep Problems with Machine Learning
Category: Idea Lists (Upon Request)
- Overemphasis on unsupervised learning, due to conflating "humans manually label the data" with the mathematics of supervised learning (a prediction target can come from the data itself, with no human labeler)
- Prediction of future input states, hierarchically across time, allows for model-based, common-sense-aware counterfactual reasoning for planning and causal learning
- The fact that gradient boosting over decision trees, which merely captures discontinuities without capturing continuous structure or generalizing non-locally, is state of the art on nearly all datasets that don't have compositional structure is a strong indictment of the state of machine learning.
- Representations in NLP are the heuristic result of thinking about context / co-occurrence
- Transfer learning is stuck on homogeneous data (pixels, frequencies, words)
- Absence of metadata-informed transfer learning over heterogeneous data
- Representations are undissected, and so transfer is crude
- Lack of emphasis on causal reasoning: anticausal problems (predicting causes from their effects) are everywhere
- RL doesn't model the world (model-free)
- Model-free natural language processing: it doesn't model the underlying causes of language, just the language itself
- Data is framed as 2D, in the one-vector-per-datapoint style
- Inability to do hierarchical learning
- Algorithms' input paradigm is fixed
- The data we collect is a function of the data we anticipate we can use, which depends on our tools
- Lack of quality benchmarks outside of vision (say, NLP and the bAbI tasks)
- Time-series data is pigeonholed into the 2D representation because of the way we have structured our algorithms
- The fact that evolutionary methods are close in performance to reinforcement learning is a strong indictment of our ability to learn in those environments.
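On the gradient-boosting point: a toy sketch (my own illustration, not from the doc) of boosted regression stumps fit to a perfectly linear target. In-range predictions are accurate, but because trees are piecewise-constant, the model extrapolates as a flat line outside the training range:

```python
# Hand-rolled gradient boosting with depth-1 trees (stumps) on y = 2x.
# Trees only capture discontinuities, so nothing generalizes past the data.

def fit_stump(xs, residuals):
    """Find the split threshold minimizing squared error; return (t, left_mean, right_mean)."""
    best = None
    for t in xs:
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        if not left or not right:
            continue
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    return best[1:]

def boost(xs, ys, rounds=200, lr=0.5):
    """Standard L2 boosting: each stump fits the current residuals, scaled by lr."""
    stumps, preds = [], [0.0] * len(xs)
    for _ in range(rounds):
        t, lm, rm = fit_stump(xs, [y - p for y, p in zip(ys, preds)])
        stumps.append((t, lr * lm, lr * rm))
        preds = [p + (lr * lm if x <= t else lr * rm) for x, p in zip(xs, preds)]
    return stumps

def predict(stumps, x):
    return sum(lm if x <= t else rm for t, lm, rm in stumps)

xs = [i / 2 for i in range(21)]        # training inputs 0.0 .. 10.0
ys = [2 * x for x in xs]               # perfectly linear, continuous target
model = boost(xs, ys)

in_range = predict(model, 7.3)         # accurate: close to 2 * 7.3
out_of_range = predict(model, 20.0)    # flat: stuck near the max training y, not 40
```

Off-the-shelf GBDT libraries behave the same way outside the training range; the hand-rolled stumps just make the mechanism visible.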
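On the model-free items: a minimal tabular Q-learning sketch (my own illustration; the corridor task and all constants are invented for the example). The update touches only sampled (s, a, r, s') tuples; nothing in the code represents transition probabilities or could answer a counterfactual about the environment:

```python
import random

# Tabular Q-learning on a 5-state corridor: start at state 0, reward 1 at state 4.
# Action 0 moves left, action 1 moves right; the agent never builds a world model.

N_STATES, GOAL = 5, 4
Q = [[0.0, 0.0] for _ in range(N_STATES)]     # Q[state][action]
rng = random.Random(0)
alpha, gamma, eps = 0.5, 0.9, 0.2

for _ in range(500):                           # episodes
    s = 0
    while s != GOAL:
        # epsilon-greedy action selection
        a = rng.randrange(2) if rng.random() < eps else max((0, 1), key=lambda a: Q[s][a])
        s2 = max(0, s - 1) if a == 0 else min(GOAL, s + 1)
        r = 1.0 if s2 == GOAL else 0.0
        target = r if s2 == GOAL else r + gamma * max(Q[s2])
        Q[s][a] += alpha * (target - Q[s][a])  # uses only (s, a, r, s'), no model of P(s'|s,a)
        s = s2

greedy = [max((0, 1), key=lambda a: Q[s][a]) for s in range(N_STATES)]  # learned policy
```

The learned greedy policy moves right from every non-goal state, and the values converge to the discounted returns, all without the agent ever being able to predict what a given action does.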
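And on the last point: a bare-bones evolution strategy (illustrative code; the toy task, `evolve`, and every constant are my own) that treats the policy as a black box scored only by whole-episode return. No value function, no gradient through the environment, no world model:

```python
import random

def episode_return(w, b, target=5.0, steps=20):
    """Deterministic 1-D 'reach the target' task; return is negative total distance."""
    p, total = 0.0, 0.0
    for _ in range(steps):
        a = 1.0 if w * (target - p) + b > 0 else -1.0   # linear threshold policy
        p += a
        total -= abs(target - p)
    return total

def evolve(generations=40, pop=20, sigma=0.5, seed=0):
    rng = random.Random(seed)
    # coarse init: best of 50 random parameter draws, scored by return alone
    w, b = max(((rng.gauss(0, 2), rng.gauss(0, 2)) for _ in range(50)),
               key=lambda c: episode_return(*c))
    for _ in range(generations):
        # Gaussian perturbations of the incumbent, plus the incumbent itself (elitism)
        cands = [(w + rng.gauss(0, sigma), b + rng.gauss(0, sigma)) for _ in range(pop)]
        cands.append((w, b))
        w, b = max(cands, key=lambda c: episode_return(*c))
    return w, b

w, b = evolve()
final_return = episode_return(w, b)    # far better than any blind policy
```

That something this crude is competitive with gradient-based RL in many environments is exactly the indictment the item above makes.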
Source: Original Google Doc