Applied Predictive Modeling
Category: Books
APM delivers a high-level understanding of many critical algorithms, the contexts in which they’re useful and the trade-offs that they make. Its emphasis on the strengths and weaknesses of each approach, their comparative performance and the practical aspects of data transformation keeps the book grounded in application.
This is by far the most useful modeling book that I’ve encountered. Much of this is because I was in a position to get a lot out of it. Mathematically it rarely goes into detail, preferring to explain the assumptions in clear English. This made for faster reading and also a faster, though less rigorous, understanding of each technique. That said, having all of the right pointers from the problems or structure in a dataset to the proper algorithm is invaluable. Texts like Bishop that rely on explaining with the math make information that could be communicated quickly and simply quite difficult to parse.
The book also exposed me to a number of important methods that I was completely unaware of (ROC curves, MARS, Partial Least Squares).
The first few chapters on preprocessing are also extremely useful. Despite having gone through courses on the topic, I hadn’t appreciated how many algorithms need skewness accounted for and the data normalized and scaled. And while I knew this was important for algorithms that rely on distance metrics (k-means and kNN, for example), I didn’t know it was necessary for numerical stability in nearly as many algorithms as it is. The Box-Cox transformation is extremely useful for correcting skew, whereas in classes I was just taught to visualize the data and play around with log or sqrt transformations until something looked good.
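To make the contrast with trial-and-error log or sqrt transformations concrete, here is a minimal sketch of a Box-Cox transformation followed by centering and scaling. It is an illustration under stated assumptions, not the book's own code: the function names, the simulated log-normal data and the grid of candidate λ values from -2 to 2 are all choices made for this example, and λ is picked by maximizing the usual Box-Cox profile log-likelihood over that grid.

```python
import math
import random
import statistics


def skewness(values):
    """Sample skewness: third central moment divided by variance ** 1.5."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def box_cox(values, lam):
    """Box-Cox transform of strictly positive values for a given lambda."""
    if abs(lam) < 1e-12:
        # lambda = 0 is defined as the natural log
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


def box_cox_log_likelihood(values, lam):
    """Profile log-likelihood of lambda, up to an additive constant."""
    n = len(values)
    variance = statistics.pvariance(box_cox(values, lam))
    log_sum = sum(math.log(v) for v in values)
    return -0.5 * n * math.log(variance) + (lam - 1.0) * log_sum


def estimate_lambda(values, grid=None):
    """Pick the lambda on the grid with the highest log-likelihood."""
    if grid is None:
        # Assumed search grid: -2.0 to 2.0 in steps of 0.1
        grid = [i / 10 for i in range(-20, 21)]
    return max(grid, key=lambda lam: box_cox_log_likelihood(values, lam))


def center_and_scale(values):
    """Subtract the mean and divide by the sample standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


# Simulated right-skewed predictor (an assumption for illustration only)
random.seed(0)
raw = [random.lognormvariate(0.0, 1.0) for _ in range(500)]

lam = estimate_lambda(raw)
processed = center_and_scale(box_cox(raw, lam))

print(f"estimated lambda: {lam:.1f}")
print(f"skewness before:  {skewness(raw):.2f}")
print(f"skewness after:   {skewness(processed):.2f}")
```

Because the simulated data is log-normal, the estimated λ should land near 0, which corresponds to a plain log transformation. The point of the procedure is that the data chooses the transformation rather than the analyst eyeballing plots. Box-Cox only applies to strictly positive values, so a predictor with zeros or negatives would need a different transformation.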
The detail on the upsides and downsides of each model, as well as the insight into how each algorithm functions, was extremely useful. The inner workings of a random forest or a boosted model are explained in wonderful detail without being confusing or overly technical. Kuhn and Johnson’s explanations take concepts that could be extremely difficult to grasp and frame them in terms that are accessible.
Of all the machine learning books that I’ve read (Bishop’s Pattern Recognition and Machine Learning, Intro to Statistical Learning, Elements of Statistical Learning), this book did the best by far on the time investment vs outcome ratio. Highly recommend.