Sunday, March 11, 2012

Machine Learning for Hackers book review

Buy this book.  I got one for free via the O'Reilly review program,
but I'll probably buy a paper copy, just so that I can mark it
up and loan it out to others.

This book is everything it is advertised to be.  It has enough of
a statistics refresher to bring the average hacker up to speed,
and then it dives right in, using R as the language of choice to
cover several common machine learning tasks.

It's not a gentle introduction to R, but code samples are
carefully explained (be prepared to look at R's documentation if
you aren't familiar with R, though).  The book doesn't teach R
programming, but it does cover several useful libraries for
machine learning (including mining textual data).  The authors also give
good presentation advice (e.g., they point out that a little
extra time given to the presentation can make the difference between
an amateurish presentation and a professional one, and they show
the difference).

Two items deserve special note:  first, while the book was in press,
the API used to generate the data for the chapter on analyzing social
graphs was removed, and the authors had to make a decision to either go
with the existing data, or wait and see what new APIs were made available.
The authors chose to provide their sample data and go with their
example rather than wait.  That was a great choice, as developers have
to deal with the real world, where vendors remove and change APIs.  I was
impressed at how they handled that issue.

The second item is not-so-great: the section on the Support Vector Machine
was too cursory.  It read like an editor or reviewer had said "hey, you
should mention SVM," and so the authors added a section.  But that material
was not given the same level of treatment as other contents, and,
as a result, the book stops on a somewhat off note.  A better
choice would have been to simply skip that section completely.

Overall, though, this is a great book.  It's hands-on, filled with
useful and interesting examples and advice, and it will get you moving
towards solving your own machine learning problems.

[Disclaimer: I got this book for free as part of the O'Reilly blogger review
program.  I was not required to write a positive review. The opinions I have
expressed are my own. I am disclosing this in accordance with the Federal
Trade Commission’s 16 CFR, Part 255 : “Guides Concerning the Use of
Endorsements and Testimonials in Advertising.”]
