Reference Reading

Here we provide references to books and courses about data analysis in general, which might also be helpful in the context of Mahout.

General Background Materials

Don't be overwhelmed by all the maths; you can do a lot in Mahout with some basic knowledge. The books will help you understand your data better and ask better questions, both of Mahout's APIs and of the Mahout community. And unlike learning a particular software tool, these are skills that will remain useful decades later.

Some good introductory alternatives here are:

Once you have a grasp of the basics, there is a slew of great texts that you might consult:

For statistics related to machine learning, these are particularly helpful:

For matrix computations/decomposition/factorization etc.:

  • Peter V. O'Neil, Introduction to Linear Algebra. A great book for beginners (with some knowledge of calculus). It is not comprehensive, but it is a good place to start, and the author begins by explaining the concepts in terms of vector spaces, which I found to be a more natural way of explaining them.
  • David S. Watkins, Fundamentals of Matrix Computations
  • Matrix Computations is the classic text for numerical linear algebra. You can't go wrong with it, and it is great for researchers.
  • Nick Trefethen's Numerical Linear Algebra. It's a bit more approachable for practitioners. It has many chapters on the SVD, and even chapters on Lanczos.

Books specifically on R:

In addition, you should see how to plot data well: