**[AcockStavig1979]** Alan C. Acock and Gordon R. Stavig, A Measure of Association for Nonparametric Statistics, Social Forces, Oxford University Press, Volume 57, Number 4, June, 1979, 1381--1386.

**[AgrawalKSX2002]** Rakesh Agrawal and Jerry Kiernan and Ramakrishnan Srikant and Yirong Xu, Hippocratic Databases, Proceedings of the 28th International Conference on Very Large Data Bases (VLDB 2002), Hong Kong, China, August 20--23, 2002, 143--154.

**[Agresti2002]** Alan Agresti, Categorical Data Analysis, Second Edition, Wiley Series in Probability and Statistics, Wiley-Interscience 2002, 710.

**[AloiseDHP2009]** Daniel Aloise and Amit Deshpande and Pierre Hansen and Preyas Popat, NP-hardness of Euclidean Sum-of-squares Clustering, Machine Learning, Kluwer Academic Publishers, Volume 75, Number 2, May, 2009, 245--248.

**[ArthurVassilvitskii2007]** David Arthur and Sergei Vassilvitskii, k-means++: The Advantages of Careful Seeding, Proceedings of the 18th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2007), January 7--9, 2007, New Orleans, LA, USA, 1027--1035.

**[Breiman2001]** L. Breiman. Random forests. Machine Learning, 45(1):5–32, 2001.

**[BreimanFOS1984]** L. Breiman, J. H. Friedman, R. A. Olshen, and C. J. Stone. Classification and Regression Trees. Wadsworth, 1984.

**[Chapelle2007]** Olivier Chapelle, Training a Support Vector Machine in the Primal, Neural Computation, 2007.

**[Cochran1954]** William G. Cochran, Some Methods for Strengthening the Common $\chi^2$ Tests, Biometrics, Volume 10, Number 4, December 1954, 417--451.

**[Collett2003]** D. Collett. Modelling Survival Data in Medical Research, Second Edition. Chapman & Hall/CRC Texts in Statistical Science. Taylor & Francis, 2003.

**[Gill2000]** Jeff Gill, Generalized Linear Models: A Unified Approach, Sage University Papers Series on Quantitative Applications in the Social Sciences, Number 07-134, 2000, Sage Publications, 101.

**[Hartigan1975]** John A. Hartigan, Clustering Algorithms, John Wiley & Sons Inc., Probability and Mathematical Statistics, April 1975, 365.

**[Hsieh2008]** C-J Hsieh, K-W Chang, C-J Lin, S. S. Keerthi and S. Sundararajan, A Dual Coordinate Descent Method for Large-scale Linear SVM, International Conference on Machine Learning (ICML), 2008.

**[Lin2008]** Chih-Jen Lin and Ruby C. Weng and S. Sathiya Keerthi, Trust Region Newton Method for Large-Scale Logistic Regression, Journal of Machine Learning Research, April, 2008, Volume 9, 627--650.

**[McCallum1998]** A. McCallum and K. Nigam, A Comparison of Event Models for Naive Bayes Text Classification, AAAI-98 Workshop on Learning for Text Categorization, 1998.

**[McCullagh1989]** Peter McCullagh and John Ashworth Nelder, Generalized Linear Models, Second Edition, Monographs on Statistics and Applied Probability, Number 37, 1989, Chapman & Hall/CRC, 532.

**[Nelder1972]** John Ashworth Nelder and Robert William Maclagan Wedderburn, Generalized Linear Models, Journal of the Royal Statistical Society, Series A (General), 1972, Volume 135, Number 3, 370--384.

**[Nocedal1999]** J. Nocedal and S. J. Wright, Numerical Optimization, Springer-Verlag, 1999.

**[Nocedal2006]** Jorge Nocedal and Stephen Wright, Numerical Optimization, Second Edition, Springer Series in Operations Research and Financial Engineering, Springer, 2006, 664.

**[PandaHBB2009]** B. Panda, J. Herbach, S. Basu, and R. J. Bayardo. PLANET: massively parallel learning of tree ensembles with mapreduce. PVLDB, 2(2):1426–1437, 2009.

**[Russell2009]** S. Russell and P. Norvig, Artificial Intelligence: A Modern Approach, Prentice Hall, 2009.

**[Scholkopf1995]** B. Schölkopf, C. Burges and V. Vapnik, Extracting Support Data for a Given Task, International Conference on Knowledge Discovery and Data Mining (KDD), 1995.

**[Stevens1946]** Stanley Smith Stevens, On the Theory of Scales of Measurement, Science, June 7, 1946, Volume 103, Number 2684, 677--680.

**[Vetterling1992]** W. T. Vetterling and B. P. Flannery, Multidimensions in Numerical Recipes in C: The Art of Scientific Computing, W. H. Press and S. A. Teukolsky (eds.), Cambridge University Press, 1992.

**[ZhouWSP08]** Y. Zhou, D. M. Wilkinson, R. Schreiber, and R. Pan. Large-scale parallel collaborative filtering for the Netflix prize. In Algorithmic Aspects in Information and Management, 4th International Conference, AAIM 2008, Shanghai, China, June 23-25, 2008. Proceedings, pages 337–348, 2008.