SystemDS Documentation

SystemDS is a flexible, scalable machine learning system. SystemDS's distinguishing characteristics are:

  1. Algorithm customizability via R-like and Python-like languages.
  2. Multiple execution modes, including Spark MLContext, Spark Batch, Standalone, and JMLC.
  3. Automatic optimization based on data and cluster characteristics to ensure both efficiency and scalability.

This version of SystemDS supports: Java 8+, Python 3.5+, Hadoop 2.6+ (not 3.x), Spark 2.1+ (not 3.x), Nvidia CUDA 10.2 (CuDNN 7.x), and Intel MKL (<=2019.x).
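The version ranges above can be checked mechanically before setting up an environment. The following is a minimal sketch, not part of SystemDS: the names `SUPPORTED`, `parse_version`, and `is_supported` are hypothetical, and only the Python, Hadoop, and Spark ranges are encoded. Java is left out on the assumption that legacy version strings such as "1.8" would need special handling.

```python
import re

# Ranges copied from the support statement above. Each entry is
# (inclusive minimum, exclusive upper bound or None for "no upper bound").
# This table and the helpers below are illustrative, not a SystemDS API.
SUPPORTED = {
    "python": ((3, 5), None),   # Python 3.5+
    "hadoop": ((2, 6), (3,)),   # Hadoop 2.6+, but not 3.x
    "spark": ((2, 1), (3,)),    # Spark 2.1+, but not 3.x
}


def parse_version(text):
    """Turn a string such as '2.4.5-SNAPSHOT' into the tuple (2, 4, 5)."""
    return tuple(int(n) for n in re.findall(r"\d+", text)[:3])


def is_supported(component, version_text):
    """Return True if the version falls inside the listed range."""
    minimum, upper = SUPPORTED[component]
    version = parse_version(version_text)
    if version < minimum:
        return False
    # Tuples compare element by element, so (3, 0, 0) is not below (3,).
    return upper is None or version < upper


if __name__ == "__main__":
    for component, version in [("spark", "2.4.5"), ("spark", "3.0.0"),
                               ("hadoop", "2.5.1"), ("python", "3.8.10")]:
        status = "supported" if is_supported(component, version) else "not supported"
        print(component, version, status)
```

Encoding the upper bound as an exclusive tuple keeps the "not 3.x" rule in the same comparison as the minimum, so no separate major-version check is needed.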

Links

Various forms of documentation for SystemDS are available.

  • The DML Language Reference lists the operations available inside SystemDS.
  • Builtin Functions contains a collection of builtin functions providing a high-level abstraction over complex machine learning algorithms.
  • Algorithm Reference contains specifics on the algorithms supported in SystemDS.
  • Entity Resolution provides a collection of customizable entity resolution primitives and pipelines.
  • Run SystemDS contains a Hello World example along with an environment setup guide.
  • Python Documentation contains instructions for using SystemDS from Python.
  • The JavaDOC contains internal documentation of the system source code.
  • Install from Source guides you through setup, from git download to a running system.
  • Contributing is the place to look if you want to contribute.
  • R to DML walks through the basics of converting a script from R to DML.