Apache SystemML (incubating)
SystemML is now an Apache Incubator project! Please see the Apache SystemML
(incubating) website for more information. The latest project documentation
can be found at the SystemML Documentation website on GitHub.
SystemML is a flexible, scalable machine learning system. SystemML's
distinguishing characteristics are:
1. Algorithm customizability via R-like and Python-like languages.
2. Multiple execution modes, including Standalone, Spark Batch, Spark
MLContext, Hadoop Batch, and JMLC.
3. Automatic optimization based on data and cluster characteristics to ensure
both efficiency and scalability.
SystemML in Standalone Mode
Standalone mode can be run on a single machine in a non-Hadoop environment,
allowing data scientists to develop algorithms locally without need of a
distributed cluster. The Standalone release packages all required libraries
into a single distribution. Standalone mode is not appropriate for large
datasets.
OS X and Linux users can use the corresponding shell script to run in
Standalone mode, while Windows users can use the runStandaloneSystemML.bat
script.
Hello World Example
The following example will run a "hello world" DML script on SystemML in
Standalone mode.
$ echo 'print("hello world");' > helloworld.dml
$ ./ helloworld.dml
Running SystemML Algorithms
Several existing algorithms can be found in the scripts directory in the
Standalone distribution. In the following example, we first obtain Haberman's
Survival Data Set and create a metadata file for it. We then create a
types.csv file that describes the type of each column, along with its own
metadata file. Finally, we run the Univariate Statistics algorithm on the
data in Standalone mode. The results are written to the data/univarOut.mtx
file.
$ wget -P data/
$ echo '{"rows": 306, "cols": 4, "format": "csv"}' > data/
$ echo '1,1,1,2' > data/types.csv
$ echo '{"rows": 1, "cols": 4, "format": "csv"}' > data/types.csv.mtd
$ ./ scripts/algorithms/Univar-Stats.dml -nvargs X=data/ TYPES=data/types.csv STATS=data/univarOut.mtx
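The metadata files above hard-code the row and column counts. As a minimal sketch, the same JSON can be derived from the CSV file itself. The sample file name and its contents are hypothetical, and the sketch assumes a headerless CSV that ends with a newline and has no quoted commas.

```shell
#!/bin/sh
# Sketch: derive a .mtd metadata file from a headerless CSV file.
# The file name and sample values below are hypothetical.
set -eu

workdir=$(mktemp -d)
csv="$workdir/sample.csv"

# Three rows, four comma-separated columns, no header line.
printf '1,2,3,4\n5,6,7,8\n9,10,11,12\n' > "$csv"

# Count rows (lines) and columns (fields in the first line).
# tr strips the padding that some wc implementations emit.
rows=$(wc -l < "$csv" | tr -d ' ')
cols=$(head -n 1 "$csv" | awk -F',' '{ print NF }')

# Write the metadata next to the data, in the same JSON shape used above.
printf '{"rows": %d, "cols": %d, "format": "csv"}\n' "$rows" "$cols" > "$csv.mtd"

cat "$csv.mtd"
```

Deriving the counts this way keeps the metadata consistent with the data if the CSV file is later replaced or truncated.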
For more information, please see the online SystemML documentation.