| <!-- |
| * Licensed to the Apache Software Foundation (ASF) under one |
| * or more contributor license agreements. See the NOTICE file |
| * distributed with this work for additional information |
| * regarding copyright ownership. The ASF licenses this file |
| * to you under the Apache License, Version 2.0 (the |
| * "License"); you may not use this file except in compliance |
| * with the License. You may obtain a copy of the License at |
| * |
| * http://www.apache.org/licenses/LICENSE-2.0 |
| * |
| * Unless required by applicable law or agreed to in writing, |
| * software distributed under the License is distributed on an |
| * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY |
| * KIND, either express or implied. See the License for the |
| * specific language governing permissions and limitations |
| * under the License. |
| --> |
| |
| <!-- |
| `mvn javadoc:javadoc` will compile the javadocs in the folder target/site/apidocs |
| --> |
| <html> |
| <body> |
| <h1>SystemDS Architecture</h1> |
| Algorithms in Apache SystemDS are written in a high-level R-like language called Declarative Machine learning Language (DML) |
| or a high-level Python-like language called PyDML. |
| SystemDS compiles and optimizes these algorithms into hybrid runtime |
| plans of multi-threaded, in-memory operations on a single node (scale-up) and distributed Spark operations on |
| a cluster of nodes (scale-out). SystemDS's high-level architecture consists of the following components: |
| |
| <h2>Language</h2> |
| DML (with either R- or Python-like syntax) provides linear algebra primitives, a rich set of statistical |
| functions and matrix manipulations, |
| as well as user-defined and external functions, control structures including parfor loops, and recursion. |
| The user provides the DML script through one of the following APIs: |
| <ul> |
| <li>Command-line interface ( {@link org.apache.sysds.api.DMLScript} )</li> |
| <li>Convenient programmatic interface for Spark users ( {@link org.apache.sysds.api.mlcontext.MLContext} )</li> |
| <li>Java Machine Learning Connector API ( {@link org.apache.sysds.api.jmlc.Connection} )</li> |
| </ul> |
| |
| {@link org.apache.sysds.parser.ParserWrapper} performs syntactic validation and |
| parses the input DML script using ANTLR into a |
| a hierarchy of {@link org.apache.sysds.parser.StatementBlock} and |
| {@link org.apache.sysds.parser.Statement} as defined by control structures. |
| Another important class of the language component is {@link org.apache.sysds.parser.DMLTranslator} |
| which performs live variable analysis and semantic validation. |
| During that process we also retrieve input data characteristics -- i.e., format, |
| number of rows, columns, and non-zero values -- as well as |
| infrastructure characteristics, which are used for subsequent |
| optimizations. Finally, we construct directed acyclic graphs (DAGs) |
| of high-level operators ( {@link org.apache.sysds.hops.Hop} ) per statement block. |
| |
| <h2>Optimizer</h2> |
| The SystemDS optimizer works over programs of HOP DAGs, where HOPs are operators on |
| matrices or scalars, and are categorized according to their |
| access patterns. Examples are matrix multiplications, unary |
| aggregates like rowSums(), binary operations like cell-wise |
| matrix additions, reorganization operations like transpose or |
| sort, and more specific operations. We perform various optimizations |
| on these HOP DAGs, including algebraic simplification rewrites ( {@link org.apache.sysds.hops.rewrite.ProgramRewriter} ), |
| intra-/{@link org.apache.sysds.hops.ipa.InterProceduralAnalysis} |
| for statistics propagation into functions and over entire programs, and |
| operator ordering of matrix multiplication chains. We compute |
| memory estimates for all HOPs, reflecting the memory |
| requirements of in-memory single-node operations and |
| intermediates. Each HOP DAG is compiled to a DAG of |
| low-level operators ( {@link org.apache.sysds.lops.Lop} ) such as grouping and aggregate, |
| which are backend-specific physical operators. Operator selection |
| picks the best physical operators for a given HOP |
| based on memory estimates, data, and cluster characteristics. |
| Individual LOPs have corresponding runtime implementations, |
| called instructions, and the optimizer generates |
| an executable runtime program of instructions. |
| |
| <h2>Runtime</h2> |
| We execute the generated runtime program locally |
| in CP (control program), i.e., within a driver process. |
| This driver handles recompilation, runs in-memory single-node |
| {@link org.apache.sysds.runtime.instructions.cp.CPInstruction} (some of which are multi-threaded ), |
| maintains an in-memory buffer pool, and launches Spark jobs if the runtime plan contains distributed computations |
| in the form of Spark instructions ( {@link org.apache.sysds.runtime.instructions.spark.SPInstruction} ). |
| For the Spark backend, we rely on Spark's lazy evaluation and stage construction. |
| CP instructions may also be backed by GPU kernels ( {@link org.apache.sysds.runtime.instructions.gpu.GPUInstruction} ). |
| The multi-level buffer pool caches local matrices in-memory, |
| evicts them if necessary, and handles data exchange between |
| local and distributed runtime backends. |
| The core of SystemDS's runtime instructions is an adaptive matrix block library, |
| which is sparsity-aware and operates on the entire matrix in CP, or blocks of a matrix in a distributed setting. Further |
| key features include parallel for-loops for task-parallel |
| computations, and dynamic recompilation for runtime plan adaptation addressing initial unknowns. |
| </body> |
| </html> |