[SYSTEMDS-3902] Sparse data transfer: Python --> Java

This commit implements optimized data transfer for SciPy sparse matrices from Python to the Java runtime. Key changes include the addition of `convertSciPyCSRToMB` and `convertSciPyCOOToMB` in the Java utility layer, which directly handle the compressed sparse row (CSR) and coordinate (COO) formats. On the Python side, `SystemDSContext` now supports a `sparse_data_transfer` flag and a new `from_py` method that unifies data ingestion. These updates allow sparse data to be transferred without first being converted to dense arrays, which avoids materializing the zero entries. Additionally, several data conversion methods were refactored for maintainability.

Closes #2379.
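The two sparse layouts named in the commit message can be illustrated with a minimal plain-Python sketch. It uses neither SystemDS nor SciPy, and the helpers `dense_to_csr` and `csr_to_coo` are hypothetical names introduced only for this illustration; they are not the commit's implementation. The array names `data`, `indices`, and `indptr` follow the naming SciPy uses for CSR matrices.

```python
# Illustrative only: plain-Python CSR and COO layouts, not SystemDS code.
# dense_to_csr and csr_to_coo are hypothetical helper names.

def dense_to_csr(rows):
    """Return (data, indices, indptr) for a list-of-lists matrix."""
    data, indices, indptr = [], [], [0]
    for row in rows:
        for col, value in enumerate(row):
            if value != 0:
                data.append(value)    # non-zero value
                indices.append(col)   # its column index
        indptr.append(len(data))      # end offset of this row in data
    return data, indices, indptr


def csr_to_coo(data, indices, indptr):
    """Expand CSR row pointers into explicit (row, col, value) triples."""
    triples = []
    for r in range(len(indptr) - 1):
        for k in range(indptr[r], indptr[r + 1]):
            triples.append((r, indices[k], data[k]))
    return triples


matrix = [
    [0, 0, 3, 0],
    [0, 0, 0, 0],
    [1, 0, 0, 2],
]

data, indices, indptr = dense_to_csr(matrix)
coo = csr_to_coo(data, indices, indptr)

dense_cells = len(matrix) * len(matrix[0])  # cells in the dense layout
stored_values = len(data)                   # values kept by CSR or COO
```

For this 3x4 matrix the dense layout holds 12 cells, while both sparse layouts store only the 3 non-zero values plus their index arrays. That difference is the saving a sparse-aware transfer path can exploit.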
Overview: Apache SystemDS is an open-source machine learning (ML) system for the end-to-end data science lifecycle, from data preparation and cleaning, through efficient ML model training, to debugging and serving. ML algorithms or pipelines are specified in a high-level language with R-like syntax or through the related Python and Java APIs (with many built-in primitives), and the system automatically generates hybrid runtime plans that combine local, in-memory operations with distributed operations on Apache Spark. Additional backends exist for GPUs and federated learning.
| Resource | Links |
|---|---|
| Quick Start | Install, Quick Start and Hello World |
| Documentation | SystemDS Documentation |
| Python Documentation | Python SystemDS Documentation |
| Issue Tracker | Jira Dashboard |
Status and Build: SystemDS was renamed from SystemML, which is an Apache Top-Level Project. To build from source, see the SystemDS "Install from source" guide.