Quick Start Code Example

This directory contains all the source code for the tutorial.

Run everything (CPU path)

On Linux/macOS:

bash run_all_cpu.sh

On Windows:

run_all_cpu.bat
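
If you would rather start the CPU path from one cross-platform entry point, the choice between the two scripts above can be sketched in Python. This is a minimal sketch, not part of the tutorial sources: the helper names `run_all_cpu_command` and `main` are hypothetical, and it assumes the script is started from this directory with `bash` (Linux/macOS) or `cmd` (Windows) on the PATH.

```python
import platform
import subprocess


def run_all_cpu_command(system):
    """Return the argv that starts the CPU run-all script on the given OS.

    `system` is a value as returned by platform.system(),
    e.g. "Linux", "Darwin" or "Windows".
    """
    if system == "Windows":
        # Batch files are run through the Windows command interpreter.
        return ["cmd", "/c", "run_all_cpu.bat"]
    # Linux and macOS both use the bash script.
    return ["bash", "run_all_cpu.sh"]


def main():
    # check=True raises CalledProcessError if the script exits non-zero.
    subprocess.run(run_all_cpu_command(platform.system()), check=True)
```

Calling `main()` from this directory runs whichever script matches the current operating system.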

Compile and Distribute add_one_* manually

To compile the C++ example:

cmake . -B build -DEXAMPLE_NAME="compile_cpu" -DCMAKE_BUILD_TYPE=RelWithDebInfo
cmake --build build --config RelWithDebInfo

This produces build/add_one_cpu.so.

To compile the CUDA example (Linux with a CUDA toolchain available):

cmake . -B build -DEXAMPLE_NAME="compile_cuda" -DCMAKE_BUILD_TYPE=RelWithDebInfo
cmake --build build --config RelWithDebInfo
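
The configure and build commands differ only in the value passed to EXAMPLE_NAME, so they can be generated from one helper. The following is a sketch under stated assumptions: `cmake_commands` and `build_example` are hypothetical names introduced here, and the only flags used are the ones shown in the commands above.

```python
import subprocess


def cmake_commands(example_name, build_type="RelWithDebInfo", build_dir="build"):
    """Return the configure and build argv lists for one example."""
    configure = [
        "cmake", ".", "-B", build_dir,
        # No shell quoting is needed because each argument is passed as-is.
        f"-DEXAMPLE_NAME={example_name}",
        f"-DCMAKE_BUILD_TYPE={build_type}",
    ]
    build = ["cmake", "--build", build_dir, "--config", build_type]
    return [configure, build]


def build_example(example_name):
    """Run configure, then build, stopping at the first failure."""
    for argv in cmake_commands(example_name):
        subprocess.run(argv, check=True)
```

For instance, `build_example("compile_cpu")` issues the same two commands as the C++ example, and `build_example("compile_cuda")` the same as the CUDA example.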

Load the Distributed add_one_*

To run the library loading examples across ML frameworks (CUDA is required for the CUDA example):

python load/load_pytorch.py
python load/load_paddle.py
python load/load_numpy.py
python load/load_cupy.py
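
Not every machine has all four frameworks installed, so it can help to run only the loaders whose framework is importable. The sketch below is illustrative rather than part of the tutorial: the mapping from each script to an import name (`torch`, `paddle`, `numpy`, `cupy`) is an assumption about what the scripts import, and `is_installed`, `runnable_loaders` and `run_loaders` are hypothetical helpers.

```python
import importlib.util
import subprocess
import sys

# Assumed mapping from loader script to the top-level module it imports.
LOADERS = {
    "load/load_pytorch.py": "torch",
    "load/load_paddle.py": "paddle",
    "load/load_numpy.py": "numpy",
    "load/load_cupy.py": "cupy",
}


def is_installed(module_name):
    """Return True if the top-level module can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None


def runnable_loaders(installed=is_installed):
    """Return the loader scripts whose framework passes the `installed` check."""
    return [script for script, module in LOADERS.items() if installed(module)]


def run_loaders():
    """Run each runnable loader with the current Python interpreter."""
    for script in runnable_loaders():
        subprocess.run([sys.executable, script], check=True)
```

Calling `run_loaders()` from this directory skips any loader whose framework is missing. It does not check for a GPU, so a CUDA-dependent loader can still fail at run time on a machine without one.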

To run the library loading example in C++:

cmake . -B build -DEXAMPLE_NAME="load_cpp" -DCMAKE_BUILD_TYPE=RelWithDebInfo
cmake --build build --config RelWithDebInfo
build/load_cpp

The executable is emitted as build/load_cpp (build/load_cpp.exe on Windows).

For a CUDA end-to-end run, use:

bash run_all_cuda.sh