Install Ray on one (head) node.
sudo apt install -y python3-pip python3.12-venv python3 -m venv venv source venv/bin/activate pip3 install -U "ray[default]"
ray start --head --dashboard-host 0.0.0.0 --include-dashboard true
This is optional, if you go add Ray worker noeds, it becomes distributed.
Also Ray doesn't support MacOS multi-node cluster
ray start --address=127.0.0.1:6379
Clone the repo with the version that you want to test. Run maturin build --release in the virtual env.
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh . "$HOME/.cargo/env"
pip3 install maturin
git clone https://github.com/apache/datafusion-ray.git cd datafusion-ray maturin develop --release
# Start a local cluster # ray.init(resources={"worker": 1}) # Connect to a cluster ray.init()
RAY_ADDRESS='http://127.0.0.1:8265' ray job submit --working-dir examples -- python3 tips.py
Install Ray on at least two nodes.
sudo apt install -y python3-pip python3.12-venv python3 -m venv venv source venv/bin/activate pip3 install -U "ray[default]"
ray start --head --dashboard-host 0.0.0.0 --include-dashboard true
Replace NODE_IP_ADDRESS with the address accessible in your distributed setup, which will be displayed after the previous step.
ray start --address={NODE_IP_ADDRESS}:6379
Clone the repo with the version that you want to test. Run maturin build --release in the virtual env.
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh . "$HOME/.cargo/env"
pip3 install maturin
git clone https://github.com/apache/datafusion-ray.git cd datafusion-ray maturin develop --release
# Start a local cluster # ray.init(resources={"worker": 1}) # Connect to a cluster ray.init()
RAY_ADDRESS='http://{NODE_IP_ADDRESS}:8265' ray job submit --working-dir examples -- python3 tips.py