Instructions:
Create an AWS account or use your existing one
Install aws-cli on your system (https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2-linux.html)
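To confirm that the installation worked, it is enough to check that the aws binary is on the PATH. The sketch below is a minimal check; the function name check_aws_cli is hypothetical and not part of the SystemDS scripts.

```shell
#!/usr/bin/env bash
# check_aws_cli is a hypothetical helper, not part of the SystemDS scripts.
# It prints the installed version, or a hint if aws-cli cannot be found.
check_aws_cli() {
  if command -v aws >/dev/null 2>&1; then
    aws --version
  else
    echo "aws-cli not found on PATH"
    return 1
  fi
}

check_aws_cli || echo "Install aws-cli before continuing."
```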
Create a user
Create a new user (https://console.aws.amazon.com/iam/home?#/users)
Create a new group, add the user to it, and attach the following policies to the group:
AmazonElasticMapReduceRole
AmazonElasticMapReduceforEC2Role
AdministratorAccess
AmazonElasticMapReduceFullAccess
AWSKeyManagementServicePowerUser
IAMUserSSHKeys
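The group setup can also be done from the command line instead of the IAM console. The following is a sketch only: it prints the aws iam commands rather than running them, so they can be reviewed first. The group name systemds-emr and user name systemds-user are hypothetical placeholders, and the policy ARNs assume the usual layout of AWS managed policies; verify them with "aws iam list-policies --scope AWS" before use.

```shell
#!/usr/bin/env bash
# Dry run: print the aws iam commands for the group setup instead of running them.
# GROUP and USER_NAME are hypothetical placeholders; adjust them to your account.
GROUP="systemds-emr"
USER_NAME="systemds-user"

# Assumed ARNs of the managed policies listed above (verify before use).
POLICY_ARNS=(
  "arn:aws:iam::aws:policy/service-role/AmazonElasticMapReduceRole"
  "arn:aws:iam::aws:policy/service-role/AmazonElasticMapReduceforEC2Role"
  "arn:aws:iam::aws:policy/AdministratorAccess"
  "arn:aws:iam::aws:policy/AmazonElasticMapReduceFullAccess"
  "arn:aws:iam::aws:policy/AWSKeyManagementServicePowerUser"
  "arn:aws:iam::aws:policy/IAMUserSSHKeys"
)

print_group_setup() {
  local arn
  echo "aws iam create-group --group-name $GROUP"
  for arn in "${POLICY_ARNS[@]}"; do
    echo "aws iam attach-group-policy --group-name $GROUP --policy-arn $arn"
  done
  echo "aws iam add-user-to-group --user-name $USER_NAME --group-name $GROUP"
}

print_group_setup
```

Once the printed commands look right, they can be pasted into a shell that has aws-cli configured with sufficient IAM permissions.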
Configure your aws-cli (https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-configure.html#cli-quick-configuration)
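After running "aws configure", a quick sanity check is to look for a default profile in the credentials file. The sketch below assumes the standard location ~/.aws/credentials (or the file named by AWS_SHARED_CREDENTIALS_FILE); the helper name has_aws_credentials is hypothetical. It only checks that the profile exists, not that the keys are valid.

```shell
#!/usr/bin/env bash
# has_aws_credentials is a hypothetical helper: it only checks that the
# credentials file written by "aws configure" exists and has a default profile.
has_aws_credentials() {
  local file="${AWS_SHARED_CREDENTIALS_FILE:-$HOME/.aws/credentials}"
  [ -f "$file" ] && grep -q '^\[default\]' "$file"
}

if has_aws_credentials; then
  echo "default profile found"
else
  echo "no default profile; run: aws configure"
fi
```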
Spin up an EMR cluster with SystemDS
Put your SystemDS artifacts (dml-scripts, jars, config-file) in the directory systemds
Edit configuration in: systemds_cluster.config
Run: ./spinup_systemds_cluster.sh
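Before spinning up the cluster, it can help to verify that the pieces named in the steps above are in place. This is a minimal sketch; the function name preflight is hypothetical, and it checks only that the files exist, not that their contents are valid.

```shell
#!/usr/bin/env bash
# preflight is a hypothetical helper; the directory and file names are the
# ones used in the steps above. Pass the directory that holds the scripts.
preflight() {
  local dir="${1:-.}"
  [ -d "$dir/systemds" ] || { echo "missing directory: systemds"; return 1; }
  [ -n "$(ls -A "$dir/systemds")" ] || { echo "systemds is empty"; return 1; }
  [ -f "$dir/systemds_cluster.config" ] || { echo "missing systemds_cluster.config"; return 1; }
  [ -x "$dir/spinup_systemds_cluster.sh" ] || { echo "spinup_systemds_cluster.sh is not executable"; return 1; }
  echo "ok"
}

if preflight .; then
  echo "ready to run ./spinup_systemds_cluster.sh"
fi
```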
Run a SystemDS script
Terminate the EMR cluster: ./terminate_systemds_cluster.sh
Further work:
Fine-tune the memory settings, see:
https://aws.amazon.com/blogs/big-data/best-practices-for-successfully-managing-memory-for-apache-spark-applications-on-amazon-emr/
https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-spark-configure.html#spark-defaults
Test whether the setup scales to 100 nodes
Make the cluster web UIs (Ganglia, Spark UI, ...) accessible from outside
Integrate spot instances