Cloud Shell
Assumed variables:
Name | Value |
---|---|
UserName | systemds-bot |
GroupName | systemds-group |
Create a user and a group, attach the required policies to the group, and add the user to the group.
[cloudshell-user@host ~]$ aws iam create-user --user-name systemds-bot
{
    "User": {
        "Path": "/",
        "UserName": "systemds-bot",
        "UserId": "AIDAQSHHX7DDAODFXYZ3",
        "Arn": "arn:aws:iam::12345:user/systemds-bot",
        "CreateDate": "2021-04-10T20:36:59+00:00"
    }
}
[cloudshell-user@host ~]$ aws iam create-group --group-name systemds-group
{
    "Group": {
        "Path": "/",
        "GroupName": "systemds-group",
        "GroupId": "AGPAQSHHX7DDB3XYZABCW",
        "Arn": "arn:aws:iam::12345:group/systemds-group",
        "CreateDate": "2021-04-10T20:41:58+00:00"
    }
}
aws iam attach-group-policy --policy-arn arn:aws:iam::aws:policy/service-role/AmazonElasticMapReduceRole --group-name systemds-group
aws iam attach-group-policy --policy-arn arn:aws:iam::aws:policy/service-role/AmazonElasticMapReduceforEC2Role --group-name systemds-group
aws iam attach-group-policy --policy-arn arn:aws:iam::aws:policy/AmazonElasticMapReduceFullAccess --group-name systemds-group
aws iam attach-group-policy --policy-arn arn:aws:iam::aws:policy/AWSKeyManagementServicePowerUser --group-name systemds-group
aws iam attach-group-policy --policy-arn arn:aws:iam::aws:policy/IAMUserSSHKeys --group-name systemds-group
# Grant cloud shell access too.
aws iam attach-group-policy --policy-arn arn:aws:iam::aws:policy/AWSCloudShellFullAccess --group-name systemds-group
# To create EC2 keys
aws iam attach-group-policy --policy-arn arn:aws:iam::aws:policy/AmazonEC2FullAccess --group-name systemds-group
aws iam add-user-to-group --user-name systemds-bot --group-name systemds-group
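The repeated `attach-group-policy` calls above can also be written as a loop. The following is a minimal sketch, not part of the SystemDS scripts: `attach_policies` and `DRY_RUN` are hypothetical names, and the dry-run default (which only prints the commands) is an assumption made so the sketch can be tried without touching an account.

```shell
# Sketch: attach the managed policies listed above to the group in one loop.
GROUP_NAME="systemds-group"
# Assumption of this sketch: DRY_RUN=1 (default) prints the commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

# One policy ARN per line, taken from the commands above.
POLICY_ARNS="
arn:aws:iam::aws:policy/service-role/AmazonElasticMapReduceRole
arn:aws:iam::aws:policy/service-role/AmazonElasticMapReduceforEC2Role
arn:aws:iam::aws:policy/AmazonElasticMapReduceFullAccess
arn:aws:iam::aws:policy/AWSKeyManagementServicePowerUser
arn:aws:iam::aws:policy/IAMUserSSHKeys
arn:aws:iam::aws:policy/AWSCloudShellFullAccess
arn:aws:iam::aws:policy/AmazonEC2FullAccess
"

# attach_policies is a hypothetical helper name.
attach_policies() {
  for arn in $POLICY_ARNS; do
    if [ "$DRY_RUN" = "1" ]; then
      echo "aws iam attach-group-policy --policy-arn $arn --group-name $GROUP_NAME"
    else
      aws iam attach-group-policy --policy-arn "$arn" --group-name "$GROUP_NAME"
    fi
  done
}

attach_policies
```

Run it with `DRY_RUN=0` to apply the changes. Afterwards, `aws iam list-attached-group-policies --group-name systemds-group` should list the attached policies.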
$ aws iam create-login-profile --generate-cli-skeleton > login-profile.json
login-profile.json contains
{
    "UserName": "",
    "Password": "",
    "PasswordResetRequired": true
}
Create the credentials manually by editing login-profile.json.
Name | Value |
---|---|
UserName | systemds-bot |
Password | For example, 9U*tYP |
PasswordResetRequired | false |
Now, create the login profile.
aws iam create-login-profile --cli-input-json file://login-profile.json
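As an alternative to editing the JSON file, the same request can probably be expressed with command-line flags. This is a hedged sketch rather than the documented workflow: `create_login_profile` and `DRY_RUN` are hypothetical names, and the dry-run mode masks the password so that it does not end up in terminal output.

```shell
# Sketch: create the login profile with flags instead of a JSON input file.
USER_NAME="systemds-bot"
# Assumption of this sketch: DRY_RUN=1 (default) prints the command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

# create_login_profile is a hypothetical helper; the caller supplies the password.
create_login_profile() {
  password="$1"
  if [ -z "$password" ]; then
    echo "error: password must not be empty" >&2
    return 1
  fi
  if [ "$DRY_RUN" = "1" ]; then
    # The password is masked on purpose in the printed command.
    echo "aws iam create-login-profile --user-name $USER_NAME --password <hidden> --no-password-reset-required"
  else
    aws iam create-login-profile --user-name "$USER_NAME" \
      --password "$password" --no-password-reset-required
  fi
}
```

For example, `create_login_profile 'your-password'` prints the command in dry-run mode. `--no-password-reset-required` corresponds to `PasswordResetRequired` set to `false` in the table above.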
AWS CLI
Create an AWS account, or use your existing AWS account
Install the aws-cli version specific to your operating system.
Create a user
Create a new user (https://console.aws.amazon.com/iam/home?#/users)
Create a new group and add the following policies to it:
AmazonElasticMapReduceRole
AmazonElasticMapReduceforEC2Role
AdministratorAccess
AmazonElasticMapReduceFullAccess
AWSKeyManagementServicePowerUser
IAMUserSSHKeys
Configure your aws-cli (https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-configure.html#cli-quick-configuration)
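After `aws configure` has prompted for the access key ID, secret access key, default region, and output format, it helps to confirm that the CLI is installed and that the credentials resolve to the intended user. A minimal sketch, where `check_aws_cli` is a hypothetical helper name:

```shell
# Sketch: verify that aws-cli is installed and configured.
# check_aws_cli is a hypothetical helper name.
check_aws_cli() {
  if ! command -v aws >/dev/null 2>&1; then
    echo "aws-cli not found on PATH"
    return 1
  fi
  # Print the installed CLI version.
  aws --version
  # Print the account and ARN that the configured credentials resolve to.
  aws sts get-caller-identity
}
```

Calling `check_aws_cli` should print an ARN ending in the user created above (for example `user/systemds-bot`) if the keys of that user were configured.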
Spin up an EMR cluster with SystemDS
Put your SystemDS artifacts (dml-scripts, jars, config-file) in the directory systemds
Edit configuration in: systemds_cluster.config
Run: ./spinup_systemds_cluster.sh
Run a SystemDS script
Terminate the EMR cluster: ./terminate_systemds_cluster.sh
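Before spinning up a cluster, it can be worth checking that the files the steps above mention are in place. The sketch below assumes that the `systemds` directory, the config file, and both scripts live in the same directory; `preflight` is a hypothetical helper, not part of the SystemDS scripts.

```shell
# Sketch: check the inputs named in the steps above before starting a cluster.
# preflight is a hypothetical helper; its argument is the directory to check (default: current).
preflight() {
  dir="${1:-.}"
  missing=0
  [ -d "$dir/systemds" ] || { echo "missing directory: systemds"; missing=1; }
  for f in systemds_cluster.config spinup_systemds_cluster.sh terminate_systemds_cluster.sh; do
    [ -f "$dir/$f" ] || { echo "missing file: $f"; missing=1; }
  done
  return "$missing"
}

# Typical use (not executed here):
#   preflight . && ./spinup_systemds_cluster.sh
#   ... run your SystemDS script ...
#   ./terminate_systemds_cluster.sh
```

Terminating the cluster as the last step matters because a running EMR cluster keeps incurring cost.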
Fine-tune the memory:
https://aws.amazon.com/blogs/big-data/best-practices-for-successfully-managing-memory-for-apache-spark-applications-on-amazon-emr/
https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-spark-configure.html#spark-defaults
Test whether the setup scales to 100 nodes
Make the cluster web UIs (Ganglia, Spark UI, ...) accessible from outside
Integrate Spot Instances
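For the memory item above, a commonly used rule of thumb for sizing Spark executors can be worked through with shell arithmetic. This is only a sketch: the instance shape (48 vCPUs, 384 GB RAM), the 5 cores per executor, and the 90/10 split between executor memory and overhead are assumptions to check against the linked AWS material, not settings taken from the SystemDS scripts.

```shell
# Sketch: executor sizing by rule of thumb. All input numbers are assumptions.
VCPUS=48            # vCPUs per instance (assumed instance shape)
RAM_GB=384          # RAM per instance in GB (assumed instance shape)
EXECUTOR_CORES=5    # cores per executor (assumed rule of thumb)

# Reserve one core and about 1 GB per instance for the OS and cluster daemons.
EXECUTORS_PER_INSTANCE=$(( (VCPUS - 1) / EXECUTOR_CORES ))
TOTAL_EXECUTOR_MEM_GB=$(( (RAM_GB - 1) / EXECUTORS_PER_INSTANCE ))

# Split each executor's share: about 90% heap, the remainder as memory overhead.
EXECUTOR_MEM_GB=$(( TOTAL_EXECUTOR_MEM_GB * 90 / 100 ))
OVERHEAD_GB=$(( TOTAL_EXECUTOR_MEM_GB - EXECUTOR_MEM_GB ))

echo "spark.executor.cores=$EXECUTOR_CORES"
echo "spark.executor.memory=${EXECUTOR_MEM_GB}g"
echo "spark.executor.memoryOverhead=${OVERHEAD_GB}g"
```

With the assumed numbers this gives 9 executors per instance, 37 GB of executor memory, and 5 GB of overhead each. The resulting values would go into the Spark defaults described in the second link.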