By default, Azure-based deployments of Accumulo clusters provision a single Virtual Machine Scale Set (VMSS). A VMSS consists of a set of virtual machine instances, each identified by its hostname and private IP address. The `muchos launch` command automatically populates the `nodes` section in `muchos.props` with these hostnames and IP addresses, based on the details of the VM instances in the VMSS. In a single-VMSS deployment, a minimal (but sufficient) set of roles is hard-coded and assigned to these nodes. As a result, deploying additional roles, such as Fluo or Spark, is not possible unless the user manually edits the `muchos.props` file after running `muchos launch` and before running `muchos setup`.

To address this limitation, Muchos supports a “multiple VMSS” mode of installation for Azure clusters. To use this mode, the user needs to:

- Set `use_multiple_vmss = True` in `muchos.props`
- Create an `azure_multiple_vmss_vars.yml` file in the `fluo-muchos/conf` folder

In this mode, the `muchos launch` command creates multiple VMSS deployments in parallel and then assigns roles to the VM instances within each VMSS, based on the specification in the `azure_multiple_vmss_vars.yml` file. Subsequently, `muchos setup` runs without requiring any manual modifications.

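
The two prerequisites above can be checked before launching. The following is a minimal sketch, not part of Muchos: the helper name `multiple_vmss_ready` is hypothetical, and it assumes that `use_multiple_vmss` lives in the `azure` section of `muchos.props`, alongside the other Azure settings.

```python
import configparser
from pathlib import Path

# Inline stand-in for the relevant part of muchos.props.
# Assumption: use_multiple_vmss is set in the [azure] section.
SAMPLE_PROPS = """
[azure]
use_multiple_vmss = True
"""


def multiple_vmss_ready(props_text, conf_dir):
    """Hypothetical pre-flight check, not a Muchos function.

    Returns a tuple (mode_enabled, spec_file_present).
    """
    parser = configparser.ConfigParser()
    parser.read_string(props_text)
    # getboolean accepts True/False, yes/no, on/off, 1/0 (case-insensitive).
    enabled = parser.getboolean("azure", "use_multiple_vmss", fallback=False)
    spec_file = Path(conf_dir) / "azure_multiple_vmss_vars.yml"
    return enabled, spec_file.is_file()


if __name__ == "__main__":
    enabled, present = multiple_vmss_ready(SAMPLE_PROPS, "fluo-muchos/conf")
    print("multiple VMSS mode enabled:", enabled)
    print("azure_multiple_vmss_vars.yml present:", present)
```

Both values need to be true before `muchos launch` can use the multiple VMSS mode as described above.
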
Muchos provides a sample file which can be used as a template to customize. The YAML file is a list of VMSS specifications. The following fields can be specified for each VMSS:
Attribute | Required or optional? | Default value | Description |
---|---|---|---|
`name_suffix` | Required | - | The name of each VMSS is constructed by concatenating the Muchos cluster name with this string. For example, if your Muchos cluster is called `test` and this field has a value of `ldr`, the VMSS is created with the name `test-ldr` |
`sku` | Required | - | A string identifier specifying the Azure VM size. Refer to the Azure documentation to look up these strings. An example VM size is `Standard_D32s_v3` for a 32-vCPU Dsv3 VM |
`vmss_priority` | Optional | Regular | If this is not specified for a VMSS, the value of `vmss_priority` from the `azure` section in `muchos.props` is used |
`perf_profile` | Required | - | A string identifying the corresponding performance profile configuration section in `muchos.props`, which contains the perf profile parameters |
`azure_disk_device_path` | Optional | If not specified, the corresponding `azure_disk_device_path` value from the `azure` section in `muchos.props` is used | The device path used to enumerate attached SCSI or NVMe disks to use for persistent local storage |
`azure_disk_device_pattern` | Optional | If not specified, the corresponding `azure_disk_device_pattern` value from the `azure` section in `muchos.props` is used | A device name wildcard pattern used (internally) in conjunction with `azure_disk_device_path` to enumerate attached SCSI or NVMe disks to use for persistent local storage |
`mount_root` | Optional | If not specified, the corresponding `mount_root` value from the `azure` section in `muchos.props` is used | The folder in the file system where the persistent disks are mounted |
`data_disk_count` | Required | - | An integer specifying the number of persistent (managed) data disks to attach to each VM in the VMSS. It can be 0 in specific cases; see the notes on using ephemeral storage for details |
`data_disk_sku` | Required | - | Can be either `Standard_LRS` (for HDD) or `Premium_LRS` (for Premium SSD). At this time, the use of Standard SSD or UltraSSD with Muchos has not been tested |
`data_disk_size_gb` | Required | - | An integer specifying the size of each persistent (managed) data disk in GiB |
`data_disk_caching` | Optional | ReadOnly | One of `None`, `ReadOnly`, or `ReadWrite`, indicating the type of host caching to use for each persistent (managed) disk |
`image_reference` | Optional | If not specified, the corresponding `azure_image_reference` value from the `azure` section in `muchos.props` is used | The Azure image reference, defined as a pipe-delimited string |
`capacity` | Required | - | An integer specifying the number of VMs in this specific VMSS |
`roles` | Required | - | A dictionary (list of key-value pairs), each entry of which should be of the form `muchos_role_name: integer count`. See the sample file for examples. The `muchos launch` command for Azure clusters uses this list to assign roles to hosts in a sequential fashion. For example, if a given VMSS has 3 `zkfc` role members and 2 `namenode` role members defined, host0 and host1 in the VMSS will be assigned both the `zkfc` and `namenode` roles, and host2 in the VMSS will be assigned only the `zkfc` role |
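
The naming rule for `name_suffix` and the sequential role assignment described for `roles` can be illustrated with a short sketch. This is not the Muchos implementation. The function names `vmss_name` and `assign_roles` are hypothetical, the host names are the placeholders used in the table, and raising an error when a role count exceeds the number of hosts is an assumption made for this example.

```python
def vmss_name(cluster_name, name_suffix):
    """Build the VMSS name as in the table's example: test + ldr -> test-ldr."""
    return f"{cluster_name}-{name_suffix}"


def assign_roles(hostnames, roles):
    """Illustrative sequential assignment (hypothetical helper).

    Each role is given to the first `count` hosts of the VMSS, in order,
    so a host near the front of the list can carry several roles.
    """
    assignment = {host: [] for host in hostnames}
    for role, count in roles.items():
        # Assumption for this sketch: a count larger than the VMSS
        # capacity is treated as a configuration error.
        if count > len(hostnames):
            raise ValueError(
                f"role {role!r} needs {count} hosts, only {len(hostnames)} available"
            )
        for host in hostnames[:count]:
            assignment[host].append(role)
    return assignment


# The example from the table: 3 zkfc members and 2 namenode members.
hosts = ["host0", "host1", "host2"]
result = assign_roles(hosts, {"zkfc": 3, "namenode": 2})

if __name__ == "__main__":
    print(vmss_name("test", "ldr"))
    for host, host_roles in result.items():
        print(host, host_roles)
```

With these inputs, `host0` and `host1` receive both `zkfc` and `namenode`, while `host2` receives only `zkfc`, matching the example in the `roles` row.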