commit | 7decbf3069e592414cfd184b98e424a503c09b6e | [log] [tgz] |
---|---|---|
author | Brian Loss <brianloss@apache.org> | Mon Nov 01 09:50:13 2021 -0400 |
committer | GitHub <noreply@github.com> | Mon Nov 01 09:50:13 2021 -0400 |
tree | 634c9e4fa5a377f8558a0c48e265f28f4e821a62 | |
parent | fc0e7e11f80b59f63c04b541559d430c73d03631 [diff] |
Add optional Application Insights on azure (#414) This PR allows for the Azure Application Insights Java agent to be configured to run with the manager and tablet servers. When specified, this will create an Application Insights object along with the cluster, and metrics will be sent there. The default configuration sends micrometer metrics, but the user could configure differently (e.g., to send certain JMX metrics). * Create workbook if either oms integration or app insights are used. * Create application insights resource as part of azure cluster setup * Configure Accumulo to run the application insights Java agent * Clean up parameter to azureDeployLinuxCounters.json template to remove duplication of the creation of the log analytics workspace name. * Save the log analytics workspace resource ID in muchos.props along with the workspace ID and key, since this value is needed to set up the application insights resource. * Fix CI failure--missing newline at EOF. * PR feedback * Activate insights agent on the gc process * Improve regex when az_appinsights_connection_string is substituted in muchos.props. * Ensure az_appinsights_connection_string is added after az_app_insights_version in muchos.props, or after the [azure] section header if az_app_insights_version isn't there.
Muchos automates setting up Apache Accumulo or Apache Fluo (and their dependencies) on a cluster
Muchos makes it easy to launch a cluster in Amazon's EC2 or Microsoft Azure and deploy Accumulo or Fluo to it. Muchos enables developers to experiment with Accumulo or Fluo in a realistic, distributed environment. Muchos installs all software using tarball distributions which makes its easy to experiment with the latest versions of Accumulo, Hadoop, Zookeeper, etc without waiting for downstream packaging.
Muchos is not recommended at this time for production environments as it has no support for updating and upgrading dependencies. It also has a wipe command that is great for testing but dangerous for production environments.
Muchos is structured into two high level components:
Check out Uno for setting up Accumulo or Fluo on a single machine.
Muchos requires the following common components for installation and setup:
cd ~ python3.6 -m venv env source env/bin/activate
ssh-agent
installed and running and ssh-agent forwarding. Note that this may also require the creation of SSH public-private key pair.eval $(ssh-agent -s) ssh-add ~/.ssh/id_rsa
Muchos requires the following for EC2 installations:
pip3 install awscli2 boto3 --upgrade
pip3 install awscli boto3 botocore
key.name
to name of your key pair in AWS.~/.aws
configured on your machine. Can be created manually or using aws configure.Muchos requires the following for Azure installations:
wget https://packages.microsoft.com/yumrepos/azure-cli/azure-cli-2.0.69-1.el7.x86_64.rpm sudo yum install azure-cli-2.0.69-1.el7.x86_64.rpm
When running Muchos under Ubuntu 18.04, checkout these tips.
The following commands will install Muchos, launch a cluster, and setup/run Accumulo:
git clone https://github.com/apache/fluo-muchos.git cd fluo-muchos/ cp conf/muchos.props.example conf/muchos.props vim conf/muchos.props # Edit to configure Muchos cluster ./bin/muchos launch -c mycluster # Launches Muchos cluster in EC2 or Azure ./bin/muchos setup # Set up cluster and start Accumulo
The launch
command will create a cluster with the name specified in the command (e.g. ‘mycluster’). The setup
command can be run repeatedly to fix any failures and will not repeat successful operations.
After your cluster is launched, SSH to it using the following command:
./bin/muchos ssh
Run the following command to terminate your cluster. WARNING: All cluster data will be lost.
./bin/muchos terminate
Please continue reading for more detailed Muchos instructions.
Before launching a cluster, you will need to complete the requirements above, clone the Muchos repo, and create muchos.props. If you want to give others access to your cluster, add their public keys to a file named keys
in your conf/
directory. During the setup of your cluster, this file will be appended on each node to the ~/.ssh/authorized_keys
file for the user set by the cluster.username
property.
You might also need to configure the aws_ami
property in muchos.props. Muchos by default uses a free CentOS 7 image that is hosted in the AWS marketplace but managed by the CentOS organization. If you have never used this image in EC2 before, you will need to go to the CentOS 7 product page to accept the software terms. If this is not done, you will get an error when you try to launch your cluster. By default, the aws_ami
property is set to an AMI in us-east-1
. You will need to changes this value if a newer image has been released or if you are running in different region than us-east-1
.
After following the steps above, run the following command to launch an EC2 cluster called mycluster
:
./bin/muchos launch -c mycluster
After your cluster has launched, you do not have to specify a cluster anymore using -c
(unless you have multiple clusters running).
Run the following command to confirm that you can ssh to the leader node:
./bin/muchos ssh
You can check the status of the nodes using the EC2 Dashboard or by running the following command:
./bin/muchos status
Before launching a cluster, you will need to complete the requirements for Azure above, clone the Muchos repo, and create your conf/muchos.props
file by making a copy of the muchos.props example. If you want to give others access to your cluster, add their public keys to a file named keys
in your conf/
directory. During the setup of your cluster, this file will be appended on each node to the ~/.ssh/authorized_keys
file for the user set by the cluster.username
property. You will also need to ensure you have authenticated to Azure and set the target subscription using the Azure CLI.
Muchos by default uses a CentOS 7 image that is hosted in the Azure marketplace. The Azure Linux Agent is already pre-installed on the Azure Marketplace images and is typically available from the distribution's package repository. Azure requires that the publishers of the endorsed Linux distributions regularly update their images in the Azure Marketplace with the latest patches and security fixes, at a quarterly or faster cadence. Updated images in the Azure Marketplace are available automatically to customers as new versions of an image SKU.
Edit the values in the sections within muchos.props as below Under the general
section, edit following values as per your configuration
cluster_type = azure
cluster_user
should be set to the name of the administrative userproxy_hostname
(optional) is the name of the machine which has access to the cluster VNETUnder the azure
section, edit following values as per your configuration:
resource_group
to provide the resource-group name for the cluster deployment. A new resource group with this name will be created if it doesn't already existvnet
to provide the name of the VNET that your cluster nodes should use. A new VNET with this name will be created if it doesn't already existsubnet
to provide a name for the subnet within which the cluster resources will be deployeduse_multiple_vmss
allows you to configure VMs with different CPU, memory, disk configurations for leaders and workers. To know more about this feature, please follow the doc.azure_image_reference
allows you to specify the CentOS image SKU in the format as shown below. To configure CentOS 8.x, please follow these steps.offer|publisher|sku|version| Ex: CentOS|OpenLogic|7_9|latest|
azure_proxy_image_reference
allows you to specify the CentOS image SKU that will be used for the optional proxy machine. If this property is not specified, then the value of azure_image_reference
will be used instead.numnodes
to change the cluster size in terms of number of nodes deployeddata_disk_count
to specify how many persistent data disks are attached to each node and will be used by HDFS. If you would prefer to use ephemeral / storage for Azure clusters, please follow these steps.vm_sku
to specify the VM size to use. You can choose from the available VM sizes.use_adlsg2
to use Azure Data Lake Storage(ADLS) Gen2 as datastore for Accumulo ADLS Gen2 Doc. Setup ADLS Gen2 as datastore for Accumulo.az_oms_integration_needed
to implement Log Analytics workspace, Dashboard & Azure Monitor Workbooks Create Log Analytics workspace. Create and Share dashboards. Azure Monitor Workbooks.az_use_app_insights
to configure an Azure Application Insights with your setup, and activate the application insights Java agent with the manager and tablet servers. Customize applicationinsights.json to meet your needs before executing muchos setup.Please refer to the muchos.props example for the full list of Azure-specific configurations - some of which have supplementary comments.
Within Azure the nodes
section is auto populated with the hostnames and their default roles.
After following the steps above, run the following command to launch an Azure VMSS cluster called mycluster
(where ‘mycluster’ is the name assigned to your cluster):
.bin/muchos launch -c `mycluster` # Launches Muchos cluster in Azure
Once your cluster is built in EC2 or Azure, the ./bin/muchos setup
command will set up your cluster and start Hadoop, Zookeeper & Accumulo. It will download release tarballs of Fluo, Accumulo, Hadoop, etc. The versions of these tarballs are specified in muchos.props and can be changed if desired.
Optionally, Muchos can setup the cluster using an Accumulo or Fluo tarball that is placed in the conf/upload
directory of Muchos. This option is only necessary if you want to use an unreleased version of Fluo or Accumulo. Before running the muchos setup
command, you should confirm that the hash (typically SHA-512 or SHA-256) of your tarball matches what is set in conf/checksums. Run the command shasum -a 512 /path/to/tarball
on your tarball to determine its hash. The entry in conf/checksums can optionally include the algorithm as a prefix. If the algorithm is not specified then Muchos will infer the algorithm based on the length of the hash. Currently Muchos supports using sha512 / sha384 / sha256 / sha224 / sha1 / md5 hashes for the checksum.
The muchos setup
command will install and start Accumulo, Hadoop, and Zookeeper. The optional services below will only be set up if configured in the [nodes]
section of muchos.props:
fluo
- Fluo only needs to be installed and configured on a single node in your cluster as Fluo applications are run in YARN. If set as a service, muchos setup
will install and partially configure Fluo but not start it. To finish setup, follow the steps in the ‘Run a Fluo application’ section below.
metrics
- The Metrics service installs and configures collectd, InfluxDB and Grafana. Cluster metrics are sent to InfluxDB using collectd and are viewable in Grafana. If Fluo is running, its metrics will also be viewable in Grafana.
spark
- If specified on a node, Apache Spark will be installed on all nodes and the Spark History server will be run on this node.
mesosmaster
- If specified, a Mesos master will be started on this node and Mesos slaves will be started on all workers nodes. The Mesos status page will be viewable at http://<MESOS_MASTER_NODE>:5050/
. Marathon will also be started on this node and will be viewable at http://<MESOS_MASTER_NODE>:8080/
.
client
- Used to specify a client node where no services are run but libraries are installed to run Accumulo/Hadoop clients.
swarmmanager
- Sets up Docker swarm with the manager on this node and joins all worker nodes to this swarm. When this is set, docker will be installed on all nodes of the cluster. It is recommended that the swarm manager is specified on a worker node as it runs docker containers. Check out Portainer if you want to run a management UI for your swarm cluster.
elkserver
- Sets up the Elasticsearch, Logstash, and Kibana stack. This allows logging data to be search, analyzed, and visualized in real time.
If you run the muchos setup
command and a failure occurs, you can repeat the command until setup completes. Any work that was successfully completed will not be repeated. While some setup steps can take over a minute, use ctrl-c
to stop setup if it hangs for a long time. Just remember to run muchos setup
again to finish setup.
The setup
command is idempotent. It can be run again on a working cluster. It will not change the cluster if everything is configured and running correctly. If a process has stopped, the setup
command will restart the process.
The ./bin/muchos wipe
command can be used to wipe all data from the cluster and kill any running processes. After running the wipe
command, run the setup
command to start a fresh cluster.
If you set proxy_socks_port
in your muchos.props, a SOCKS proxy will be created on that port when you use muchos ssh
to connect to your cluster. If you add a proxy management tool to your browser and whitelist http://leader*
, http://worker*
and http://metrics*
to redirect traffic to your proxy, you can view the monitoring & status pages below in your browser. Please note - The hosts in the URLs below match the configuration in [nodes] of muchos.prop.example
and may be different for your cluster.
mesosmaster
configured on leader1)mesosmaster
configured on leader1)Running an example Fluo application like WebIndex, Phrasecount, or Stresso is easy with Muchos as it configures your shell with common environment variables. To run an example application, SSH to a node on cluster where Fluo is installed and clone the example repo:
./bin/muchos ssh # SSH to cluster proxy node ssh <node where Fluo is installed> # Nodes with Fluo installed is determined by Muchos config hub clone apache/fluo-examples # Clone repo of Fluo example applications. Press enter for user/password.
Start the example application using its provided scripts. To show how simple this can be, commands to run the WebIndex application are shown below. Read the WebIndex README to learn more before running these commands.
cd fluo-examples/webindex ./bin/webindex init # Initialize and start webindex Fluo application ./bin/webindex getpaths 2015-18 # Retrieves CommonCrawl paths file for 2015-18 crawl ./bin/webindex load-s3 2015-18 0-9 # Load 10 files into Fluo in the 0-9 range of 2015-18 crawl ./bin/webindex ui # Runs the WebIndex UI
If you have your own application to run, you can follow the Fluo application instructions to configure, initialize, and start your application. To automate these steps, you can mimic the scripts of example Fluo applications above.
After ./bin/muchos setup
is run, users can install additional software on the cluster using their own Ansible playbooks. In their own playbooks, users can reference any configuration in the Ansible inventory file at /etc/ansible/hosts
which is set up by Muchos on the proxy node. The inventory file lists the hosts for services on the cluster such as the Zookeeper nodes, Namenode, Accumulo master, etc. It also has variables in the [all:vars]
section that contain settings that may be useful in user playbooks. It is recommended that any user-defined Ansible playbooks should be managed in their own git repository (see mikewalch/muchos-custom for an example).
Additionally, Muchos can be configured to provide High-Availability for HDFS & Accumulo components. By default, this feature is off, however it can be turned on by editing the following settings in muchos.props under the general
section as shown below:
hdfs_ha = True # default is False nameservice_id = muchoshacluster # Logical name for the cluster, no special characters
Before enabling HA, it is strongly recommended you read the Apache doc for HDFS HA & Accumulo HA
Also in the [nodes]
section of muchos.props ensure the journalnode
and zkfc
service are configured to run.
When hdfs_ha
is True
it also enables the ability to have HA resource managers for YARN. To utilize this feature, specify resourcemanager
for multiple leader nodes in the [nodes]
section.
If you launched your cluster, run the following command to terminate your cluster. WARNING - All data on your cluster will be lost:
./bin/muchos terminate
With the default configuration, clusters will not shutdown automatically after a delay and the default shutdown behavior will be stopping the node. If you would like your cluster to terminate after 8 hours, set the following configuration in muchos.props:
shutdown_delay_minutes = 480 shutdown_behavior = terminate
If you decide later to cancel the shutdown, run muchos cancel_shutdown
.
The config
command allows you to retrieve cluster configuration for your own scripts:
$ ./bin/muchos config -p leader.public.ip 10.10.10.10
We welcome contributions to the project. These notes should be helpful.
Muchos is powered by the following projects:
muchos launch
to start a cluster in AWS EC2.muchos setup
to install, configure, and start Fluo, Accumulo, Hadoop, etc on an existing EC2 or bare metal cluster.