Apache Airavata is a software framework for executing and managing computational jobs on distributed computing resources, including local clusters, supercomputers, national grids, and academic and commercial clouds. Airavata builds on general concepts of service-oriented computing, distributed messaging, and workflow composition and orchestration. Airavata bundles a server package with an API, client Software Development Kits (SDKs), and a general-purpose reference UI implementation.
Airavata offers researchers several ways to streamline their computational workflows.
Apache Airavata is composed of modular components spanning core services, data management, user interfaces, and developer tooling.
- `airavata` – Main resource management and task orchestration middleware
- `airavata-custos` – Identity and access management framework
- `airavata-mft` – Managed file transfer services
- `airavata-portals` – All frontends for Airavata
- `airavata-data-lake` – Data lake and storage backend
- `airavata-data-catalog` – Metadata and search services
- `airavata-docs` – Developer documentation
- `airavata-user-docs` – End-user guides
- `airavata-admin-user-docs` – Admin-focused documentation
- `airavata-custos-docs` – Custos documentation
- `airavata-site` – Project website
- `airavata-sandbox` – Prototypes and early-stage work
- `airavata-labs` – Experimental projects
- `airavata-jupyter-kernel` – Jupyter integration
- `airavata-cerebrum` – Airavata for neuroscience

Airavata is composed of four top-level services that work together to facilitate the full lifecycle of computational jobs.
### Airavata API Server (`apache-airavata-api-server`)
The Airavata API Server bootstraps the services needed to run/monitor computational jobs, access/share results of computational runs, and manage fine-grained access to computational resources.
#### Orchestrator

Class Name: `org.apache.airavata.server.ServerMain`
Command: `bin/orchestrator.sh`

The Orchestrator spins up seven servers (each of type `org.apache.airavata.common.utils.IServer`) for external clients to run computational jobs from:

- `org.apache.airavata.api.server.AiravataAPIServer`
- `org.apache.airavata.db.event.manager.DBEventManagerRunner`
- `org.apache.airavata.registry.api.service.RegistryAPIServer`
- `org.apache.airavata.credential.store.server.CredentialStoreServer`
- `org.apache.airavata.sharing.registry.server.SharingRegistryServer`
- `org.apache.airavata.orchestrator.server.OrchestratorServer`
- `org.apache.airavata.service.profile.server.ProfileServiceServer`

#### Controller

Class Name: `org.apache.airavata.helix.impl.controller.HelixController`
Command: `bin/controller.sh`
The Controller manages the step-by-step transition of task state on the Helix side. It uses Apache Helix to track step start, completion, and failure, ensuring that the next step starts upon successful completion and that the current step is retried on failure.
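The retry-on-failure behavior can be pictured with a minimal sketch. This is a toy model only, not Airavata's actual Helix-based implementation; the function names and retry limit are illustrative assumptions:

```python
# Simplified model of step transitions (NOT the actual Helix integration):
# each step either completes, is retried on failure, or stops the workflow
# after exhausting its retries.

def run_steps(steps, max_retries=3):
    """Run callables in order; retry a failed step up to max_retries times."""
    log = []
    for step in steps:
        for attempt in range(1, max_retries + 1):
            log.append((step.__name__, "START", attempt))
            try:
                step()
                log.append((step.__name__, "COMPLETED", attempt))
                break  # success: move on to the next step
            except Exception:
                log.append((step.__name__, "FAILED", attempt))
        else:
            return log  # retries exhausted: stop the workflow
    return log

# Example: the second step fails once, then succeeds on retry.
attempts = {"n": 0}

def env_setup():
    pass

def input_staging():
    attempts["n"] += 1
    if attempts["n"] == 1:
        raise RuntimeError("transient staging error")

trace = run_steps([env_setup, input_staging])
```

The `trace` list records each step's start, failure, and eventual completion, mirroring the start/completion/failure paths the Controller tracks through Helix.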
#### Participant

Class Name: `org.apache.airavata.helix.impl.participant.GlobalParticipant`
Command: `bin/participant.sh`

The Participant synchronizes the helix-side state transition of a task with its concrete execution on the airavata side. The currently registered steps are:

- `EnvSetupTask`
- `InputDataStagingTask`
- `OutputDataStagingTask`
- `JobVerificationTask`
- `CompletingTask`
- `ForkJobSubmissionTask`
- `DefaultJobSubmissionTask`
- `LocalJobSubmissionTask`
- `ArchiveTask`
- `WorkflowCancellationTask`
- `RemoteJobCancellationTask`
- `CancelCompletingTask`
- `DataParsingTask`
- `ParsingTriggeringTask`
- `MockTask`
#### Email Monitor

Class Name: `org.apache.airavata.monitor.email.EmailBasedMonitor`
Command: `bin/email-monitor.sh`
The email monitor periodically checks an email inbox for job status updates sent via email. When it reads a new email containing a job status update, it relays that state change to the internal MQ (as a KafkaProducer).
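The parse-and-relay step can be sketched as follows. This is an illustrative model only: the real service is written in Java against an actual inbox and a KafkaProducer, and the subject-line format below is a made-up assumption:

```python
# Toy model of the email monitor's relay step. The subject format and
# field names are hypothetical; a Python list stands in for the MQ.
import re

STATUS_PATTERN = re.compile(r"Job (?P<job_id>\S+) is (?P<status>\w+)")

def parse_status_email(subject):
    """Extract (job_id, status) from a status-notification subject line."""
    m = STATUS_PATTERN.search(subject)
    return (m.group("job_id"), m.group("status")) if m else None

def relay(subject, producer):
    """Relay a parsed state change to an MQ producer (here, a plain list)."""
    update = parse_status_email(subject)
    if update is not None:
        producer.append({"job_id": update[0], "status": update[1]})

mq = []
relay("Job 12345.pbs01 is COMPLETED", mq)
relay("Weekly maintenance notice", mq)  # ignored: not a status update
```

Only recognizable status updates are relayed; unrelated mail is skipped, which is why the monitor can safely poll a shared notification inbox.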
#### Realtime Monitor

Class Name: `org.apache.airavata.monitor.realtime.RealtimeMonitor`
Command: `bin/realtime-monitor.sh`
When a task completes at the compute resource, the realtime monitor is notified. It listens for incoming state-change messages (as a KafkaConsumer) and relays each state change to the internal MQ (as a KafkaProducer).
#### Pre-Workflow Manager

Class Name: `org.apache.airavata.helix.impl.workflow.PreWorkflowManager`
Command: `bin/pre-wm.sh`
The pre-workflow manager listens on the internal MQ (as a KafkaConsumer) for inbound tasks in the pre-execution phase. When a task DAG is received, it handles the environment setup and data staging phases of the DAG in a robust manner, including fault handling. All of this happens BEFORE the task DAG is submitted to the Controller, and subsequently to the Participant.
#### Post-Workflow Manager

Class Name: `org.apache.airavata.helix.impl.workflow.PostWorkflowManager`
Command: `bin/post-wm.sh`
Once the main task completes, the realtime monitor announces this, which triggers the post-workflow phase. The post-workflow manager listens on the internal MQ (as a KafkaConsumer) for inbound tasks in the post-execution phase. Once a task is received, it handles the cleanup and output-fetching phases of the task DAG in a robust manner, including fault handling, and submits the resulting state change to the Controller.
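The monitor-to-post-workflow handoff described above can be modeled with in-memory queues. This is a conceptual sketch only: the real services communicate over Kafka, and the message fields and output naming here are invented for illustration:

```python
# Toy model of the post-execution message flow: a monitor publishes a
# job-completion event, the post-workflow manager consumes it, fetches
# outputs, and submits a state change to the controller.
from queue import Queue

status_topic = Queue()  # stands in for the Kafka topic
controller_inbox = []   # stands in for the controller submission

def monitor_announce(job_id):
    """The realtime monitor publishes a completion event."""
    status_topic.put({"job_id": job_id, "status": "COMPLETED"})

def post_workflow_manager():
    """Consume one event, do post-execution work, notify the controller."""
    event = status_topic.get()
    outputs = f"outputs-of-{event['job_id']}"  # placeholder for output fetching
    controller_inbox.append({"job_id": event["job_id"],
                             "phase": "POST",
                             "outputs": outputs})

monitor_announce("job-42")
post_workflow_manager()
```

The decoupling through the queue is the point: the monitor never calls the post-workflow manager directly, so either side can fail and retry independently.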
### Airavata File Server (`apache-airavata-file-server`)

Class Name: `org.apache.airavata.file.server.FileServerApplication`
Command: `bin/…`
The Airavata File Server is a lightweight SFTP wrapper running on storage nodes integrated with Airavata. It lets users securely access storage via SFTP, using Airavata authentication tokens as ephemeral passwords.
### Airavata Agent Service (`apache-airavata-agent-service`) [NEW]

Class Name: `org.apache.airavata.agent.connection.service.AgentServiceApplication`
The Airavata Agent Service is the backend for launching interactive jobs using Airavata. It provides constructs to launch a custom “Agent” on a compute resource that connects back to the Agent Service through a bi-directional gRPC channel. The Airavata Python SDK primarily uses the Agent Service (gRPC) and the Airavata API (Thrift) to submit and execute interactive jobs, spawn subprocesses, and create network tunnels to those subprocesses, even when they are behind NAT.
### Airavata Research Service (`apache-airavata-research-service`) [NEW]

Class Name: `org.apache.airavata.research.service.ResearchServiceApplication`
The Airavata Research Service is the backend for the research catalog in Airavata. It provides the API to add, list, modify, and publish notebooks, repositories, datasets, and computational models in Cybershuttle, and to launch interactive remote sessions that utilize them in a research setting.
Before setting up Apache Airavata, ensure that you have:
| Requirement | Version | Check Using |
|---|---|---|
| Java SDK | 17+ | `java --version` |
| Apache Maven | 3.8+ | `mvn -v` |
| Git | Latest | `git -v` |
First, clone the project repository from GitHub.
```sh
git clone https://github.com/apache/airavata.git
cd airavata
```
Next, build the project using Maven.
```sh
# with tests (slower, but safer)
mvn clean install

# OR without tests (faster)
mvn clean install -DskipTests
```
Once the project is built, four `tar.gz` bundles will be generated in the `./distribution` folder.
```
├── apache-airavata-agent-service-0.21-SNAPSHOT.tar.gz
├── apache-airavata-api-server-0.21-SNAPSHOT.tar.gz
├── apache-airavata-file-server-0.21-SNAPSHOT.tar.gz
└── apache-airavata-research-service-0.21-SNAPSHOT.tar.gz

1 directory, 4 files
```
Next, copy the deployment scripts and configurations into the `./distribution` folder.
```sh
cp -r dev-tools/deployment-scripts/ distribution
cp -r vault/ distribution/vault
tree ./distribution
```

```
distribution
├── apache-airavata-agent-service-0.21-SNAPSHOT.tar.gz
├── apache-airavata-api-server-0.21-SNAPSHOT.tar.gz
├── apache-airavata-file-server-0.21-SNAPSHOT.tar.gz
├── apache-airavata-research-service-0.21-SNAPSHOT.tar.gz
├── distribution_backup.sh
├── distribution_update.sh
├── services_down.sh
├── services_up.sh
└── vault
    ├── airavata_sym.jks
    ├── airavata-server.properties
    ├── airavata.jks
    ├── application-agent-service.yml
    ├── application-research-service.yml
    ├── client_truststore.jks
    ├── email-config.yaml
    └── log4j2.xml

2 directories, 16 files
```
What's in the vault?

- `airavata_sym.jks`, `airavata.jks` – contain the keys used to secure SSH credentials, etc.
- `client_truststore.jks` – contains the certificates (e.g., certbot `fullchain.pem`) used to secure network connections (TLS).
- `email-config.yaml` – contains the email addresses observed by the email monitor.
- `airavata-server.properties` – config file for the Airavata API server.
- `application-agent-service.yml` – config file for the Airavata agent service.
- `application-research-service.yml` – config file for the Airavata research service.
- `log4j2.xml` – contains the Log4j configuration for all Airavata services.

Next, start the services using the deployment scripts.
```sh
cd distribution
./distribution_update.sh
./services_up.sh
```
Voilà, you are now running Airavata! You can tail the server logs using `multitail` (all logs) or `tail` (specific logs).
```sh
multitail apache-airavata-*/logs/*.log
```
⚠️ Note: Docker deployment is experimental and not recommended for production use.
Before setting up Apache Airavata, ensure that you have:
| Requirement | Version | Check Using |
|---|---|---|
| Java SDK | 17+ | `java --version` |
| Apache Maven | 3.8+ | `mvn -v` |
| Git | Latest | `git -v` |
| Docker Engine | 20.10+ | `docker -v` |
| Docker Compose | 2.0+ | `docker compose version` |
In your `/etc/hosts`, point `airavata.host` to `127.0.0.1`:

```
127.0.0.1 airavata.host
```
First, clone the project repository from GitHub.
```sh
git clone https://github.com/apache/airavata.git
cd airavata
```
Next, build the project distribution using Maven.
```sh
# with tests (slower, but safer)
mvn clean install

# OR without tests (faster)
mvn clean install -DskipTests
```
Next, build the containers and start them through compose.
```sh
# build the containers
mvn docker:build -pl modules/distribution

# start containers via compose
docker-compose \
  -f modules/distribution/src/main/docker/docker-compose.yml \
  up -d

# check whether services are running
docker-compose ps
```
Service Endpoints:

- `airavata.host:8960`
- `airavata.host:8962`
- `airavata.host:8443`
Stop Services:
```sh
docker-compose \
  -f modules/ide-integration/src/main/containers/docker-compose.yml \
  -f modules/distribution/src/main/docker/docker-compose.yml \
  down
```
We welcome contributions from the community! Here's how you can help:
Learn More:
The easiest way to set up a development environment is to follow the instructions in the ide-integration README. Those instructions will guide you through setting up a development environment with IntelliJ IDEA.
- `org.apache.airavata.sharing.registry.migrator.airavata.AiravataDataMigrator`
- `modules/deployment-scripts`
- `modules/load-client`
Get Help:
Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements. See the NOTICE file distributed with this work for additional information regarding copyright ownership. The ASF licenses this file to you under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at: http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
See the LICENSE file for complete license details.