Windows virtual machines have the following specifications:
We initially considered implementing the Windows runners using K8s; however, this was not optimal for the following reasons:
To monitor the status of the self-hosted runners, we implemented a separate GitHub Actions workflow that runs on GitHub-hosted runners. This workflow periodically calls a Cloud Function that reports the number of active and offline runners. In case of failure, the workflow sends an email alert to the dev distribution email, dev@beam.apache.org.
The Cloud Function uses the endpoints provided by the installed GitHub App to retrieve information about the runners.
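As a rough illustration of the kind of data such a function could serve, the sketch below counts runners by status using GitHub's "list self-hosted runners for an organization" REST endpoint. It is not the project's actual Cloud Function: the names `fetch_runners` and `summarize_runners` are hypothetical, the token is assumed to have been obtained elsewhere (for example from the GitHub App installation), and GitHub's `online` status is assumed to correspond to what this document calls "active". Pagination is omitted for brevity.

```python
import json
import urllib.request
from collections import Counter

GITHUB_API = "https://api.github.com"


def fetch_runners(org: str, token: str) -> list:
    """Fetch the first page (up to 100) of self-hosted runners for an org."""
    request = urllib.request.Request(
        f"{GITHUB_API}/orgs/{org}/actions/runners?per_page=100",
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        },
    )
    with urllib.request.urlopen(request) as response:
        # The response body is an object with "total_count" and "runners".
        return json.load(response)["runners"]


def summarize_runners(runners: list) -> dict:
    """Count runners by the "status" field GitHub reports for each one."""
    counts = Counter(runner["status"] for runner in runners)
    return {
        "online": counts.get("online", 0),
        "offline": counts.get("offline", 0),
    }
```

A monitoring workflow could then compare the returned counts against whatever threshold it considers healthy and fail, triggering the alert, when the check does not pass.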
Depending on the termination event, the removal script for offline runners is sometimes not triggered correctly from inside the VM or K8s pod. For that reason, an additional pipeline was created to clean up the list of GitHub runners in the group.
This was implemented using a GCP Cloud Function [code] subscribed to a Pub/Sub topic. A Cloud Scheduler job that runs once per day publishes to the topic. The function calls the GitHub API to delete offline self-hosted runners from the organization, using its service account to retrieve the token from Secret Manager.
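The core of the cleanup step might look like the following minimal sketch. It is an assumption-laden illustration rather than the deployed function: `offline_runner_ids`, `delete_runner`, and `cleanup_offline_runners` are hypothetical names, the Pub/Sub entry point and the Secret Manager lookup are left out, and the runner list is assumed to come from GitHub's "list self-hosted runners for an organization" endpoint. Deletion uses the documented `DELETE /orgs/{org}/actions/runners/{runner_id}` endpoint.

```python
import urllib.request

GITHUB_API = "https://api.github.com"


def offline_runner_ids(runners: list) -> list:
    """Return the IDs of runners whose reported status is "offline"."""
    return [r["id"] for r in runners if r.get("status") == "offline"]


def delete_runner(org: str, token: str, runner_id: int) -> int:
    """Remove one self-hosted runner from the org and return the HTTP status."""
    request = urllib.request.Request(
        f"{GITHUB_API}/orgs/{org}/actions/runners/{runner_id}",
        method="DELETE",
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        },
    )
    with urllib.request.urlopen(request) as response:
        return response.status  # GitHub answers 204 No Content on success


def cleanup_offline_runners(org, token, runners, delete=delete_runner) -> list:
    """Delete every offline runner and return the IDs that were removed.

    The `delete` parameter is injectable so the logic can be exercised
    without calling the real GitHub API.
    """
    removed = []
    for runner_id in offline_runner_ids(runners):
        delete(org, token, runner_id)
        removed.append(runner_id)
    return removed
```

Passing the delete function as a parameter keeps the selection logic testable in isolation; in a real deployment the default would issue the HTTP requests.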