Posted to issues@beam.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2022/03/03 17:40:00 UTC

[jira] [Work logged] (BEAM-12812) Run Github Actions on GCP workers in the apache-beam-testing project

     [ https://issues.apache.org/jira/browse/BEAM-12812?focusedWorklogId=736177&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-736177 ]

ASF GitHub Bot logged work on BEAM-12812:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 03/Mar/22 17:39
            Start Date: 03/Mar/22 17:39
    Worklog Time Spent: 10m 
      Work Description: dannymartinm commented on a change in pull request #16511:
URL: https://github.com/apache/beam/pull/16511#discussion_r818902572



##########
File path: .github/gh-actions-self-hosted-runners/README.md
##########
@@ -0,0 +1,80 @@
+<!--
+    Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+      http://www.apache.org/licenses/LICENSE-2.0
+
+    Unless required by applicable law or agreed to in writing, software
+    distributed under the License is distributed on an "AS IS" BASIS,
+    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+    See the License for the specific language governing permissions and
+    limitations under the License.
+-->
+# GitHub Actions - Self-hosted Runners
+The current GitHub Actions workflows are tested on multiple operating systems: Ubuntu, Windows, and macOS. To migrate these runners from GitHub to GCP, we are implementing self-hosted runners, starting with the Ubuntu and Windows environments, which run on Google Kubernetes Engine and Google Compute Engine VM instances respectively.
+
+In addition, we are researching the best way to implement the macOS self-hosted runners.
+
+## Ubuntu
+Ubuntu self-hosted runners are implemented using Google Kubernetes Engine with the following specifications:
+
+#### Node
+* Machine Type: 2-custom-6-18432
+* Disk Size: 100 GB
+* CPU: 6 vCPUs
+* Memory: 18 GB
+
+#### Pod
+* Image: $LOCAL_IMAGE_NAME LOCATION-docker.pkg.dev/PROJECT-ID/REPOSITORY/IMAGE:latest
+* CPU: 2
+* Memory: 1028 Mi
+* Volumes: docker.sock
+* Secret env variables: Kubernetes Secrets
+
+#### AutoScaling
+* Horizontal Pod Autoscaling
+  * 5-10 nodes
+  * HorizontalPodAutoscaler
+    * Min replicas: 10
+    * Max replicas: 20
+    * CPU utilization: 70%
+* Vertical Pod Autoscaling: updateMode: "Auto"
+
+
+## Windows
+Windows virtual machines have the following specifications:
+
+#### VM specifications
+* Machine Type: n2-standard-2
+* Disk Size: 70 GB
+* CPU: 2 vCPUs
+* Memory: 8 GB
+
+#### Instance group settings
+* Region: us-west1 (multizone)
+* Scale-out metric: 70% of CPU Usage.
+* Cooldown period: 300s
+
+#### Notes:
+We initially considered implementing the Windows runners using K8s; however, this was not optimal for the following reasons:
+
+* VS Build Tools are required for certain workflows. Unfortunately, the official images that support this dependency are very large, easily reaching 20 GB, which is not ideal for K8s management.
+* Windows Subsystem for Linux (WSL) allows bash scripts to be executed inside Windows, which removes tech debt by avoiding the need to write steps in PowerShell. However, this feature is disabled, with its payload removed, in Windows containers.
+
+
+## Self-Hosted Runners Architecture
+![Diagram](diagrams/self-hosted-runners-architecture.png)
+
+## Cronjob - Delete Unused Self-hosted Runners
+
+Depending on the termination event, the removal script for offline runners is sometimes not triggered correctly from inside the VMs or K8s pods. For that reason, an additional pipeline was created to clean up the list of GitHub runners in the group.
+
+This was implemented using a Cloud Function subscribed to a Pub/Sub topic. The topic is triggered by a Cloud Scheduler job that runs once per day. The function calls the GitHub API to delete offline self-hosted runners from the organization, using its service account to retrieve the token from Secret Manager.
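
The cleanup step described at the end of the README could look roughly like the sketch below. It is not the code from the pull request: the function names, the `opener` parameter, and the `example-org` value are hypothetical, and the Pub/Sub entry point and the Secret Manager lookup are omitted. It assumes the GitHub REST endpoints for organization-level self-hosted runners (`GET` and `DELETE` under `/orgs/{org}/actions/runners`) and reads only the first page of results.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com"


def offline_runner_ids(payload):
    """Return the ids of runners whose status is "offline".

    `payload` is the decoded JSON body of the list-runners response,
    which is assumed to carry a "runners" list of objects with
    "id" and "status" fields.
    """
    return [
        runner["id"]
        for runner in payload.get("runners", [])
        if runner.get("status") == "offline"
    ]


def delete_offline_runners(org, token, opener=urllib.request.urlopen):
    """Delete every offline self-hosted runner registered to `org`.

    `token` is assumed to have been fetched already (for example from
    a secret store). `opener` is a hypothetical injection point so the
    HTTP layer can be replaced in tests. Only the first page of up to
    100 runners is inspected; pagination is left out of this sketch.
    """
    headers = {
        "Authorization": f"Bearer {token}",
        "Accept": "application/vnd.github+json",
    }
    list_url = f"{API_ROOT}/orgs/{org}/actions/runners?per_page=100"
    with opener(urllib.request.Request(list_url, headers=headers)) as resp:
        payload = json.load(resp)

    removed = []
    for runner_id in offline_runner_ids(payload):
        delete_url = f"{API_ROOT}/orgs/{org}/actions/runners/{runner_id}"
        request = urllib.request.Request(
            delete_url, headers=headers, method="DELETE"
        )
        with opener(request):
            pass  # a successful delete returns no body
        removed.append(runner_id)
    return removed
```

Keeping the filtering in `offline_runner_ids` separate from the HTTP calls makes the selection rule easy to check without network access; a scheduled function would call `delete_offline_runners("example-org", token)` once per run.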

Review comment:
       Hi Robert,
   
   Thanks for your comments and review. We have added links to some of the GCP resources; the rest will be added when the final resources are in place.
   
   Thanks!






Issue Time Tracking
-------------------

    Worklog Id:     (was: 736177)
    Time Spent: 5.5h  (was: 5h 20m)

> Run Github Actions on GCP workers in the apache-beam-testing project
> --------------------------------------------------------------------
>
>                 Key: BEAM-12812
>                 URL: https://issues.apache.org/jira/browse/BEAM-12812
>             Project: Beam
>          Issue Type: Task
>          Components: build-system
>            Reporter: Kiley Sok
>            Assignee: Daniela Martín
>            Priority: P2
>          Time Spent: 5.5h
>  Remaining Estimate: 0h
>
> Objective: migrate the runners of the GitHub CI/CD pipeline (GitHub Actions or GAs) over to the Apache Beam GCP infrastructure; GKE in this case.


