Platform Engineering #platform-toolbox

# platform-toolbox

Slackbot

06/03/2023, 8:04 AM

This message was deleted.

Heiðar Eldberg Eiríksson

06/03/2023, 9:15 AM

Although I highly recommend setting up an EKS cluster and running GitLab there, I can understand that K8s is an investment to get going. I’ve set up what I think you’re after using these instructions before, and it worked pretty well; it even supports spot instances for runners. https://docs.gitlab.com/runner/configuration/runner_autoscale_aws/ Take a good look at the different parameters you can configure under the MachineOptions block of your runner manager’s [runners.machine] config. Depending on the load pattern of your org/project(s), I also recommend “warming up” the autoscaling IdleCount before the jobs start. Our cluster went completely cold during the night, so the first jobs of the day took a while to start while the runner manager brought up new machines.
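A minimal sketch of what such a runner manager `config.toml` might look like, assuming the `docker+machine` executor with the `amazonec2` driver described in the linked guide. The runner name, URL, token, region, network IDs, instance type, spot price, and schedule are all placeholders rather than values from this thread, and credentials are assumed to come from an instance profile or environment variables.

```toml
# Sketch only: every value below is a placeholder.
concurrent = 20                        # max jobs across all runners on this manager

[[runners]]
  name = "aws-autoscale-runner"        # hypothetical name
  url = "https://gitlab.example.com"   # placeholder URL
  token = "RUNNER_TOKEN"               # placeholder token
  executor = "docker+machine"
  limit = 20                           # max concurrent jobs for this runner (one machine per job)

  [runners.docker]
    image = "ubuntu:20.04"

  [runners.machine]
    IdleCount = 0                      # outside the period below, keep no machines warm
    IdleTime = 600                     # seconds before an idle machine is removed
    MachineDriver = "amazonec2"
    MachineName = "gitlab-docker-machine-%s"
    MachineOptions = [
      "amazonec2-region=eu-west-1",
      "amazonec2-vpc-id=vpc-xxxxx",
      "amazonec2-subnet-id=subnet-xxxxx",
      "amazonec2-instance-type=t3.medium",
      "amazonec2-request-spot-instance=true",
      "amazonec2-spot-price=0.03",
    ]

    # Warm-up window: keep idle machines ready shortly before and during the working day.
    [[runners.machine.autoscaling]]
      Periods = ["* * 7-18 * * mon-fri *"]
      Timezone = "UTC"
      IdleCount = 5
      IdleTime = 1800
```

Starting the period an hour or so before the first pipelines usually run is one way to avoid the cold start described above, at the cost of paying for idle machines during that window.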

אפי שטיין

06/03/2023, 2:16 PM

The problem with this solution is that each machine can run only one job. If the startup time of a machine were quick, that would not be a problem, but it takes ages (minutes) to start a machine. So it is not a viable solution for jobs that take seconds to complete. Nor would I want to spawn 100 VMs to run 100 jobs that could easily run on a few machines.

אפי שטיין

06/03/2023, 2:17 PM

What I’m looking for is machine autoscaling plus multiple concurrent jobs on each machine.

Hugo Pinheiro

06/03/2023, 6:24 PM

We run our GitLab runners in our Kubernetes clusters. We set it up so they only run on specific nodes, and it has been pretty bulletproof so far; same with the autoscaling.
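One way to pin job pods to specific nodes with the Kubernetes executor is a node selector in the runner’s `config.toml`. This is only a sketch of that approach, not necessarily the setup described above; the runner name, URL, token, namespace, and node label are hypothetical.

```toml
# Sketch only: names, namespace, and label are placeholders.
[[runners]]
  name = "k8s-runner"                  # hypothetical name
  url = "https://gitlab.example.com"   # placeholder URL
  token = "RUNNER_TOKEN"               # placeholder token
  executor = "kubernetes"

  [runners.kubernetes]
    namespace = "gitlab-runner"        # hypothetical namespace
    image = "ubuntu:20.04"

    # Schedule job pods only on nodes that carry this label.
    [runners.kubernetes.node_selector]
      "workload" = "ci"                # hypothetical node label
```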

אפי שטיין

06/03/2023, 7:14 PM

I don’t have a K8s cluster, and setting one up just for GitLab Runner is not an option.

Heiðar Eldberg Eiríksson

06/03/2023, 11:19 PM

Hm… I was sure it was possible when I did this, but looking at it now, it seems you are spot on. Looks like your best bet is to use an instance size appropriate for your runner requirements and just run lots of them. Cost-wise that is going to be about the same (it could actually be cheaper if you can optimise it well). E.g. if you need, say, 512 MB for a job, a t3.nano or t3.micro is going to fit the bill, and you pay for exactly what you use. If requirements vary between jobs, you should be able to use a couple of different instance types with different runner tags etc. to orchestrate this quite efficiently.
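As a rough sketch of the “different instance types with different runner tags” idea, assuming the same `docker+machine` setup: two runner entries on one manager, each creating a different instance size. Names, URL, tokens, and instance types are placeholders. Tags are not set in this file; they are assigned when each runner is registered or created in GitLab, and jobs then pick a runner through their `tags:` keyword.

```toml
# Sketch only: two runners on one manager, differing in instance size.
concurrent = 40

[[runners]]
  name = "small-jobs"                  # hypothetical; give this runner e.g. a "small" tag
  url = "https://gitlab.example.com"   # placeholder URL
  token = "SMALL_RUNNER_TOKEN"         # placeholder token
  executor = "docker+machine"
  [runners.docker]
    image = "ubuntu:20.04"
  [runners.machine]
    MachineDriver = "amazonec2"
    MachineName = "small-%s"
    MachineOptions = ["amazonec2-instance-type=t3.micro"]

[[runners]]
  name = "large-jobs"                  # hypothetical; give this runner e.g. a "large" tag
  url = "https://gitlab.example.com"   # placeholder URL
  token = "LARGE_RUNNER_TOKEN"         # placeholder token
  executor = "docker+machine"
  [runners.docker]
    image = "ubuntu:20.04"
  [runners.machine]
    MachineDriver = "amazonec2"
    MachineName = "large-%s"
    MachineOptions = ["amazonec2-instance-type=t3.large"]
```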

Heiðar Eldberg Eiríksson

06/03/2023, 11:23 PM

I realise I’m missing your main point about spin-up time… yeah, that is a pain. Sorry I couldn’t be of more help. Best of luck to you.

אפי שטיין

06/04/2023, 4:57 AM

Thanks for trying

Eric Irwin

06/05/2023, 2:52 PM

I know you mentioned you’re not using K8s, and I agree you shouldn’t pick it up just for this, but in case anybody else comes across this: our teams built a cluster autoscaler that reduced some of the pains around node autoscaling caused by large spikes. It is similar to KEDA, but it predates it a bit and is used a bit differently. It was a fun project, and the same autoscaler is now used for some of our other critical infrastructure. https://medium.com/@eric.irwin/custom-autoscaling-for-gitlab-kubernetes-executors-cfbb90ec6094

אפי שטיין

06/05/2023, 2:54 PM

On the subject of GitLab Runner with the K8s executor (this is for a different client of mine): they currently have a bug that causes the prepare environment step to take roughly 40 seconds, so every job takes 40 seconds to start. Does anybody else have this problem, or know how to solve it?

Hugo Pinheiro

06/05/2023, 3:00 PM

The only thing I changed for ours was to use the Ubuntu container image; the Alpine one was causing issues all the time.

אפי שטיין

06/05/2023, 3:01 PM

We see the same issue with all images. The default is Ubuntu 20.04, but most jobs override it.
