Has anyone built a self-service platform, where t...
# platform-toolbox
d
Has anyone built a self-service platform where teams can deploy their infrastructure autonomously, e.g. on a cloud, with some limits defined on the scope of what they can do? I'm just curious how this looks technically. Are teams happy with it? Are teams doing it fully by themselves, or using e.g. Terraform prepared from a shared repo? How is the standard kept (or is it not 😛)?
r
Hi @Damian Keska, we have hundreds of platform engineering teams using Qovery to make their developers more autonomous in deploying environments and even infrastructure. Basically, engineering teams prepare what we call a blueprint environment (a template) composed of all the services their developers need. Developers can then clone this environment to create on-demand environments. The same goes for infrastructure with our Terraform provider: teams usually create a Terraform module on top of our Terraform provider, and developers can then bootstrap everything they need.
Note: In our case they usually put all their config files into the same repo, but it's possible to split them across multiple repos.
r
Depends on whether you expect self-service users to know and understand how to build, own, and operate infrastructure. If the answer is, “they’re supposed to because DevOps”, something like TF Cloud (with Sentinel) or Atlantis with org-level policies is nice from an IaC perspective. If you’re looking to provide self-service, infra as a service, that’s a little bit more involved, though TF Cloud/Enterprise is moving in this direction. A few ways my teams have done this in the past:
• Atlantis + TF modules + Backstage
  ◦ Create the templates for teams to use so it’s a simple form, they instantiate your TF modules, then Atlantis owns the execution
  ◦ Paired with org policies to prevent bad things, like non-golden images, untagged resources, bad ports, etc.
• API-Driven Infrastructure
  ◦ Endpoints to provision and manage the lifecycle of cloud resources
  ◦ Typically bundles the “hard” or often not-thought-of infra (like security groups and rules, which VPCs do I need to use, which IAM role, etc.)
  ◦ Can additionally tie in resources that TF doesn’t natively handle, like Cloud Foundry, internal APIs, etc.
  ◦ Can make it event-driven or part of an event-driven workflow, including change requests, automated or manual approvals, etc.
  ◦ Call via curl, a programming language, Postman, or a form in Backstage
  ◦ More development work is a con, and it’s sometimes hard to keep up
Lots of other options too, but I’ve seen these two be successful. I was going to give a poster presentation at PyCon 2020 about this, actually 🙂 It was canceled, but here’s the proposal I put together: https://github.com/rorynscott/pycon-proposal-2020/blob/master/proposal.md
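To make the "org policies to prevent bad things" idea a bit more concrete, here is a minimal Python sketch of the kind of check a self-service endpoint or pipeline step might run before provisioning. It is not Sentinel or Atlantis policy syntax; `InstanceRequest`, `policy_violations`, and all tag names, image names, and port numbers are hypothetical and only illustrate the shape of the check.

```python
from dataclasses import dataclass, field

# All names and policy values below are hypothetical, for illustration only.
REQUIRED_TAGS = {"team", "cost-center"}
GOLDEN_IMAGES = {"golden-ubuntu-22.04", "golden-al2023"}
FORBIDDEN_PORTS = {22, 3389}


@dataclass
class InstanceRequest:
    """A simplified self-service request for one compute instance."""
    image: str
    tags: dict = field(default_factory=dict)
    open_ports: list = field(default_factory=list)


def policy_violations(req: InstanceRequest) -> list:
    """Return human-readable policy violations; an empty list means OK."""
    violations = []
    # Non-golden images are rejected.
    if req.image not in GOLDEN_IMAGES:
        violations.append(f"non-golden image: {req.image}")
    # Untagged (or partially tagged) resources are rejected.
    missing = REQUIRED_TAGS - set(req.tags)
    if missing:
        violations.append(f"missing tags: {sorted(missing)}")
    # Requests that open forbidden ports are rejected.
    bad_ports = sorted(set(req.open_ports) & FORBIDDEN_PORTS)
    if bad_ports:
        violations.append(f"forbidden ports: {bad_ports}")
    return violations


if __name__ == "__main__":
    request = InstanceRequest(
        image="my-custom-image",
        tags={"team": "payments"},
        open_ports=[22, 443],
    )
    for violation in policy_violations(request):
        print("DENY:", violation)
```

A real setup would express this in whatever policy engine the pipeline already supports and run it against the plan output rather than a hand-built request object; the point here is only that the guardrails are plain, testable rules.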
h
At a previous company I created a GitLab pipeline that used Ansible to deploy infra. It worked fairly well.
p
@Damian Keska we’ve written about the IDP we’ve built (and are building), with lots of technical details, at https://engineering.getweave.com. There are more posts in progress.
m
I’m part of the team at Encore – it’s a backend framework that lets developers express high-level infrastructure requirements directly in code. It comes with a DevOps platform that automatically provisions dev/test/production environments in your cloud account. Because infrastructure is provisioned based on the application code’s needs, environments stay in sync, always. And the platform/devops team can control and configure the environments and infrastructure at will, without requiring developers to make code changes. Would love your feedback! ❤️
r
Hey @Marcus Kohlberg 👋 we (Qovery) are also from the Crane portfolio
m
Howdy! 👋
a
I have an opinion, having worked with Terraform, the Kubernetes reconciler pattern, and DAG workflows as an extension to an infrastructure control plane. IaC like Terraform can quickly become spaghetti code, and most solutions to that problem provide some sort of abstraction. IIRC Terraform Cloud provides RBAC with workspaces. Scaling is hard: as the resource graph grows, so does the waiting time for plans and applies. State can go out of sync because it has to be managed manually. Alternatively, you can have a custom controller for your infrastructure abstractions, either built from scratch or using an open-source solution like Crossplane. You can simply use Kubernetes RBAC here. Scalability is guaranteed by the Kubernetes API and controllers. The reconciler pattern manages state. I do actually plan to do a POC comparing extension via a Terraform provider versus a kube controller. Will post the results here.
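For anyone unfamiliar with the reconciler pattern mentioned above, here is a rough Python sketch of the idea: compare desired state with actual state and apply whatever steps close the gap. This is not Kubernetes or Crossplane code; `plan`, `reconcile`, and the in-memory dicts are hypothetical stand-ins for a real controller talking to a cloud API.

```python
def plan(desired: dict, actual: dict) -> list:
    """Diff desired vs. actual state into a list of (action, name) steps."""
    steps = []
    for name, spec in desired.items():
        if name not in actual:
            steps.append(("create", name))
        elif actual[name] != spec:
            steps.append(("update", name))
    for name in actual:
        if name not in desired:
            steps.append(("delete", name))
    return steps


def reconcile(desired: dict, actual: dict) -> list:
    """Run one reconcile pass, mutating `actual` in place.

    A real controller would call a cloud API here instead of editing a dict,
    and would be re-run on every change event or on a timer.
    """
    steps = plan(desired, actual)
    for action, name in steps:
        if action == "delete":
            del actual[name]
        else:
            # Create and update both converge the resource to the desired spec.
            actual[name] = dict(desired[name])
    return steps


if __name__ == "__main__":
    desired = {"db": {"size": "small"}, "cache": {"size": "small"}}
    actual = {"db": {"size": "large"}, "old-queue": {}}
    print(reconcile(desired, actual))  # first pass fixes the drift
    print(reconcile(desired, actual))  # second pass finds nothing to do
```

The contrast with the Terraform flow is that nobody runs a plan or apply by hand: the loop keeps running, so drift is corrected on the next pass instead of waiting for someone to notice that state is out of sync.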