# general
Ivo
Hi @Romaric Philogène, thanks for the interesting write-up! I've been aware of platform orchestrators for a while now, but the idea remains quite 'misty' to me. Maybe it's because it all stays rather abstract and I need a concrete example. A couple of questions:
1. Are there any open-source examples of platform orchestrators? I think you point to commercial ones?
2. Who is the intended audience for a platform orchestrator: platform developers, or developers in the business/stream-aligned teams?
3. Who builds the platform orchestrator, if not the team that is creating an IDP?
4. If you are creating an IDP tailored to your organisation, do you really need an orchestrator abstraction?
5. I understand the need for gluing tools together to provide a seamless experience for platform users, but isn't this just introducing a complex piece of software when you could also use something like Terraform directly? You might not get a fancy API, but you could still make it easy via HCL tfvars files that provide a simple interface to complex infra. You would not have to buy/develop an orchestrator in that case.
6. Can a platform orchestrator be generic, or is it always tailored to a particular environment?
Thanks for any answers/insights you could provide.
Romaric
Hi @Ivo, can you let me take your questions and create another article where I cover them? Those are great questions 👍
Ivo
Sure thing, I sort of expected it not to be one-liner answers 😄
Romaric
I like providing detailed answers and being able to share them with anyone (this Slack doesn't keep history 😄)
Abby
Thanks for sharing the post, Romaric! Your diagrams really help showcase the diversity of things an orchestrator needs to manage, though I would suggest hammering the idea home even further by putting edge computing and mainframes in as realistic additional items for many orgs! 😅 I am also curious about the focus on automation. IME platform orchestrators strive for automation, but as the CNCF maturity model shows, automation may not be necessary, available, or even desirable in all scenarios. So while "enforcing consistency" applies not only to infra but also to process, compliance, sign-off, etc., I have found a need to manage legacy company process as much as legacy infrastructure. Have you also seen this?
I also have a few ideas on your questions @Ivo that are maybe helpful:
1. Very few. I work for Syntasso and we are building Kratix.io, which is a FOSS orchestrator and actually one of the examples on the original Thoughtworks Tech Radar blip. In addition to our product, I think Crossplane often gets raised as an option here. I am a big fan of the reconcile loops with infra as code! My experience is that the ability to loop in older tech and processes (think `scp`-style commands or manual approvals) requires a fairly high level of skill to write your own controllers for each integration. Both Kratix and Crossplane have an enterprise option as well. For Kratix, even that is self-hosted, as what we provide is a composable framework rather than a single solution, so our customers (so far) prefer the security and flexibility of self-hosting (including in air-gapped environments!).
2. Platform teams. I will unequivocally say platform teams. Platform teams own the interfaces between capabilities and users. A request for something comes in; how does it get delivered? Platform orchestrator. Platform teams very often have a whole heck of a lot of other responsibilities, including capability implementation (think Terraform, Crossplane, Ansible, etc.) and interface creation (think Backstage, CLIs, etc.). But the orchestration is squarely in that platform bucket.
3. You use the term IDP here, and I could read that as portal or platform. In regards to `portal`, I think it is frequent that platforms will provide some number of interfaces, possibly including a portal, but part of the power of a platform orchestrator is that it can and should support any number of interfaces, since it should expose all capability via APIs. In regards to `platform`, I think that platform and platform orchestrator can and should be interchangeable. Except many IDPlatforms today can't support enough of a company's estate, so they try to divide them.
4. I am interested in your perspective on my thoughts on #3, as I think this is tightly coupled.
5. Giving any implementation straight to a user has two major impacts: 1) it requires them to research and deeply understand the implementation, not just the use of it; and (quite possibly more importantly) 2) it reduces the flexibility to change the implementation as needed. While Terraform is great, so was CFEngine before Puppet and Ansible took over. It is natural for technology to improve, and we want to be able to use the right tool for the job, which requires an interface between things. That being said, creating a thick layer that requires learning some crazy intense proprietary ideas is also hard. So it is about drawing that interface and contract in the right way, which, to be fair, is the tricky part!
6. I am not sure what you mean by environment. If you mean test env (e.g. dev, staging, prod), I think it should span all of them, though you may choose to de-risk through blast-radius instances, etc. If you mean environment as in company, then I think early-stage companies can outsource their processes and DevEx to a 3rd party, but IME a company of sufficient scale/complexity ends up needing to customise 3rd-party tools, and they need a way to do this better.
Arie
You're getting very close now to what I'm working on creating internally, @Romaric Philogène. If you remove the word "platform" for a moment and look at API orchestration vs. choreography: the left side of your drawing is basically almost any CI/CD platform, the right side is basically an API gateway, and the communication to the different components basically mimics the "Terraform provider" idea. We are basically reinventing the wheel.
Abby
Interesting, Arie! I do think this pattern is getting much more popular, so it's exciting to hear you're working that way too. I'm curious, are you managing it with tf providers, or was that just your analogy?
Arie
That's mostly an analogy. If anything I've learned from using Terraform and training people to use it, it's that I don't like the abstraction it provides, as it's sometimes used as an excuse.
Abby
Yea, I was curious whether that would work 😅 in particular because tf is (mainly) declarative code, and I find these systems require some imperative actions too that don't fit well in that paradigm.
Arie
I'd much rather abstract a direct call to the API layer of the cloud vendor/tool with just a parameters/config file.
For example, if I use TF and create a file with tfvars values that I then ingest in my Terraform code, I can use exactly the same tfvars file in a PowerShell script that uses, say, the AZ CLI to create the resources (yes, I have to deal with inter-dependency between resources on create and remove). The problem starts with us not owning what the provider is doing, or even its existence or support through its lifecycle.
You can always take the imperative out of tf and into your pipeline. It's what I usually do with config values that need to be set after provisioning and do not belong in a state file.
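A minimal sketch of that tfvars-as-interface idea; the variable names and values here are illustrative, not from the thread:
```hcl
# variables.tf (platform-owned): the contract devs never have to read past.
variable "app_name" {
  type        = string
  description = "Service name, used to tag every resource"
}

variable "instance_count" {
  type        = number
  default     = 2
  description = "How many instances to provision"
}
```
```hcl
# team-a.tfvars (dev-owned): the whole interface a team edits.
# A pipeline runs `terraform apply -var-file=team-a.tfvars`, and a
# PowerShell/AZ CLI script could parse the same file, as described above.
app_name       = "checkout-service"
instance_count = 3
```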
Abby
Do you keep your pipelines colocated with the tf? I worry about the sustainability of code changes that span many repos. For analogy's sake: having db migrations in the same repo as the software code helps make changes more atomic.
Arie
Usually no. I had dev teams in the past that wanted full ownership of the infra, and they got their TF code in the same repo to maintain, along with what you refer to as "version consistency". Quite a few devs don't want to mess with TF code or maintain it, and there it's simpler to have a config file: create your agreed-upon DSL and have a second pipeline (even from a second repo or a global repo) that runs the infra changes based on the config file. I treat the infra as if it were a package you have a dependency on, one you also declare to the build process. Think of it as an npm package called infra that your package.json pins to a version, which makes the "version consistency" less of a problem.
An exception would be your example of db migrations, which have to be tied to the same git commit/tag/build artifact.
For me, using TF means git is no longer your source of truth; the state file is. It can become your source of truth if you use GitOps and infra management is controlled via it.
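A sketch of that pinned-dependency analogy in Terraform terms; the module source URL and version ref are hypothetical:
```hcl
# main.tf: consuming platform infra like a pinned package, analogous to a
# package.json entry with a fixed version.
module "infra" {
  source = "git::https://git.example.com/platform/infra.git?ref=v1.4.2"

  # Inputs come from the team's agreed-upon config file (tfvars/JSON)
  app_name       = var.app_name
  instance_count = var.instance_count
}
```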
Abby
Yup, makes a lot of sense.
Then the challenge is upgrading the pipelines people use across the org. But I know with GitLab you can reference shared pipelines and things, so it's definitely doable.
Arie
All the current pipeline platforms let you create shared pipelines/templates to make them reusable. Some also offer a way to enforce steps/actions for governance, so devs don't decide on their own to comment out the package-scanning step to make it go "faster" or remove "blocks".
Abby
That's very cool. And definitely sounds aligned with the patterns we're building into Kratix based on our experiences. Glad to hear it's working for you!
Arie
Also @Romaric Philogène, don't stop at infra. On the right side of my architecture is every single tool you can think of: from pure SDLC tools to external systems like IAM, storage, and ITIL tools. Heck, even the UI element of some incident or service tools can have an API, allowing a team full control of a service, end to end.
Ivo
Very interesting points! These Slack threads are not ideal for such an extensive discussion (because they eventually get lost, like Romaric said), but it's what we have. @Abby Bangser, many thanks for your answers to my questions.
1. It's interesting that there are so few open-source options. Surely if this were such a glaring omission in the platform engineering landscape, more options would be available?
2. I agree that platform teams are responsible for providing interfaces to users, so that they can consume the resources the platform offers. I'm not entirely convinced yet that a full-blown orchestrator is needed.
3. I was indeed referring to an internal developer platform (IDP) team. What do you mean by interchangeable? I thought the orchestrator is an essential building block for building the platform, not a replacement for it?
4. I personally don't think you need an orchestrator. @Arie Heinrich makes some very interesting points regarding the use of Terraform. I'm not sure I totally understand what he is building, but I have also split the TF code (which is owned and maintained by the platform team) from the interface (tfvars files). What I'm saying is that you can provide a very simple interface to devs via tfvars files that they can modify (the HCL syntax is not complicated), which then trigger platform changes via CI/CD pipelines. Devs never have to look at the TF code. In my view TF is actually an orchestrator, as it is able to configure basically everything. Like Arie says, I'm pulling this all the way up to basically also have config-as-code, which lets me glue everything together for the devs.
5. As I mentioned in my previous point, you don't need/want to expose the TF code directly, merely the interface with which you call that TF code via automation. This allows the interface to evolve separately from the implementation, albeit within the TF/HCL restriction.
6. I meant a company environment indeed. If you don't have to cater for every use-case/tool, do you really need to develop/buy a complex piece of software to build your platform, given the above points?
I have to be succinct here; maybe it's better put in a blog. Hope it makes sense. Interesting discussion, thanks!
Arie
This isn't necessarily related to TF itself, but to the design of a Terraform provider. It is an abstraction layer on top of an API. And since everything today is an API, nothing stops you from creating a provider for every single tool available; heck, there is a provider for Minecraft and one for ordering pizza from Domino's. What you need is a way to aggregate all these "providers" into an API gateway that will take care of things like authentication, authorization, rate limiting, and routing to the correct API provider. Add to that documentation and discovery (basically a service catalog).
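For illustration, aggregating several provider "API abstractions" in one root module looks roughly like this (aws and github are real registry addresses; the same pattern would cover niche providers like the Minecraft one mentioned):
```hcl
# versions.tf: each provider wraps some tool's API; a root module can pull
# in as many as the platform needs to route requests to.
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
    github = {
      source  = "integrations/github"
      version = "~> 6.0"
    }
  }
}
```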
And you don't even need to provide the devs any tfvars files; when you think about it, it's just configuration values in some configuration database. Whether it lives in a real database or is saved as a JSON file is not relevant, as long as your pipeline engine knows how to read it and extract the values. The one thing you have to remember is that any abstraction you place on top of a tech means the user has no idea how it works, how to debug it, or how to fix it. If you have more than one tool across the company, it also makes it hard to pass knowledge between teams, and when team members move between teams, they don't retain the knowledge of how to do things. And lastly, when you recruit someone, it is unlikely they have used something similar, so expect learning curves.
Ivo
That's true; in this case tfvars are the most direct path, but it could be something else entirely. The devs don't seem to have a problem writing/adjusting the configs, as long as the docs are there.
Arie
As long as we don't make git a database. There are config values that are static and ones that are very dynamic: the separation between values needed for provisioning vs. configuration.
Abby
This is an awesome discussion! I think we are all circling around the same ideas and as in any tech, there are some different implementations.
IMHO, if you're creating generic value files that could feed into a tf provider or a pipeline or any other technology you might want to implement the back end with, then you have an API. As Arie said, CRUD activities for that config file could go through an API gateway so you get those features (security, scaling, observability). Then, again IMO, whatever is taking that config file and managing its execution, whether it be Terraform, Helm, a CI/CD pipeline, etc., is your orchestrator. The term is new; that doesn't mean the concept is. The idea of "orchestrating" where and how to fulfil a platform request is something we have always done. It's just that sometimes it's manual (via tickets etc.) or done via tools using different terms in their marketing (e.g. your use of tf).
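A sketch of that "generic value file feeding an orchestrator" idea, assuming a hypothetical request.json (Terraform consumes it here, but Helm or a pipeline could read the same file):
```hcl
# main.tf: Terraform acting as the engine that fulfils a platform request
# described in a tool-agnostic JSON file. File name and fields are made up.
locals {
  request = jsondecode(file("${path.module}/request.json"))
}

resource "aws_s3_bucket" "artifacts" {
  bucket = local.request.bucket_name

  tags = {
    team = local.request.team
  }
}
```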
Ivo
Agreed! I guess the confusion for me started when it was positioned (and marketed) separately, as some sort of (necessary) new component for building platforms. Different ways to go about it indeed, all depending on context, as always.
Abby
Nahhhh it is always about needing the capabilities, not any one specific tool.
In particular, when you start talking to teams that are a bit more advanced in their processes, it can get super confusing, because they have already generated the kinds of outcomes that these tools (including mine) are offering to help people with. And in time, these tools will mature to the point where you see them as a commodity and want to offload some of your custom internal glue. But until you see a 10x improvement, it is all just marketing noise and confusion. Many (MANY) teams aren't yet decoupling the vars from the code and aren't yet able to manage the resulting infra as a fleet, due to how the creation process spawns outputs. In those cases, a new tool specialised in those capabilities can help.