# gitops
j
Hi all! Looking for inspiration / ideas! I'm designing a k8s IDP, likely with ArgoCD. Assuming we have 3 environments (`dev`, `qa`, `prod`), how have you managed to stick to GitOps principles whilst deploying to different environments through an MR/PR? For example...

This bit is clear:
• Branch created = ephemeral namespace for deployment in `dev`
• MR/PR created = "deploy" to the service's main `dev` namespace (in `dev`)
• MR/PR merged to default branch = "deploy" to the service's main `live` namespace (in `live`)

This bit is unclear:
• Once deployed to `dev` and all CI checks have passed etc., how do you manage the promotion process from `dev` to `qa` (and even a `preProd` if that was an option)?

Thanks in advance! (plus any other ideas / methods welcome)
k
We currently use a “branching” strategy. Once a component passes CI, it merges a change to another repo where we hold an Argo app of apps. The app of apps has an app representing each environment, each with its `target` set to a corresponding repo branch which represents that environment. The mainline of the repo represents the next deployment candidate. Promoting a candidate to staging means creating a PR from our mainline to the staging branch. When that’s merged, the Argo app watching that branch sees the change and syncs. This process repeats for production.
Thus if I want to know what’s in production by looking at git, I look at the production app in the production branch.
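To make the mechanism above concrete, a minimal sketch of one of the per-environment Argo CD `Application` objects might look like this (repo URL, names, and namespaces here are hypothetical, not the poster's actual setup):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: staging            # one app per environment, parented by the app of apps
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://example.com/org/deployments.git  # hypothetical gitops repo
    targetRevision: staging  # the environment branch this app watches
    path: .
  destination:
    server: https://kubernetes.default.svc
    namespace: staging
  syncPolicy:
    automated: {}            # sync as soon as the branch changes
```

Merging the promotion PR into the `staging` branch is what makes this app's desired state change, which Argo then reconciles.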
c
We use a setup with a separate (from business-logic code) GitOps repo, using one master branch and subfolders for the various environments. Components have a Helm chart, and each environment folder contains the applicable custom values. CI/CD is just merge requests applying the necessary patches to files in that environment's folder; in certain cases this is even automated, resulting in an automated git commit. We like the simplicity of sticking with one main branch, which always contains a complete, single source of truth for every environment. We tried a strategy with multiple branches, but it became a bother when combined with feature-flagging developments from the (separate) development/business code repo, as any branching used there was hard to map to the GitOps repo.
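The folder layout described above might look something like this (a sketch with hypothetical names; the point is one branch, one folder per environment, per-service values files):

```
gitops-repo/                       # single master branch
├── dev/
│   ├── service-a/values.yaml      # env-specific Helm values
│   └── service-b/values.yaml
├── qa/
│   ├── service-a/values.yaml
│   └── service-b/values.yaml
└── prod/
    ├── service-a/values.yaml
    └── service-b/values.yaml
```

Promotion is then a merge request that copies or patches values from one environment's folder into the next.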
r
Hi guys, out of curiosity - any tutorial that you would recommend to cover what you’ve done with ArgoCD for ephemeral envs?
k
@Corstijan Kortsmit out of curiosity, do you have any regrets or pain points after moving to a directory pattern? We debated both ways before settling on a branching strategy. One reason we went with branches was to avoid polluting the mainline history with things like deployments and rollbacks, which would make it harder to figure out when components were introduced. That being said, sometimes we feel like we should have gone the other way. In the case of ephemeral envs, which run on a different cluster than production, we do use a directory structure instead of branching, but then we can end up with merge conflicts when things get busy. @Romaric Philogène I'm not aware of a single tutorial that covers that.
c
@Kyle Campbell The main branch does get a lot of commits/activity, but that's what the git tooling is good (enough) at in our experience. If anything, the fact that all activity is 'serial' rather than branched only makes it easier to use. A couple of points on this though:
• The GitOps repo is separate from regular code development, and all commits come from merge requests that follow a standard commit message.
• We use Helm charts as the intermediate: code repos have CI pipelines that produce a Helm chart, then that pipeline commits a change to the GitOps repo, triggering deployment and setting up merge requests to promote to the other environments.
• There are standard texts, version numbers, and environments in the commit message, which makes it easy to just git log and grep away.
If you're mixing 'regular' development work into the same repo as your GitOps code, I can see how it might get messy.
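The "git log and grep away" point above can be sketched like this. The commit-message convention here is invented for illustration; the idea is just that when every automated commit follows one format, the deployment history becomes trivially searchable:

```shell
# Sketch: a uniform message convention makes history greppable.
# Format used here ("deploy: <component> <version> -> <env>") is hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git -c user.email=ci@example.com -c user.name=ci \
    commit -q --allow-empty -m "deploy: service-a 1.4.2 -> prod"
git -c user.email=ci@example.com -c user.name=ci \
    commit -q --allow-empty -m "deploy: service-b 0.9.0 -> qa"
# Every prod deployment of service-a, straight from the log:
git log --oneline --grep='deploy: service-a .* -> prod'
```

Because all activity is serial on one branch, this one-liner is a complete audit of a service's promotions.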
j
Thanks everyone! We're going to do a deep dive this week, will feed back anything that's useful 🙂
j
This has been a really useful thread to read through, thanks. If I may ask a question that has been bothering me as we plan out something similar: when the source-code repo pipeline passes CI and needs to update the GitOps repo, how do you lock that down? For context, we use Azure DevOps and I'm struggling to plan an approach that adequately secures the GitOps repo. Sure, I could protect the main branch in the GitOps repo with a PR policy, but then how do I trust the CI repo to be able to create and auto-merge a PR? I don't think there is any way I can have PR automation that trusts the author identity of the PR...
k
In our case, the CI workflow doesn't create a PR directly; it triggers an Action on the GitOps repo, and a workflow in the ops repo creates and merges the PR. The version bump is carried as a parameter on the workflow, so the PR is constrained to only version bumps. This is the part where you have to make a choice. If you auto-merge, you do introduce the possibility of somebody trying to inject something malicious via the version parameter by poking the Action, though if you write your Action well, all they can do is change to a valid semantic version for a component. We chose to accept that risk, as we feel monitoring change logs is sufficient. Depending on your risk tolerance and change-management policies, you may want to skip auto-merge and instead use a manual approver. https://mattermost.com/blog/how-to-use-github-actions-securely/
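A rough sketch of what the receiving workflow could look like (all names are hypothetical, using GitHub's `workflow_dispatch`; note the input is validated and passed via an env var rather than interpolated straight into the script, which is the usual injection hardening):

```yaml
# Hypothetical workflow living in the gitops repo; the CI repo triggers it
# with a component and version, so the blast radius is a single version bump.
name: bump-version
on:
  workflow_dispatch:
    inputs:
      component:
        required: true
        type: string
      version:
        required: true
        type: string

jobs:
  bump:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Reject anything that is not a plain semantic version
        env:
          VERSION: ${{ inputs.version }}
        run: |
          echo "$VERSION" | grep -Eq '^[0-9]+\.[0-9]+\.[0-9]+$'
      # ...then update the manifest for ${{ inputs.component }}, open the PR,
      # and (optionally) auto-merge it with a scoped bot token...
```

With the validation step in place, a caller poking the Action can only ever move a component to some valid version, as described above.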
c
^ this -> Security and locking things down is always a choice. I would consider a few things:
• Limit access to the GitOps repo as much as possible. Use non-personal accounts, automatically set up rotation for keys/passwords, etc.
• Look at policies and admission controllers for your Kubernetes cluster, depending on the setup of the parameters you're changing; i.e. if the whole image label is a parameter, set up a policy that disallows images from other, unvetted locations (this is good practice in general, I suppose). Now that I think on it, this does imply that things like cluster creation and policy-as-code live in a different repo. Which is the case in our setup, since we do a lot of multi-tenant stuff anyway.
• You could also introduce a delay between the push to the GitOps repo and the actual Argo sync, or use a sync window. That would allow some time for a security tool to pick up a suspect push and alert an operator or your SOC. Though that's really pushing things, I guess.
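For the admission-controller point, one way to express "only vetted image locations" is a Kyverno policy along these lines (Kyverno is one option among several; the registry and names here are hypothetical):

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: restrict-image-registries   # hypothetical policy name
spec:
  validationFailureAction: Enforce  # reject, don't just audit
  rules:
    - name: vetted-registry-only
      match:
        any:
          - resources:
              kinds:
                - Pod
      validate:
        message: "Images must come from the vetted registry."
        pattern:
          spec:
            containers:
              - image: "registry.example.com/*"  # hypothetical vetted registry
```

Even if someone manages to sneak a bad image reference into the GitOps repo, the cluster itself refuses to run it.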
j
Thanks both for your input. Very helpful!
a
We are also trying to roll out GitOps with ArgoCD in my organization, and I am also struggling with this decision. My specific question is: what do you keep in the service repository and what do you keep in the GitOps repository? Do you keep any part of the Helm charts in the service repository (maybe base Helm charts)? Or is it just the source code that you keep in the service repo, and all Helm configuration in the GitOps repo?
k
In our case, the answer is a third repo. We have one for the service source, one for all our Helm charts (for all services), and one for the Argo CD apps and overlays.
The hand-off (and shared responsibility) is thus the Helm chart.
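So the split is roughly (repo names hypothetical):

```
service-a/       # source code + CI that builds the image
helm-charts/     # one chart per service; the hand-off / shared-responsibility boundary
gitops/          # Argo CD Application definitions + per-environment overlays
```

The service team owns the first repo, the chart is jointly owned, and the GitOps repo is what Argo actually watches.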
a
Thanks, @Kyle Campbell. Can you please also explain how you promote changes from one environment to the next? Do you follow continuous-delivery practices where changes are promoted from lower environments to higher ones in an automated fashion? If yes, what other tools do you use to make it all work together?
k
That’s discussed farther up in the thread (see the whole discussion about branching versus directories), but the concept is the same. The only tools involved here are GitHub Actions and Argo CD in our case. At the end of our test run, the Action commits a change to the Argo repo, which triggers the whole promotion process. What we’re using for the actual testing is irrelevant; you only need to know pass/fail in the Action.