# kubernetes
s
Hello everyone, my name is Shree. I have recently joined this group and am hoping to get some great insight! I recently joined an organization and am facing an architectural challenge with our infrastructure automation setup, so I'm looking for industry best practices.

Current Setup:
• We have AWX (Ansible Tower open-source) running inside our EKS Kubernetes cluster
• This same AWX instance is responsible for provisioning, managing, and upgrading the very Kubernetes cluster it runs on (using Terraform/Helm/Ansible playbooks)
• We also host other internal tooling (SonarQube, GitHub runners) in this same cluster

The Problem:
This creates a circular dependency - AWX needs to be available to upgrade the cluster, but AWX itself is running on that cluster. If we need to make significant cluster changes, or if something goes wrong during an upgrade, we risk taking down our management tool along with the cluster.

Questions:
1. What's the recommended approach for hosting infrastructure automation tools like AWX?
2. Should infrastructure tooling always run outside the environments it manages?
3. How do others handle this chicken-and-egg problem with Kubernetes management?
4. What are the tradeoffs between a separate management cluster vs. external VMs for tools like AWX?

We're trying to establish a more resilient architecture while balancing operational overhead. Any insights from those who've solved similar challenges would be greatly appreciated!
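To make the circular dependency concrete, here is a minimal sketch that models "needs this to run or to be upgraded" as a directed graph and looks for a cycle. The node names (`awx`, `eks-cluster`, `mgmt-host`) and the `find_cycle` helper are illustrative assumptions, not part of our actual setup.

```python
from typing import Dict, List, Optional


def find_cycle(depends_on: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one dependency cycle as a list of nodes, or None if there is none."""
    UNSEEN, IN_PROGRESS, DONE = 0, 1, 2
    state: Dict[str, int] = {}
    path: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        state[node] = IN_PROGRESS
        path.append(node)
        for dep in depends_on.get(node, []):
            dep_state = state.get(dep, UNSEEN)
            if dep_state == IN_PROGRESS:
                # We came back to a node that is still on the current path: a cycle.
                return path[path.index(dep):] + [dep]
            if dep_state == UNSEEN:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        state[node] = DONE
        return None

    for node in list(depends_on):
        if state.get(node, UNSEEN) == UNSEEN:
            found = visit(node)
            if found:
                return found
    return None


# Current setup: AWX runs on the cluster, and the cluster is upgraded by AWX.
current = {
    "awx": ["eks-cluster"],
    "eks-cluster": ["awx"],
    "sonarqube": ["eks-cluster"],
    "github-runners": ["eks-cluster"],
}

# Hypothetical alternative: AWX runs on a separate management host instead.
separated = {
    "awx": ["mgmt-host"],
    "mgmt-host": [],
    "eks-cluster": ["awx"],
    "sonarqube": ["eks-cluster"],
    "github-runners": ["eks-cluster"],
}

print(find_cycle(current))
print(find_cycle(separated))
```

In this toy model the current layout contains the loop awx -> eks-cluster -> awx, while moving AWX to anything that the cluster does not itself depend on removes it.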
a
We're running AWX on Kubernetes also. Out of curiosity, why are you using Ansible to interact with Terraform? (we use other tooling for Terraform, which is used to manage the Kubernetes clusters)
Unless you're rolling your own Kubernetes clusters on-premise perhaps? Even then I'd imagine you'd only use Ansible to update components, rather than provision the VMs themselves?
s
Thanks for the question, Andrew! I've recently joined the organization and inherited this setup, so I'm trying to understand and improve it. To answer your question - we're using EKS (not on-premises), and our current architecture has AWX running inside the same Kubernetes cluster that it manages. The setup uses Ansible with the community.general.terraform module to run Terraform for provisioning the very EKS cluster that hosts AWX.

From what I understand, this approach evolved organically rather than by design. The organization didn't have a dedicated platform team previously, and this solution likely developed as teams needed to automate infrastructure while leveraging existing Ansible knowledge.

The benefits of the current approach seem to be:
• Unified tooling (everything through AWX)
• Terraform variables can be passed from Ansible inventory
• Single pane of glass for both infrastructure provisioning and configuration

But as I mentioned in my original post, this creates the circular dependency problem that concerns me - AWX is responsible for managing the cluster it runs on.

I'm genuinely curious what tools your team uses to manage Terraform outside of AWX/Ansible? Any recommendations for breaking this circular dependency while maintaining a manageable workflow would also be greatly appreciated. We're looking to evolve our platform engineering practices, and learning from others' experiences would be extremely helpful.
a
We currently use GitHub Actions (which is not ideal), but there are lots of TACOS (Terraform Automation and Collaboration Software) tools, some of which I've used or tried previously:
• https://digger.dev
• https://www.env0.com
• https://www.spacelift.io
• Terraform Cloud
• https://www.runatlantis.io

Would personally recommend Atlantis as a good starting point.
Running Terraform (declarative) inside of AWX/Ansible (imperative) seems like a strange mismatch of tooling conceptually? Plus, as you've identified, it's always potentially problematic to have a tool update itself.
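As a rough illustration of running Terraform from a runner that lives outside the cluster (a CI job, a small VM, or similar), the sketch below builds a `terraform plan` invocation with variables passed in explicitly, much as they might otherwise come from Ansible inventory. The directory `infra/eks`, the variable names, and the helper functions are hypothetical; only the Terraform CLI options shown (`-chdir`, `-input=false`, `-out`, `-var`) are standard.

```python
import subprocess
from typing import Dict, List


def build_plan_command(
    workdir: str, variables: Dict[str, str], plan_file: str = "tfplan"
) -> List[str]:
    """Build a non-interactive `terraform plan` command for the given directory."""
    cmd = [
        "terraform",
        f"-chdir={workdir}",   # global option, must come before the subcommand
        "plan",
        "-input=false",        # never prompt; fail instead if a variable is missing
        f"-out={plan_file}",   # save the plan so a later apply uses exactly this plan
    ]
    # Sort for a stable, reviewable command line.
    for name, value in sorted(variables.items()):
        cmd += ["-var", f"{name}={value}"]
    return cmd


def run_plan(workdir: str, variables: Dict[str, str]) -> None:
    """Run the plan; raises CalledProcessError if Terraform exits non-zero."""
    subprocess.run(build_plan_command(workdir, variables), check=True)


# Example values are placeholders, not taken from any real configuration.
example = build_plan_command(
    "infra/eks", {"cluster_version": "1.29", "cluster_name": "internal-tools"}
)
print(" ".join(example))
```

Because nothing here depends on the cluster being healthy, the same command can still be run when the cluster (and anything hosted on it) is mid-upgrade or down.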