# platform-toolbox
v
Hey, does anyone have reasonable solutions that work with AWS ECS? My company uses it a lot, and I want to make managing it easy for developers. The issue is that I don't want them to have to learn Terraform or use the AWS console.
k
CDK?
a
I mean, my company still uses it, though we are trying to move everyone off of it. There is a centralized infrastructure platform or SRE team that manages the large clusters. Each team has a CloudFormation stack that creates the infrastructure it needs (e.g. load balancer and DNS), and then they update the task definition each time they want to deploy a new container image. There is a CLI, but I highly recommend not using something like that. It's extremely hard for users to understand and debug any time something goes wrong.
v
@Kenneth Mroz I think it would be high effort and low value. The CDK doesn't abstract ECS clusters, services, load balancers, listener rules, etc. The platform team would still need to write code to reduce the cognitive load on the developers. Also, everything I'd do with the CDK I could achieve building Terraform modules. I wonder if I could make highly opinionated and simple modules and ask them to edit the .tfvars files.
@Asaf Erlich any specific reason for moving off it? The container image is changed in a CD pipeline. I was thinking about giving them simple YAML files that have the values for any IaC tool to use. Something like public = true would create the DNS record, register the load balancer rules, etc. Also, I don't like CloudFormation so I am running away from it.
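A minimal Python sketch of that idea, assuming the developer's file has already been parsed into a few plain values. The ServiceSettings fields, the resource labels and the example.com domain are all hypothetical; the point is only that one public flag fans out into several resources the platform code owns.

```python
from dataclasses import dataclass


@dataclass
class ServiceSettings:
    """Hypothetical developer-facing settings; the field names are assumptions."""
    name: str
    image: str
    port: int = 8080
    public: bool = False


def plan_resources(settings: ServiceSettings, domain: str = "example.com") -> list[str]:
    """Expand the simple settings into labels for the resources an IaC tool would manage."""
    resources = [
        f"ecs_task_definition:{settings.name}",
        f"ecs_service:{settings.name}",
    ]
    if settings.public:
        # The developer only flips one switch; the platform code decides that
        # "public" means a target group, a listener rule and a DNS record.
        host = f"{settings.name}.{domain}"
        resources.append(f"target_group:{settings.name}:{settings.port}")
        resources.append(f"listener_rule:{host}")
        resources.append(f"dns_record:{host}")
    return resources
```

Calling plan_resources(ServiceSettings("orders", "orders:1.4.2", public=True)) returns five labels, while the same service with public left at its default returns only the task definition and the service.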
k
@Vitor Costa that would work, but then I would just say have your developers learn TF. It's not a complex tool to learn, and there is a lot more benefit in letting the developers create their own systems while your team manages guardrails around the clusters.
r
First, disclosure: I'm an internal evangelist for Amazon at AWS, but these opinions are strictly my own. The goal you have is awesome: enable developers to use containers on AWS without having to manage the stuff that enables it. ECS is a good way to get that done. That said, to create a scalable org/solution, I do believe that developers should manage the stuff that is specific to their service: think task definition, storage, database, orchestration. Here's how I'd recommend getting to your vision:
Create a landing zone: your developers need to deploy ECS into a landing zone that provides standardized security (data handling in flight/at rest, DDoS, etc.), networking (VPC, VPC endpoints, service-to-service communication), and logging/metrics/tracing services (CW logs and alarms & X-Ray), with a good understanding of what the landing zone does and what developers have to handle.
Create a CDK (in JavaScript) golden path for your main workload types: to model what good looks like for your groups, have a library your developers can use to get this done from scratch. While CDK isn't perfect (because it sits on top of CFN), it's way, way better than any solution that calls directly into the SDK, and any fixes/improvements to CDK come to JavaScript first. Key elements to include:
• Default to using the ECS Fargate launch type instead of the EC2 launch type. Yes, Fargate is more expensive, but it saves so much operational effort it's worth it. Controversial: if I needed to run a latency-sensitive, resource-intensive, or GPU workload, it's less complex to run ASG, EC2 instances, and ImageBuilder than ECS using the EC2 launch type.
• Include common service endpoints: ECR, S3, DynamoDB/Aurora
• Don't include a NAT or SSH access, because we are professionals 😊
• Assume it sits on top of the landing zone
Use CDK to script workload-specific resources: developers should use some sort of IaC to create the resources that are specific to their microservice/domain.
Enabling developers to create resources in the console is naughty.
a
@Asaf Erlich any specific reason for moving off it? The container image is changed in a CD pipeline. I was thinking about giving them simple YAML files that have the values for any IaC tool to use. Something like public = true would create the DNS record, register the load balancer rules, etc. Also, I don't like CloudFormation so I am running away from it.
We are moving everyone to EKS / Kubernetes. There are a lot of reasons for this choice:
1. Much more public documentation for Kubernetes than for ECS.
2. A developer is much more likely to have used it before and be familiar with how to use it on day 1.
3. As you hinted, asking new engineers to run kubectl apply is less overhead than teaching them Terraform or CloudFormation.
4. There is a much bigger ecosystem around Kubernetes. Everything from services automatically creating load balancers to service mesh solutions.
5. Kubernetes is designed to be more extensible. You can create plugins on top of kubectl to abstract things like auth and cluster discovery. There are custom resources that manage AWS resource creation for people.
6. Open source tools like Argo CD exist to manage GitOps deployment for users. There's just a much bigger menu of off-the-shelf open source solutions to build on top of.
Those are just a few things off the top of my head, but obviously you have to live with what you're currently using. Migration is very hard.
v
@Asaf Erlich In the last few days, I have also been thinking about proposing this change. It would add overhead because Kubernetes management is complex, and our applications are pretty simple. I think ECS is adding more complexity than Kubernetes would because it doesn't have this unique ecosystem and its extensibility. When I look for tools to automate deployment creation and management aiming to reduce cognitive load, I find lots for Kubernetes. Thanks for the insight.
@Ric McLaughlin Amazing tips. I will do something like that with some changes to adapt to my company's environment. I am not sure yet about using the CDK, even though I know it's amazing, but so is Terraform. I will need to do some research.
r
Thanks @Vitor Costa, on a couple of your points:
The CDK doesn’t abstract ECS clusters, services, load balancers, listener rules, etc. The platform team would still need to write code to reduce the cognitive load on the developers. Also, everything I’d do with the CDK I could achieve building Terraform modules
Using CDK you definitely could extend the L2 ECS constructs to do this, and it's common practice. I really like this blog article on how to get that done: https://aws.amazon.com/blogs/containers/general-availability-amazon-ecs-service-extensions-for-aws-cdk/
I am not sure yet about using a CDK, even though I know it’s amazing, but so is Terraform.
Agreed Terraform is great; that said, it’s a new language to learn and doesn’t enjoy the deployment safety features present in CFN/CDK. That doesn’t mean CFN is frustration free… 🙂
…Kubernetes management is complex, and our applications are pretty simple. I think ECS is adding more complexity than Kubernetes would because it doesn’t have this unique ecosystem and its extensibility.
💯 agree with this and it’s the key to developer productivity. High level I’d approach it like: Lambda is less complex than Fargate which is less complex than EC2 which is less complex than Kubernetes (EKS). And the less complex the environment is, the more developers are happy and productive.
h
@Vitor Costa, we have managed ECS resources in two different ways so far: at the beginning using Sceptre/CF (which was a bit of a pain for devs), and then we moved to TF modules. The devs are not required to know TF at all. They only interface with some YAML files, and all the complexity is managed in the backend. After 6 months, the feedback is very positive.
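One way such a backend could bridge the two, sketched in Python under stated assumptions: the settings dict stands in for the parsed YAML file (parsing is left out because the Python standard library has no YAML parser), and the key names and defaults are hypothetical. Terraform does read JSON variable files such as terraform.tfvars.json, so the output can be fed to a module without the developer touching HCL.

```python
import json

# Hypothetical whitelist of what developers may set; everything else stays
# inside the Terraform module owned by the platform team.
ALLOWED_KEYS = {"name", "image", "cpu", "memory", "desired_count", "public"}
DEFAULTS = {"cpu": 256, "memory": 512, "desired_count": 1, "public": False}


def to_tfvars_json(settings: dict) -> str:
    """Validate developer settings and render them as a JSON variables file."""
    unknown = set(settings) - ALLOWED_KEYS
    if unknown:
        raise ValueError(f"unsupported settings: {sorted(unknown)}")
    for required in ("name", "image"):
        if required not in settings:
            raise ValueError(f"missing required setting: {required}")
    # Developer values override the platform defaults.
    merged = {**DEFAULTS, **settings}
    return json.dumps(merged, indent=2, sort_keys=True)
```

Rejecting unknown keys early keeps the error message in the developer's vocabulary instead of surfacing as a Terraform failure later in the pipeline.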
v
We have some tools lined up to test, but we're going to simplify the ECS module a lot and give .tfvars to the developers. As soon as it is stable, we can start thinking about Kubernetes and its ecosystem. I am also prioritising the tools that will help with this migration. Thanks, everyone, for the insights! Amazing community.
r
I think someone on my team was experimenting with AWS Copilot as a way to simplify the ECS experience for our developers: https://catalog.us-east-1.prod.workshops.aws/workshops/bbe33a01-4c98-4cfc-b859-d55d699419b8/en-US
m
I have two different approaches depending on the customer:
1. Something similar to @Ric McLaughlin's idea but with Pulumi and TypeScript. The deployment code is stored in the same repo as the app, but it is relatively short as I have a common library that is used by most of them. There is another repo with all the foundation infrastructure, e.g. ECS clusters.
2. A simple YAML file committed in the application's repo. It is interpreted by DevOpsBox (disclaimer: I am a co-founder of it), which uses Pulumi to create all the infrastructure and deploy the app.
By the way, I have chosen to use ECS Fargate instead of EKS because for me it is much, much simpler. Link to my comparison of ECS Fargate and EKS from PlatformCon:

https://youtu.be/3xKONsYbaco

r
@Vitor Costa did you consider a platform like Qovery? The only downside, given your requirements, is that it runs on Kubernetes. But maybe it's the level of abstraction you're looking for?
n
The platform I helped to build is based on ECS. We provide deploy jobs to the developers. We operate the Terraform that creates the ECS clusters (using a mix of EC2 and Fargate) and the load balancers on the front end. The deploy process kicks off a Python Lambda, which in turn creates CloudFormation templates for the Service and Task Definitions. We use a deployer.py Lambda to make it easy for the developer teams. We use CloudFormation templates to enable easy rollback in the event of a deployment failure.
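For readers who haven't seen this pattern, here is a rough Python sketch of the template-generation step. It is not the deployer.py described above: the build_template function, its parameters and the fixed 512 MiB memory value are assumptions for illustration. The resource types and property names are standard CloudFormation ones for ECS.

```python
def build_template(service: str, image: str, cluster: str, desired_count: int = 1) -> dict:
    """Build a minimal CloudFormation template for one ECS service (illustrative only)."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Resources": {
            "TaskDefinition": {
                "Type": "AWS::ECS::TaskDefinition",
                "Properties": {
                    "Family": service,
                    "ContainerDefinitions": [
                        {
                            "Name": service,
                            # The image is the only value that changes per deploy.
                            "Image": image,
                            "Memory": 512,  # assumed container memory limit in MiB
                            "Essential": True,
                        }
                    ],
                },
            },
            "Service": {
                "Type": "AWS::ECS::Service",
                "Properties": {
                    "Cluster": cluster,
                    # Ref resolves to the ARN of the new task definition revision.
                    "TaskDefinition": {"Ref": "TaskDefinition"},
                    "DesiredCount": desired_count,
                },
            },
        },
    }
```

Because each deploy is a stack update, a failed update is rolled back by CloudFormation to the previous template, which is where the easy rollback comes from. A real template would also need networking, IAM roles and load balancer wiring, which are left out here.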
b
@Vitor Costa I'm late to the conversation, but it sounds like Nullstone is a good fit for what you're looking to do. Out of the box, developers can launch a wide array of apps, including on ECS. If you or your platform team want to tweak the infrastructure (e.g. for compliance, security, or regulatory reasons), you can fork the Terraform modules (developers still don't need to interact directly with Terraform files).
s
AWS wrote Copilot; maybe it does the job.