#kubernetes

Daniel Pauler

08/25/2022, 9:01 AM
Hello! 👋 I have a follow-up question on @Natan Yellin's fantastic talk regarding CPU limits: My environment features mainly (and sadly) some not-so-cloud-native, monolithic applications that were squeezed into a container. That means I have workloads with requests like 32 CPUs... Currently we have big issues with performance and CPU throttling. I have the feeling that this has something to do with the CPU limits, so I want to try running without them. 👍 Now I'm trying to wrap my head around the worst-case scenario when I spawn my pods without limits: Let's assume all pods start up and try to take everything they can get. Can they "steal" each other's CPU? I think not, because I set requests, so each pod will get its part of the CPU time, and they can only "fight" for the spare time? What about the node OS itself? AFAIK K8s can never use up all resources, as there is some mechanism that always keeps some "resources" for the node itself?

Natan Yellin

08/25/2022, 9:09 AM
@Daniel Pauler that’s totally correct. A few points to add:
1. Regarding CPU for the node itself, it’s reserved via node allocatable - see https://kubernetes.io/docs/tasks/administer-cluster/reserve-compute-resources/
2. Remember that the advice to not set limits is only relevant for CPU. For memory you need limits and you should set request=limit! (Just wrote a blog about this here - and @Luca Galante and I are planning another webinar just about memory limits)
3. For high performance, sometimes it’s worth considering CPU pinning and/or dedicated nodes - see here
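The "requests are protected, only spare CPU is contested" reasoning can be illustrated with a small model. This is only a sketch: it assumes CPU requests act as proportional weights under contention (which is how they map to cgroup CPU shares), that the requests on a node sum to no more than its capacity, and it ignores everything else the Linux scheduler does. The function name `share_cpu` and the pod names and numbers are made up for illustration.

```python
def share_cpu(capacity, pods):
    """Split `capacity` cores among pods in proportion to their CPU requests.

    `pods` maps a name to (request, demand) in cores; requests must be > 0.
    Returns a dict of name -> cores actually received.
    """
    allocation = {name: 0.0 for name in pods}
    active = set(pods)            # pods that still want more CPU
    remaining = float(capacity)   # cores not yet handed out

    while active and remaining > 1e-9:
        total_weight = sum(pods[name][0] for name in active)
        handed_out = 0.0
        satisfied = set()
        for name in active:
            request, demand = pods[name]
            # Weighted fair share of whatever is still unassigned.
            fair_share = remaining * request / total_weight
            grant = min(fair_share, demand - allocation[name])
            allocation[name] += grant
            handed_out += grant
            if allocation[name] >= demand - 1e-9:
                satisfied.add(name)
        remaining -= handed_out
        if not satisfied:
            break  # everyone is still hungry, so the split is final
        active -= satisfied   # leftovers get re-split among the rest
    return allocation


# Hypothetical 48-core node, no CPU limits anywhere.
pods = {
    "monolith": (32, 48),  # requests 32 cores, would happily use all 48
    "sidecar": (8, 2),     # requests 8 cores, only needs 2
    "batch": (8, 48),      # requests 8 cores, also greedy
}
result = share_cpu(48, pods)
# monolith gets roughly 36.8 cores, sidecar 2.0, batch roughly 9.2
```

In this model the greedy pods never push the monolith below its 32 requested cores; they only compete for the 6 cores the sidecar leaves unused.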
@Daniel Pauler regarding those legacy not-so-cloud-native applications, Kelsey Hightower had a humorous take you might enjoy (don’t get depressed though, it’s 2022 now and people run those workloads on Kubernetes all the time)

https://youtu.be/HlAXp0-M6SY?t=812


Daniel Pauler

08/25/2022, 11:59 AM
Thank you very much for the awesome resources! 🤩 I will check them out and let you know how my little experiment went! 😄

Felipe Schossler

08/25/2022, 3:07 PM
Big +1 on item 3 that @Natan Yellin mentioned. This is something we're going to implement at my company. That way, the competition for resources you described never happens with other workloads. I think that's the quickest win 🙏

Daniel Pauler

08/29/2022, 1:27 PM
@Felipe Schossler Thank you for your input! I wonder, how do you implement CPU pinning without setting limits? IMHO you need to set limits in order to get a pod assigned to the `Guaranteed` QoS class - and that is the only QoS class that will use the CPU manager. Did you simply disable the CFS quota via the kubelet's `--cpu-cfs-quota=false` flag in order to make that happen?
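To make the QoS point concrete, here is a rough sketch of how a pod's QoS class follows from its requests and limits. It is a simplification of the documented rules, not the kubelet's actual code: the function name `qos_class` and the dict layout are invented for this example, quantities are compared as plain strings (so `"32"` and `"32000m"` would wrongly count as different), and only the containers passed in are considered.

```python
def qos_class(containers):
    """Return the QoS class for a pod described as a list of containers.

    Each container is a dict with optional "requests" and "limits" dicts
    keyed by "cpu" and "memory".
    """
    anything_set = False
    guaranteed = True
    for container in containers:
        requests = container.get("requests", {})
        limits = container.get("limits", {})
        for resource in ("cpu", "memory"):
            limit = limits.get(resource)
            # If only a limit is given, the request defaults to the limit.
            request = requests.get(resource, limit)
            if limit is not None or request is not None:
                anything_set = True
            # Guaranteed needs a limit on every resource, equal to the request.
            if limit is None or request != limit:
                guaranteed = False
    if not anything_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


with_limits = [{
    "requests": {"cpu": "32", "memory": "64Gi"},
    "limits": {"cpu": "32", "memory": "64Gi"},
}]
without_cpu_limit = [{
    "requests": {"cpu": "32", "memory": "64Gi"},
    "limits": {"memory": "64Gi"},
}]

print(qos_class(with_limits))        # Guaranteed
print(qos_class(without_cpu_limit))  # Burstable
```

Dropping only the CPU limit is enough to move the pod from `Guaranteed` to `Burstable`, which is the trade-off behind the question above.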

Felipe Schossler

08/30/2022, 2:11 PM
@Daniel Pauler I don't care so much about CPU pinning because, like I said before, I'm just going to put these CPU-intensive workloads in a separate node pool. The throttling (if it happens) will come directly from the OS, and in that case there's nothing I can do about it besides scaling this node pool vertically. QoS class is another thing that isn't very useful in my case, because I've separated these workloads from the rest. Do you get my point? 🙏

Daniel Pauler

09/09/2022, 11:44 AM
Just a short update: We have carefully removed the CPU limits from our main workload - and we were really surprised: We took some actions with node resource reservation and also ensured that all other workloads set CPU requests (at least the important ones). We also updated our monitoring alerts so we get notified in case something looks odd. So far, nothing blew up! 🤞 Regarding our main workload -> We no longer have any throttled containers and the overall performance of our application increased by 25-50% (peak 70%). 🥳 We'll now continue to tweak resource allocation and maybe also try out the CPU manager - but for now, we are pretty happy! 😁

Felipe Schossler

09/09/2022, 7:10 PM
Awesome, well done (especially creating the necessary observability before taking any action) 👏 In my company the performance increased something between 50% of the workloads that we removed CPU Limits too 🔥
"We took some actions with node resource reservation"
What actions did you take here, @Daniel Pauler? I'm asking because we didn't do anything about this 👀

Daniel Pauler

09/12/2022, 7:02 AM
Wow, that is so nice! 🤩 We basically had a look at https://kubernetes.io/docs/tasks/administer-cluster/reserve-compute-resources/ and double-checked that `--kube-reserved` and `--system-reserved` are set and that the values kinda make sense for our setup.
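As a rough illustration of what those two flags do for CPU: the amount the scheduler can hand out to pods is the node's capacity minus both reservations (eviction thresholds, which are also subtracted for resources like memory, do not apply to CPU). The sketch below only does that arithmetic; the helper names and the node sizes are hypothetical, not values from this thread.

```python
def parse_cpu(quantity):
    """Convert a Kubernetes CPU quantity such as "500m" or "2" to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(round(float(quantity) * 1000))


def allocatable_cpu(capacity, kube_reserved, system_reserved):
    """Millicores left for pods after both reservations are subtracted."""
    allocatable = (
        parse_cpu(capacity)
        - parse_cpu(kube_reserved)
        - parse_cpu(system_reserved)
    )
    if allocatable <= 0:
        raise ValueError("reservations leave no CPU for pods")
    return allocatable


# Hypothetical 48-core node with --kube-reserved=cpu=1000m
# and --system-reserved=cpu=500m.
node_allocatable = allocatable_cpu("48", "1000m", "500m")
print(node_allocatable)  # 46500
```

With numbers like these, a 32-CPU workload still fits, and the sum of all pod CPU requests on the node cannot exceed 46.5 cores.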

Natan Yellin

09/12/2022, 3:06 PM
Guys, can I share a screenshot of this on LinkedIn and Twitter for all the people who don't believe me?
@Daniel Pauler I’d honestly just like to show people your post so that they can see it’s also useful advice in the real world

Daniel Pauler

09/13/2022, 8:14 AM
Sure, go for it! 😄 In addition to your great content we also found this article quite illustrative: https://medium.com/directeam/kubernetes-resources-under-the-hood-part-3-6ee7d6015965

Natan Yellin

09/13/2022, 4:09 PM
Yeah, it’s a great article! I had the pleasure of hearing Shon (the author) talk in person a few months ago

Felipe Schossler

09/14/2022, 1:01 PM
Of course @Natan Yellin, from my part there is no problem 🚀

Natan Yellin

09/14/2022, 1:06 PM
Thanks 🙏