
Kubernetes Best Practice: How To (Correctly) Set Resource Requests And Limits

  • aster.cloud
  • October 24, 2022
  • 5 minute read

One of my biggest pet peeves when managing Kubernetes is workloads with no resource requests or limits. I was so frustrated by this that I created Goldilocks, an open source project, to make setting initial resource requests and limits easier. In this post, I’ll talk about Kubernetes best practices for setting resource requests and limits correctly.

Kubernetes is a dynamic system that automatically adapts to your workload’s resource utilization, and it scales at two levels. Each individual Kubernetes deployment can be scaled automatically using a Horizontal Pod Autoscaler (HPA), while the cluster at large is scaled using Cluster Autoscaler. HPAs monitor a target metric of the individual pods within a deployment (often CPU or memory usage), and they add or remove pods as necessary to keep that metric near a specified target. Cluster Autoscaler, meanwhile, handles scaling of the cluster itself. It watches for pods that cannot be scheduled and adds nodes to the cluster to accommodate them, and it removes nodes that are no longer needed.
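To make the HPA behavior concrete, here is a minimal sketch of the replica calculation described in the Kubernetes documentation: the desired replica count is the current count scaled by the ratio of the observed metric to the target, rounded up. The function name and the numbers are illustrative assumptions, and the real controller also applies a tolerance, min/max replica bounds, and stabilization windows that are left out here.

```python
import math


def desired_replicas(current_replicas: int, current_metric: float, target_metric: float) -> int:
    """Replica count the HPA aims for: ceil(current * observed / target)."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    return math.ceil(current_replicas * current_metric / target_metric)


# Hypothetical numbers: 4 pods averaging 90% CPU utilization against a 60% target.
print(desired_replicas(4, 90, 60))  # 6

# The same deployment averaging 20% utilization scales in.
print(desired_replicas(4, 20, 60))  # 2
```

Note that CPU utilization in an HPA is expressed as a percentage of the pod's CPU request, so a missing or unrealistic request distorts this calculation.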



A key feature of Kubernetes that enables both of these scaling actions is the ability to set specific resource requests and limits on your workloads. By setting sensible requests and limits on how much CPU and memory each pod uses, you can maximize the utilization of your infrastructure while ensuring smooth application performance. Setting your limits too low on an application will cause problems. For example, if a container exceeds its memory limit, Kubernetes will kill it. Meanwhile, if you set your requests and limits too high, you’re inherently wasting resources by overallocating, which means you’ll end up with a higher bill.
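As a rough illustration of what "setting requests and limits" looks like, the sketch below builds the container entry of a pod spec as a Python dictionary and prints it as JSON, which kubectl accepts alongside YAML. The container name, image, and values are hypothetical placeholders, not recommendations.

```python
import json


def container_with_resources(name, image, cpu_request, mem_request, cpu_limit, mem_limit):
    """Build a pod-spec container entry with explicit requests and limits."""
    return {
        "name": name,
        "image": image,
        "resources": {
            # What the scheduler reserves for the container on a node.
            "requests": {"cpu": cpu_request, "memory": mem_request},
            # The ceiling: CPU is throttled, memory overuse gets the container killed.
            "limits": {"cpu": cpu_limit, "memory": mem_limit},
        },
    }


# Hypothetical workload and values, for illustration only.
container = container_with_resources(
    "web", "example.com/web:1.0", "250m", "256Mi", "500m", "512Mi"
)
print(json.dumps(container, indent=2))
```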

While Kubernetes best practices dictate that you should always set resource limits and requests on your workloads, it is not always easy to know what values to use for each application. As a result, some teams never set requests or limits at all, while others set them too high during initial testing and then never course correct. The key to ensuring scaling actions work properly is to dial in your resource limits and requests on each workload so that workloads run efficiently.


Setting resource limits and requests is key to operating applications on Kubernetes clusters as efficiently and reliably as possible.

How to Set Kubernetes Resources

Goldilocks, an open source project by Fairwinds, helps teams allocate resources to their Kubernetes deployments and get those resource calibrations just right. Goldilocks is a Kubernetes controller that collects data about running pods and provides recommendations on how to set resource requests and limits. It can help organizations understand resource use, resource costs, and best practices around efficiency. Goldilocks employs the Kubernetes Vertical Pod Autoscaler (VPA), which takes into account the historical memory and CPU usage of your workloads, along with the current resource usage of your pods, in order to recommend how to set your resource requests and limits. (While the VPA can actually set limits for you, it is often best to use the VPA engine only to provide recommendations.) Essentially, the tool creates a VPA for each deployment in a namespace and then queries that VPA for information.
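For readers who have not seen a VPA object, here is a hedged sketch of one configured to recommend only, using the `updateMode: "Off"` setting from the VPA API. It is not necessarily the exact object Goldilocks generates; the object name, deployment name, and namespace are assumptions made for illustration.

```python
import json


def recommendation_only_vpa(deployment, namespace):
    """Build a VPA manifest that computes recommendations without changing pods."""
    return {
        "apiVersion": "autoscaling.k8s.io/v1",
        "kind": "VerticalPodAutoscaler",
        # Hypothetical naming scheme, chosen for this example.
        "metadata": {"name": f"{deployment}-recommender", "namespace": namespace},
        "spec": {
            "targetRef": {
                "apiVersion": "apps/v1",
                "kind": "Deployment",
                "name": deployment,
            },
            # "Off" means the VPA only publishes recommendations in its status.
            "updatePolicy": {"updateMode": "Off"},
        },
    }


vpa = recommendation_only_vpa("web", "demo")
print(json.dumps(vpa, indent=2))
```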

Without a dashboard, viewing these recommendations would mean using kubectl to query every VPA object, which quickly becomes tedious for medium-to-large deployments. That’s where the Goldilocks dashboard comes in. Once your VPAs are in place, recommendations will appear in the dashboard.
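To show what querying VPA objects by hand involves, the sketch below walks the JSON that a command such as `kubectl get vpa --all-namespaces -o json` would return and pulls out each container's target recommendation. The sample data is made up, and the field names follow the VPA status schema as I understand it (`status.recommendation.containerRecommendations`).

```python
import json

# Made-up sample shaped like a kubectl JSON list of VPA objects.
SAMPLE = """
{"items": [{
  "metadata": {"name": "web-recommender", "namespace": "demo"},
  "status": {"recommendation": {"containerRecommendations": [{
    "containerName": "web",
    "lowerBound": {"cpu": "100m", "memory": "128Mi"},
    "target": {"cpu": "250m", "memory": "256Mi"},
    "upperBound": {"cpu": "1", "memory": "1Gi"}
  }]}}
}]}
"""


def summarize(vpa_list):
    """Return (namespace, vpa, container, target cpu, target memory) rows."""
    rows = []
    for vpa in vpa_list.get("items", []):
        meta = vpa["metadata"]
        # A VPA with no data yet has no recommendation in its status.
        recs = (
            vpa.get("status", {})
            .get("recommendation", {})
            .get("containerRecommendations", [])
        )
        for rec in recs:
            target = rec["target"]
            rows.append(
                (meta["namespace"], meta["name"], rec["containerName"],
                 target["cpu"], target["memory"])
            )
    return rows


for row in summarize(json.loads(SAMPLE)):
    print(row)
```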

The dashboard presents two types of recommendations, depending on the quality of service (QoS) class you want for your deployments:

  1. Guaranteed, which means the application will be granted higher priority over other workloads in order to guarantee available resources. In this class, you set your resource requests and limits to exactly the same values, which guarantees that the resources requested by the container will be available to it when it gets scheduled. This QoS class generally lends itself well to the most stable Kubernetes clusters.
  2. Burstable, which means the application will be guaranteed a minimum level of resources but will receive more if and when available. Essentially, your resource requests are lower than your limits. The scheduler will use the request to place the pod on a node, but then the pod can use more resources up to the limit before it’s killed or throttled. This QoS class is granted a lower priority when deciding which workloads to remove when resources are lacking.

The dashboard provides recommendations for both the Guaranteed and Burstable QoS classes. In the Guaranteed class, we recommend setting your requests and limits to the VPA “target” field.
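The mapping from a VPA recommendation to concrete settings can be sketched as follows. The Guaranteed case follows the text above (requests and limits both set to the VPA target). The Burstable case, with requests taken from `lowerBound` and limits from `upperBound`, reflects my understanding of the Goldilocks documentation and should be treated as an assumption; the sample values are made up.

```python
def guaranteed_resources(rec):
    """Requests and limits both set to the VPA target."""
    return {"requests": dict(rec["target"]), "limits": dict(rec["target"])}


def burstable_resources(rec):
    """Assumed mapping: requests from lowerBound, limits from upperBound."""
    return {"requests": dict(rec["lowerBound"]), "limits": dict(rec["upperBound"])}


# Made-up recommendation for a single container.
rec = {
    "containerName": "web",
    "lowerBound": {"cpu": "100m", "memory": "128Mi"},
    "target": {"cpu": "250m", "memory": "256Mi"},
    "upperBound": {"cpu": "1", "memory": "1Gi"},
}

print(guaranteed_resources(rec))
print(burstable_resources(rec))
```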

Note: a third QoS class, BestEffort, means that no requests or limits are set and that the application will be allocated resources only when all other requests are met. Use of BestEffort is not recommended.
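The three QoS classes follow mechanically from how requests and limits are set, which the sketch below tries to capture. It is a simplified reading of the Kubernetes rules, not the kubelet's implementation: it compares quantities as plain strings (so "500m" and "0.5" would not be treated as equal) and ignores init containers.

```python
def qos_class(containers):
    """Classify a pod from its containers' CPU and memory settings."""
    tracked = ("cpu", "memory")  # only these two resources affect the QoS class
    any_set = False
    guaranteed = True
    for container in containers:
        resources = container.get("resources", {})
        limits = {k: v for k, v in resources.get("limits", {}).items() if k in tracked}
        requests = {k: v for k, v in resources.get("requests", {}).items() if k in tracked}
        if limits or requests:
            any_set = True
        for name in tracked:
            # An omitted request defaults to the limit for that resource.
            request = requests.get(name, limits.get(name))
            if name not in limits or request != limits[name]:
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


same = {"cpu": "250m", "memory": "256Mi"}
print(qos_class([{"resources": {"requests": same, "limits": same}}]))  # Guaranteed
print(qos_class([{"resources": {"requests": same,
                                "limits": {"cpu": "500m", "memory": "512Mi"}}}]))  # Burstable
print(qos_class([{"name": "no-resources"}]))  # BestEffort
```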

Specializing Instance Groups for Your Cluster

If you are interested in fine-tuning the instances that your workloads run on, you can use different instance group types and node labels to steer workloads onto specific instance types.

Different business systems often have different-sized resource needs, along with specialized hardware requirements (such as GPUs). Node labels in Kubernetes allow you to tag each of your nodes with key-value pairs. Pods, meanwhile, can be configured with a “nodeSelector” that matches specific node labels, which determines which nodes a pod can be scheduled onto. By utilizing instance groups of different instance types with appropriate labeling, you can mix and match the underlying hardware available from your cloud provider of choice with your workloads in Kubernetes.
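The matching rule behind `nodeSelector` is simple: a node is eligible only if it carries every label the pod asks for. The sketch below models just that rule. The node names and the `workload-type` and `lifecycle` labels are hypothetical, and the real scheduler also weighs resource requests, taints, and affinity rules that are omitted here.

```python
def matches_node_selector(node_labels, node_selector):
    """True if the node has every key/value pair the selector requires."""
    return all(node_labels.get(key) == value for key, value in node_selector.items())


def eligible_nodes(nodes, pod_spec):
    """Names of nodes whose labels satisfy the pod's nodeSelector."""
    selector = pod_spec.get("nodeSelector", {})
    return [name for name, labels in nodes.items()
            if matches_node_selector(labels, selector)]


# Hypothetical nodes and labels, for illustration only.
nodes = {
    "node-a": {"workload-type": "gpu"},
    "node-b": {"workload-type": "general"},
    "node-c": {"workload-type": "general", "lifecycle": "spot"},
}

# A pod that should only land on interruptible (spot) capacity.
pod_spec = {
    "nodeSelector": {"lifecycle": "spot"},
    "containers": [{"name": "batch", "image": "example.com/batch:1.0"}],
}

print(eligible_nodes(nodes, pod_spec))  # ['node-c']
```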

If you have different-sized workloads with different requirements, it can make sense strategically and economically to place those workloads on different instance types and use labels to steer your workloads onto those different instance types.

Spot instances tie into this idea. Most organizations are familiar with paying for instances on demand or on reserved terms over fixed durations. However, if you have workloads that can be interrupted, you may want to consider using spot instances. These instance types allow you to make use of the cloud provider’s leftover capacity at a significant discount, at the risk of your instance being terminated when the demand for regular on-demand instances rises.


If the risk of random instance termination is something that some of your business workloads can tolerate, you can use the same concept of node labeling to specifically schedule those workloads onto these types of instance groups and gain substantial savings.

How to Enable Kubernetes Resource Recommendations

Goldilocks is one of the tools Fairwinds Insights deploys to provide workload efficiency and performance optimizations. With Fairwinds Insights, Goldilocks can be deployed across multiple clusters so information is available to teams in a single pane of glass. Fairwinds Insights adds data and recommendations to Goldilocks, including potential cost savings. The dashboard that appears includes a list of namespaces and deployments with average total cost and cost recommendations.

Many organizations set their CPU and memory requests and limits too high, so when they apply the recommendations from Fairwinds Insights, they are able to put more pods on fewer Kubernetes worker nodes. When Cluster Autoscaler is enabled, any extra nodes are removed when they are unused, which saves time and money.
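The arithmetic behind "more pods on fewer nodes" can be sketched with a simple packing estimate. All figures below are hypothetical and chosen only to illustrate the effect; real scheduling also has to account for system reservations, daemonsets, and uneven pod sizes.

```python
import math


def nodes_needed(pod_count, cpu_request_m, mem_request_mi, node_cpu_m, node_mem_mi):
    """Estimate node count when identical pods are packed by their requests."""
    # A node fits as many pods as its scarcer resource allows.
    pods_per_node = min(node_cpu_m // cpu_request_m, node_mem_mi // mem_request_mi)
    if pods_per_node == 0:
        raise ValueError("a single pod does not fit on this node size")
    return math.ceil(pod_count / pods_per_node)


# Hypothetical cluster: 40 pods on nodes with 4000m CPU and 16384Mi memory allocatable.
before = nodes_needed(40, cpu_request_m=1000, mem_request_mi=2048,
                      node_cpu_m=4000, node_mem_mi=16384)
after = nodes_needed(40, cpu_request_m=250, mem_request_mi=512,
                     node_cpu_m=4000, node_mem_mi=16384)
print(before, after)  # 10 3
```

In this made-up case, right-sizing the requests lets the same 40 pods fit on 3 nodes instead of 10, and Cluster Autoscaler can then remove the nodes that are left empty.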

Using software like Fairwinds Insights or open source tools like Goldilocks empowers developers to remove the guesswork by automating the recommendation process for them. In turn, it opens the door for you to increase the efficiency of your cluster and reduce your cloud spend.

Guest post originally published on Fairwinds’s blog by Andy Suderman, Lead R&D engineer at Fairwinds
Source CNCF

