Quota & Limits on Kubernetes

Venkat Kapisetti
1 min read · Aug 30, 2020

This was back in 2017–18: we had promoted an application to run in OpenShift prod, and it ran fine for a couple of months with no issues, until our ops folks got paged.

At a very high level, the OpenShift/Kubernetes nodes were going into a NotReady state and the API kept alerting that a node was down. Drilling down, we found that memory pressure on the node was reaching 100%, which is what pushed the node into the NotReady state.

We started analyzing the app and found a memory leak in the JVM that was causing the application to consume all the RAM on the node. In the cloud-native world, we cannot let the platform fail because of a rogue container.

Rogue Container/Pod: We cannot have a rogue workload impacting the availability of a large multi-tenant cluster. Implementing a ResourceQuota at the namespace level and a LimitRange at the Pod/container level on Kubernetes solved the problem.

At the namespace level, a ResourceQuota caps the total CPU, memory, and number of Pods the namespace can consume:

apiVersion: v1
items:
- apiVersion: v1
  kind: ResourceQuota
  metadata:
    creationTimestamp: null
    name: app1
  spec:
    hard:
      limits.cpu: "8"
      limits.memory: 16Gi
      pods: "20"
      requests.cpu: "6"
      requests.memory: 10Gi
  status: {}
kind: List
metadata: {}

At the Pod/container level, a LimitRange bounds how much any single Pod can request or consume:

apiVersion: v1
items:
- apiVersion: v1
  kind: LimitRange
  metadata:
    creationTimestamp: null
    name: resource-limits
  spec:
    limits:
    - max:
        cpu: "2"
        memory: 4Gi
      min:
        cpu: 20m
        memory: 10Mi
      type: Pod
kind: List
metadata: {}
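
For completeness, here is a minimal Deployment sketch showing how an individual workload then declares its own requests and limits so that it fits inside the LimitRange and counts against the namespace ResourceQuota. The names, image, and values below are hypothetical, not from our actual app:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: app1-api                # hypothetical workload name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: app1-api
  template:
    metadata:
      labels:
        app: app1-api
    spec:
      containers:
      - name: app1-api
        image: registry.example.com/app1-api:1.0   # hypothetical image
        resources:
          requests:
            cpu: 500m           # counts against requests.cpu in the namespace quota
            memory: 1Gi         # counts against requests.memory in the namespace quota
          limits:
            cpu: "1"            # within the LimitRange max of 2 CPUs per Pod
            memory: 2Gi         # within the LimitRange max of 4Gi per Pod

With limits in place, a leaking JVM gets OOM-killed when it hits its own memory limit instead of starving the node, and anything that tries to exceed the LimitRange or the namespace quota is rejected at admission time.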

Not a single resource gets deployed into prod without a quota & limit, and with that the platform's availability is no longer at the mercy of a rogue container. I have not discussed how we found the memory leak; let's park that for the next article.
