Why does docker inspect show OOMKilled: false when the exit code is 137?

Because there are two OOM killers and only one sets the flag. Docker marks OOMKilled: true only when its cgroup memory limit triggered the kill. If you set no --memory limit, or the host itself ran out of memory, the host kernel's global OOM killer does the killing directly and Docker never sees the event — so it reports false even though memory was the cause. Check dmesg -T | grep -i oom for the truth.

How do I fix a Docker container that keeps getting OOMKilled?

Measure the real working set with docker stats under load, set a container memory limit a margin above that plateau, and then tell your runtime about the limit — --max-old-space-size for Node, worker sizing and MALLOC_ARENA_MAX for Python, -XX:MaxRAMPercentage for the JVM. Keep the runtime's heap ceiling at roughly 75–80% of the container limit so there's room for non-heap memory.

Why does raising the memory limit not stop exit code 137?

Two reasons. If your runtime is cgroup-unaware (common with Node and Python), it sizes its heap for the host's total RAM and never feels pressure from your limit, so a bigger number just delays the same crash. Or you have a genuine memory leak, in which case no limit is high enough — you've only made the crash slower and harder to diagnose.

Is exit code 137 always a memory problem?

Usually, but not always. 137 just means SIGKILL. The other senders are docker stop escalating from SIGTERM after its timeout, a failed health check in some orchestrators, or a manual kill -9. Rule those out by checking the container logs and whether a stop/health event preceded the kill — if none did, it's memory.

Docker Exit Code 137: Why It's OOMKilled (and Not)

Your container crashes. docker ps shows it exited with code 137. You check docker inspect expecting confirmation it ran out of memory — and it says OOMKilled: false. The host has 32 GB free. So what killed it? Exit code 137 is one of the most misread signals in Docker: it almost always means something sent your process SIGKILL, usually for memory — but the flag that's supposed to confirm it lies more often than it tells the truth. Here's what actually kills a container with 137, why your runtime never saw the limit coming, and how to stop it.

First time this bit me was in a high-frequency trading SaaS we shipped — container exited with 137 but OOMKilled showed false. I wasted an entire day chasing a flaky health check before realizing the host kernel had silently killed it.

TL;DR

Exit code 137 means SIGKILL, and for a container that dies on its own, it's almost always memory. The catch: OOMKilled: false doesn't mean it wasn't memory; it means the host kernel killed it instead of Docker's cgroup. The repeat-offender cause is a cgroup-unaware runtime (Node, Python, Go) that sizes itself for the full host RAM, not the container limit. Fix: tell the runtime the truth (--max-old-space-size, worker sizing, GOMEMLIMIT) and keep the runtime ceiling at 75–80% of the container limit.

What exit code 137 actually means

Start with the number, because it's not arbitrary. When a Linux process is terminated by a signal, its exit code is 128 + signal number. Signal 9 is SIGKILL. So 128 + 9 = 137.

That's the entire meaning of 137: something sent your process SIGKILL. Not SIGTERM (the polite “please shut down” your app can catch and handle) but SIGKILL, the one no process can trap, ignore, or clean up after. The kernel just stops scheduling it and reclaims its pages.
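
The arithmetic is easy to check in code. The sketch below is illustrative only (the helper name `describe_exit_code` is made up for this example) and assumes a Linux host, where signal 9 is SIGKILL and signal 15 is SIGTERM:

```python
import signal


def describe_exit_code(code: int) -> str:
    """Translate a container exit code into a human-readable cause."""
    if code > 128:
        signum = code - 128  # signal deaths are reported as 128 + N
        try:
            name = signal.Signals(signum).name
        except ValueError:
            name = f"signal {signum}"
        return f"killed by {name}"
    return f"exited on its own with status {code}"


print(describe_exit_code(137))  # killed by SIGKILL
print(describe_exit_code(143))  # killed by SIGTERM
```

Exit code 143 is the one to hope for: it means the process received SIGTERM and had a chance to shut down cleanly.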

SIGKILL has a short list of senders:

The kernel's OOM killer, when the system or a cgroup is out of memory.
Docker / your orchestrator, when docker stop hits its timeout and escalates from SIGTERM to SIGKILL.
A failed health check in some orchestrators, which kills and restarts the container.
A human or script running kill -9.

In practice, for a container that dies on its own, it's almost always memory. The other causes are real but rarer, and they're easy to rule out — which is the next step.

Why `docker inspect` says OOMKilled: false — the two different OOM killers

Here's the part that sends people in circles. You'd expect a memory kill to set the flag:

inspect.sh (read the real exit state)

docker inspect my-container \
  --format '{{.State.ExitCode}} {{.State.OOMKilled}} {{.State.FinishedAt}}'
# → 137 false 2026-05-27T18:42:11Z

Exit code 137, OOMKilled: false. Contradiction? No — there are two different OOM killers, and only one of them sets that flag.

Docker's cgroup OOM killer. If you set a container memory limit (--memory) and the container exceeds it, the cgroup memory controller triggers the kill and Docker records OOMKilled: true. This is the clean case.
The host kernel's global OOM killer. When the host runs low on memory, or when a process is killed for a reason Docker didn't mediate, the Linux kernel picks a victim and SIGKILLs it directly. Docker never sees the cgroup event, so it reports OOMKilled: false even though the cause was absolutely memory.

[Figure: four checks to walk in order. The OOMKilled: false branch is where most engineers stop looking too early.]

The takeaway: OOMKilled: false does not mean “not a memory problem.” It usually means you either didn't set a container memory limit at all (so the host kernel did the killing instead of the cgroup), or the host itself is under memory pressure. The flag tells you which killer fired, not whether memory was the cause.

This single flag has probably cost my teams more debugging hours than any other Docker gotcha. I've seen senior engineers burn days assuming it wasn't memory-related, only to discover the host kernel OOM killer had stepped in because no container limit was set.

Tip

Always cross-check with the kernel log. dmesg -T | grep -i -E "killed process|oom" shows the OOM killer's victim, its PID, and how much memory it was using at the moment of death. That is the ground truth docker inspect can't give you when the host kernel did the killing.
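
If you collect these lines from many hosts, a small parser saves squinting at kilobyte counts. This is a rough sketch, not a complete parser: the sample line is illustrative, `parse_oom_kill` is a made-up helper, and the exact fields vary between kernel versions, so the pattern matches only the parts it uses.

```python
import re

# Matches the kernel's "Killed process" line. Only the PID, command name,
# and anon-rss fields are captured; everything else is skipped.
KILL_LINE = re.compile(
    r"Killed process (?P<pid>\d+) \((?P<comm>[^)]+)\).*?anon-rss:(?P<rss_kb>\d+)kB"
)


def parse_oom_kill(line: str):
    """Return (pid, command, anon_rss_mib) for an OOM kill line, else None."""
    match = KILL_LINE.search(line)
    if match is None:
        return None
    return (int(match["pid"]), match["comm"], int(match["rss_kb"]) / 1024)


# Illustrative sample, not copied from a real host.
sample = (
    "[Wed May 27 18:42:11 2026] Out of memory: Killed process 12345 (node) "
    "total-vm:1843200kB, anon-rss:524288kB, file-rss:0kB, shmem-rss:0kB"
)
print(parse_oom_kill(sample))  # (12345, 'node', 512.0)
```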

The real cause: your runtime has no idea what the cgroup limit is

Set the memory limit, get OOMKilled: true, raise the limit, and it comes back anyway. At this point most people conclude they have a memory leak. Sometimes they do. More often, the runtime is sizing itself for a machine that doesn't exist.

A container is not a VM. It's a process tree with a cgroup memory limit wrapped around it. But most language runtimes, when they boot, ask the kernel how much memory the machine has — and the kernel answers with the host's total RAM, not the cgroup limit. The runtime then sizes its heap, buffer pools, and worker counts for that number.

[Figure: a Node or Python runtime inside a 512 MB cgroup reads the host's 32 GB of RAM, sizes its heap for 32 GB, grows past the limit, and is OOM-killed with exit code 137. Caption: the runtime sizes itself for 32 GB it can't use, then walks straight past the 512 MB cap; the limit was never the bug.]

A cgroup-unaware process reads 32 GB, decides it has plenty of headroom, and grows happily past your 512 MB limit. The cgroup controller kills it the moment it crosses the line. You raise the limit to 1 GB; the runtime still thinks it has 32 GB, still has no reason to run the garbage collector aggressively, and walks right past 1 GB too. The limit isn't the bug. The mismatch between what you capped and what the runtime believes is the bug.
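
You can see the mismatch from inside a container. The sketch below assumes Linux and a cgroup v2 mount at the usual path; the function names are made up for illustration. It prints the number a naive runtime sees next to the number that is actually enforced.

```python
import os


def host_total_bytes() -> int:
    """Physical RAM as the kernel reports it, which is what a naive runtime sees."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")


def cgroup_limit_bytes(path: str = "/sys/fs/cgroup/memory.max"):
    """cgroup v2 memory limit in bytes, or None if unlimited or unreadable."""
    try:
        with open(path) as f:
            raw = f.read().strip()
    except OSError:
        return None
    return None if raw == "max" else int(raw)


host = host_total_bytes()
limit = cgroup_limit_bytes()
limit_text = "none" if limit is None else f"{limit / 2**20:.0f} MiB"
print(f"host RAM:     {host / 2**20:.0f} MiB")
print(f"cgroup limit: {limit_text}")
```

Run inside a container started with --memory=512m on a 32 GB host, the two lines should disagree by a factor of about 64. That gap is the bug.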

How much memory your container thinks it has

This is where the fix has to be runtime-specific, because each runtime gets it wrong differently.

Node.js has a V8 heap with a default old-space ceiling that is not derived from your cgroup limit. Left alone, V8 will let the old space grow toward its default cap (gigabytes on a 64-bit build) regardless of the 512 MB container limit — so V8 never feels memory pressure, never runs a full GC in time, and the cgroup kills the process first. You have to tell V8 the truth:

Dockerfile (Node — cap the heap below the container limit)

# Container limited to 512MB → give V8 ~75-80% and leave room for
# non-heap memory (buffers, native addons, the runtime itself).
ENV NODE_OPTIONS="--max-old-space-size=384"

Python has no single heap knob, and two production traps that JVM-centric guides never mention. First, multiprocessing.cpu_count() (and anything built on it, like Gunicorn's “workers = 2 × cores + 1” rule) reads the host's core count: on a 32-core host your 512 MB container tries to fork 65 workers, each with its own interpreter and memory, and the cgroup kills it instantly. Size workers from the limit, not the host:

gunicorn_conf.py (Python — size workers to the container, not the host)

import os

CGROUP_V2 = "/sys/fs/cgroup/memory.max"
CGROUP_V1 = "/sys/fs/cgroup/memory/memory.limit_in_bytes"

# Values larger than this are effectively "unlimited"
UNLIMITED_THRESHOLD = 1 << 60  # ~1 exabyte


def cgroup_mem_limit_bytes():
    for path in (CGROUP_V2, CGROUP_V1):
        try:
            with open(path) as f:
                raw = f.read().strip()

            # cgroup v2 unlimited
            if raw == "max":
                return None

            value = int(raw)

            # Some runtimes expose absurdly large numbers instead
            if value >= UNLIMITED_THRESHOLD:
                return None

            return value

        except (FileNotFoundError, ValueError):
            continue

    return None


limit = cgroup_mem_limit_bytes()

# Rough baseline:
# - sync workers: ~150–300MB each for typical Python apps
# - tune based on your actual RSS under load
MEM_PER_WORKER = 200 * 1024 * 1024

workers = max(1, limit // MEM_PER_WORKER) if limit else 2
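
To make the gap concrete, here is the same arithmetic for the 32-core host and 512 MB container from the example above. The helper names are illustrative, and the 200 MiB per-worker figure is the same rough baseline as in the config, not a measured value.

```python
def workers_from_cores(cores: int) -> int:
    """The common '2 x cores + 1' rule, which ignores memory entirely."""
    return 2 * cores + 1


def workers_from_memory(limit_bytes: int, per_worker_bytes: int) -> int:
    """Workers that fit inside the container limit, never fewer than one."""
    return max(1, limit_bytes // per_worker_bytes)


MIB = 1024 * 1024
by_cores = workers_from_cores(32)
by_memory = workers_from_memory(512 * MIB, 200 * MIB)

print(by_cores)        # 65
print(by_memory)       # 2
print(by_cores * 200)  # 13000 (MiB the core-based rule would need)
```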

Second, Python on glibc can show alarming RSS growth that looks like a leak but is allocator fragmentation: glibc's malloc keeps per-thread arenas that inflate resident memory under concurrency. Capping the arenas often drops RSS enough to stop the kills:

Dockerfile (Python — tame glibc arena fragmentation)

ENV MALLOC_ARENA_MAX=2

Go gets it more right than Node and Python, but large hosts still trip it up. Without GOMEMLIMIT, Go's GC will happily let the heap drift past your container limit on a 128 GB host before triggering a full collection. Since Go 1.19, you can fix this with one env var:

Dockerfile (Go — tell the GC its ceiling)

# GOMEMLIMIT is advisory: GC applies backpressure before this.
# Set to ~90% of the container limit — Go's non-heap overhead is small.
ENV GOMEMLIMIT=460MiB

It's a soft ceiling, not a hard cap — Go can exceed it briefly in bursts — so keep the container --memory limit 10–15% above this.

For completeness, the JVM solved this years ago: it's container-aware by default on modern JDKs, and you size the heap as a percentage of the limit with -XX:MaxRAMPercentage=75.0. If you're on Node or Python, you have to do that math yourself.
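
The per-runtime math is small enough to generate rather than hand-copy. This sketch derives the settings above from a single container limit. The function name is made up, the 75% and 90% ratios are the rules of thumb from this article rather than universal constants, and passing the JVM flag through JAVA_TOOL_OPTIONS is an assumption about how you inject options.

```python
def runtime_settings(limit_mib: int) -> dict:
    """Suggested runtime ceilings for a given container limit, in MiB."""
    return {
        # V8 takes the old-space size as a plain number of MiB
        "NODE_OPTIONS": f"--max-old-space-size={int(limit_mib * 0.75)}",
        # Go accepts a unit suffix; 90% leaves room for non-heap overhead
        "GOMEMLIMIT": f"{int(limit_mib * 0.90)}MiB",
        # The JVM takes a percentage and reads the cgroup limit itself
        "JAVA_TOOL_OPTIONS": "-XX:MaxRAMPercentage=75.0",
    }


for name, value in runtime_settings(512).items():
    print(f"{name}={value}")
# NODE_OPTIONS=--max-old-space-size=384
# GOMEMLIMIT=460MiB
# JAVA_TOOL_OPTIONS=-XX:MaxRAMPercentage=75.0
```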

Debugging a 137: the commands that find the real killer

Don't guess. Walk the evidence in order — each command rules out a cause.

debug-137.sh (the order that actually narrows it down)

# 1. Confirm the exit code and which killer fired
docker inspect my-container \
  --format '{{.State.ExitCode}} {{.State.OOMKilled}}'

# 2. If OOMKilled is false, ask the kernel directly
dmesg -T | grep -i -E "killed process|out of memory"
# Look for: "Killed process 12345 (node) total-vm:..., anon-rss:524288kB"
# anon-rss is how much RAM it was using when it died.

# 3. Watch live memory vs the limit (reproduce while this runs)
docker stats my-container
# MEM USAGE / LIMIT  →  511MiB / 512MiB means you're pinned at the cap
# (add --no-stream for a single snapshot instead of a live view)

# 4. What limit is actually in effect?
docker inspect my-container --format '{{.HostConfig.Memory}}'
# 0 means NO limit set → the host kernel is your only backstop

# 5. Rule out the polite-shutdown-turned-violent case
docker logs my-container | tail   # SIGTERM ignored → docker stop SIGKILLs after timeout

If step 4 returns 0, that alone explains an OOMKilled: false 137: with no cgroup limit, the container can grow until the host is starved and the global OOM killer steps in. Set a limit and the failure at least becomes legible — OOMKilled: true, killed at a number you chose.
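
Steps 1 and 4 can be folded into one triage function that reads the JSON `docker inspect` prints by default. This is a sketch of the decision logic only. The function name and messages are made up; the fields it reads (State.ExitCode, State.OOMKilled, HostConfig.Memory) are the same ones used in the commands above.

```python
import json


def diagnose(inspect_json: str) -> str:
    """Classify a dead container from the JSON printed by `docker inspect`."""
    info = json.loads(inspect_json)[0]  # docker inspect prints a JSON array
    state = info["State"]
    limit = info["HostConfig"]["Memory"]  # bytes; 0 means no limit

    if state["ExitCode"] != 137:
        return "not a SIGKILL exit"
    if state["OOMKilled"]:
        return f"cgroup OOM kill at the {limit // 2**20} MiB limit"
    if limit == 0:
        return "SIGKILL with no memory limit set: check dmesg for a host OOM kill"
    return "SIGKILL without a cgroup OOM event: check dmesg, docker stop, health checks"


# Trimmed, illustrative inspect output for a container with no limit.
sample = json.dumps(
    [{"State": {"ExitCode": 137, "OOMKilled": False}, "HostConfig": {"Memory": 0}}]
)
print(diagnose(sample))
# SIGKILL with no memory limit set: check dmesg for a host OOM kill
```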

I still remember one production incident where docker stats showed the container pinned at 511MiB/512MiB right before it died. That single snapshot told us everything — the runtime was completely unaware of the limit and had grown aggressively until the cgroup killed it.

Why “just raise the memory limit” usually doesn't fix it

Raising the limit is the first thing everyone tries, and it's right often enough to be dangerous. It buys time when you were genuinely a little under-provisioned. It does nothing in the two cases that actually cause most repeat 137s:

A cgroup-unaware runtime. Covered above: the process doesn't know about the old limit or the new one, so a bigger number just moves the cliff further out. It'll walk to that one too.
A real leak. If memory grows monotonically with uptime or request count, no limit is high enough. You've turned a fast crash into a slow one and made it harder to diagnose because it now takes hours instead of minutes.

Raising the limit is the correct fix in exactly one situation: the runtime is correctly sized, memory is stable (it plateaus, doesn't climb), and the plateau simply sits a bit above your cap. Then you were under-provisioned, and a higher limit (or right-sizing your ECS task or VM) is the answer. Sizing container memory deliberately is the flip side of the same problem we walk through in how we reduced an AWS bill by 40% without rewriting the application, where over- and under-provisioning both cost you.

The honest rule: if you don't know whether memory is stable or climbing, you're not ready to change the limit. Watch docker stats across a load cycle first.
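
"Stable or climbing" can be judged from a handful of evenly spaced readings. The sketch below is one possible heuristic, not a standard method: it fits a straight line through the samples and calls the series climbing if the fitted rise across the window exceeds 10% of the mean. Both the function name and the 10% tolerance are arbitrary choices for illustration.

```python
def memory_trend(samples_mib, tolerance=0.10):
    """Label evenly spaced memory samples as 'stable' or 'climbing'."""
    n = len(samples_mib)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(samples_mib) / n
    # Least-squares slope of usage against sample index
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(samples_mib))
    variance = sum((x - mean_x) ** 2 for x in range(n))
    growth = (covariance / variance) * (n - 1)  # fitted rise over the window
    return "climbing" if growth > tolerance * mean_y else "stable"


print(memory_trend([400, 405, 398, 402, 401, 399]))  # stable
print(memory_trend([300, 320, 340, 360, 380, 400]))  # climbing
```

Take the samples after warmup has finished; a startup ramp looks exactly like a leak to a check this simple.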

We once raised the limit from 512M to 2G in a Python service thinking it would solve the problem. The container lasted 40 minutes longer before crashing again. The real fix came only after we capped the number of Gunicorn workers based on actual cgroup memory.

Fixing it: align the runtime to the limit

The durable fix is a loop, not a single setting. Measure, align, cap, verify.

Measure the real working set. Run under realistic load and watch docker stats. Note where memory plateaus — that's your floor.
Set the container limit a margin above the plateau (a working set of ~400 MB → a 512 MB limit). In Compose:
compose.yaml (set the limit explicitly)
```
services:
  api:
    image: myapp:latest
    deploy:
      resources:
        limits:
          memory: 512M
```
Tell the runtime that limit: --max-old-space-size for Node, worker math + MALLOC_ARENA_MAX for Python, MaxRAMPercentage for the JVM. The runtime's ceiling must sit below the container limit, with headroom for non-heap memory (native buffers, thread stacks, the runtime itself). Heap = container limit is a guaranteed 137; aim for 75–80%.
Verify under load, then check it didn't just move the problem onto a heavier garbage-collection cost or a slower query path.
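
For the verify step, it helps to turn the MEM USAGE / LIMIT column into a number you can alert on. A minimal sketch, assuming the binary-unit strings that docker stats prints (for example via --format '{{.MemUsage}}'); the helper name is made up.

```python
import re

UNITS = {"B": 1, "KiB": 2**10, "MiB": 2**20, "GiB": 2**30}
SIZE = re.compile(r"([\d.]+)\s*(B|KiB|MiB|GiB)")


def usage_fraction(mem_usage: str) -> float:
    """Turn a 'MEM USAGE / LIMIT' cell such as '511MiB / 512MiB' into 0..1."""
    used, limit = (
        float(value) * UNITS[unit] for value, unit in SIZE.findall(mem_usage)
    )
    return used / limit


print(usage_fraction("256MiB / 1GiB"))             # 0.25
print(usage_fraction("511MiB / 512MiB") > 0.95)    # True
```

A container that sits above roughly 0.95 under normal load has no headroom left, whatever the runtime settings say.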

When the container keeps restarting — the OOM loop

A single 137 is a bug. A container that keeps restarting with 137 is usually a self-inflicted loop, and it's worth recognizing the shape because it pages people at 3 a.m.

[Figure: state diagram of the Docker OOM restart loop. A container either reaches Running and crosses the cgroup memory limit (exit 137), or dies during warmup before ever reaching Running; the restart policy then sends it back to Starting, creating an infinite crash loop. Caption: the dashed arc is the warmup shortcut that never reaches Running. Both paths lead to OOMKilled, restart fires, repeat.]

The loop happens when the work that triggers the OOM also happens during startup: loading a large model, warming a cache, reading a big file into memory. The container boots, allocates past the limit while warming up, gets killed at 137, and the restart: always policy (or your orchestrator) dutifully starts it again into the exact same wall. Now it's flapping, burning CPU on repeated cold starts, and your health checks never go green.

Two things break the loop: fix the startup allocation (stream the file, lazy-load the model, raise the limit enough to survive warmup) and set a backoff so a crashing container doesn't hammer restarts. restart: on-failure with a sane max, rather than always, at least stops the tight spin.
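
"Stream the file" usually means replacing one big read with an iterator. The sketch below assumes the warmup data is a newline-delimited JSON file, which is an assumption made for illustration; `iter_records` is a made-up helper.

```python
import json
from typing import Iterator


def iter_records(path: str) -> Iterator[dict]:
    """Yield one JSON record per line without loading the whole file."""
    with open(path, encoding="utf-8") as f:
        for line in f:  # the file object reads lazily, line by line
            if line.strip():
                yield json.loads(line)
```

Peak memory during warmup is then bounded by the largest single record plus whatever you choose to keep, instead of by the size of the file, as it would be with `json.load` or `f.read()`.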

One of our edtech services entered a brutal restart loop after we introduced a large in-memory cache warmup. It took us embarrassingly long to realize the OOM was happening before the health check could ever pass.

One thing that compounds the loop silently: a large image. When a restart lands on a host that doesn't have the image cached (a rescheduled pod, a replacement task on a fresh host), the full 1 GB is pulled again before the container can even start, and that pull time is added to every recovery. Keeping image size down, covered in how to reduce Docker image size, shortens the flap window and gets the service back to healthy faster.

137s in Kubernetes and ECS

In Docker, the cgroup does the killing and docker inspect tells you what happened. In production orchestrators, both of those assumptions break.

Kubernetes introduces a second OOM path that's easy to miss. A container can get a 137 because it hit its own limits, or because the node ran out of memory and the kubelet started evicting pods to save itself. These look identical from the container's perspective. OOMKilled: true doesn't tell you which.

The eviction order follows QoS class. Guaranteed pods, where requests == limits, are evicted last. Burstable pods go before them. BestEffort pods (no requests or limits set) go first. In practice: set requests and limits equal for memory on every critical service. Yes, it reserves capacity on lightly-loaded nodes. The alternative is watching your pod get evicted during a traffic spike because a neighbor consumed the node's RAM.
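
The classification is mechanical, so it can be checked before you deploy. The sketch below approximates the rule with a made-up helper. It compares quantities as plain strings, which is a simplification ("500m" and "0.5" would not match), and it follows the Kubernetes requirement that Guaranteed needs both CPU and memory set with requests equal to limits in every container.

```python
def qos_class(containers: list) -> str:
    """Approximate the Kubernetes QoS class from container resource specs.

    Each container is a dict like {"requests": {...}, "limits": {...}}
    with optional "cpu" and "memory" keys.
    """
    any_set = False
    all_guaranteed = True
    for container in containers:
        requests = container.get("requests", {})
        limits = container.get("limits", {})
        for resource in ("cpu", "memory"):
            limit = limits.get(resource)
            request = requests.get(resource, limit)  # requests default to limits
            if limit is not None or request is not None:
                any_set = True
            if limit is None or request != limit:
                all_guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if all_guaranteed else "Burstable"


pinned = {"requests": {"cpu": "500m", "memory": "512Mi"},
          "limits": {"cpu": "500m", "memory": "512Mi"}}
loose = {"requests": {"memory": "256Mi"}, "limits": {"memory": "512Mi"}}

print(qos_class([pinned]))  # Guaranteed
print(qos_class([loose]))   # Burstable
print(qos_class([{}]))      # BestEffort
```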

docker inspect is useless here — the pod has already been replaced. Check the orchestrator:

k8s-oom.sh (find the real kill reason)

# Last terminated state of the container
kubectl describe pod <pod-name> | grep -A 10 "Last State"
# Reason: OOMKilled, Exit Code: 137

# Check whether node memory pressure triggered the eviction
kubectl describe node <node-name> | grep -A 5 "MemoryPressure"

Amazon ECS splits into two different enforcement models depending on compute type.

On ECS + EC2, memory enforcement works like plain Docker: the cgroup limit is set by the task definition's container-level memory field. If a container exceeds it, the cgroup kills it: 137 with OOMKilled: true. But if you set memory only at the task level and not per-container, Docker treats it as a soft limit. One greedy container can eat into its neighbors' headroom and push the entire EC2 host toward the kernel OOM killer, back to 137 with OOMKilled: false. Always set memory at the container level.

On ECS + Fargate, the enforcement is stricter. Fargate reserves the task memory at the infrastructure layer; exceeding it terminates the task. There's no overcommit: you pay for every MB you declare, and you get killed faster if you're under-provisioned. dmesg doesn't exist in your context. The stop reason lives in ECS events:

ecs-oom.sh (Fargate task stop reason)

aws ecs describe-tasks \
  --cluster my-cluster \
  --tasks <task-arn> \
  --query 'tasks[0].{stop:stoppedReason,containers:containers[].{name:name,exit:exitCode,reason:reason}}'
# exitCode 137 in containers[] confirms the OOM kill
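
If you save the full describe-tasks JSON instead of using --query, the same filter is a few lines of code. The sketch below reads only the fields the command above already uses (stoppedReason, name, exitCode, reason). The function name and the sample values are made up for illustration.

```python
def killed_containers(response: dict) -> list:
    """List (task stop reason, container name, container reason) for 137 exits."""
    hits = []
    for task in response.get("tasks", []):
        for container in task.get("containers", []):
            if container.get("exitCode") == 137:
                hits.append(
                    (
                        task.get("stoppedReason"),
                        container.get("name"),
                        container.get("reason"),
                    )
                )
    return hits


# Illustrative response, trimmed to the fields used here.
sample = {
    "tasks": [
        {
            "stoppedReason": "Essential container in task exited",
            "containers": [
                {"name": "api", "exitCode": 137, "reason": "out of memory"},
                {"name": "sidecar", "exitCode": 0},
            ],
        }
    ]
}
print(killed_containers(sample))
# [('Essential container in task exited', 'api', 'out of memory')]
```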

Across both orchestrators: set container-level limits explicitly, and check the platform's own event log first — not docker inspect.

When NOT to cap container memory

Limits are good defaults, but a wrong limit causes the exact crash you're trying to prevent. Skip or loosen the cap when:

You haven't measured the working set yet. A limit pulled from a round number instead of docker stats under load is just a randomly placed cliff. Measure first, cap second.
The workload is legitimately spiky. Batch jobs, report generation, and ETL steps can have a working set many times their idle memory. Capping to the idle number guarantees a 137 the first time real data shows up. Size for the peak or run these unconstrained on a dedicated host.
It's a single-tenant host running one workload. If the box exists to run this one container, a tight cgroup limit adds a failure mode without adding isolation you need — the host limit already bounds it.
The workload loads a large model at startup. LLM inference and embedding services load multi-GB model weights before serving a single request. That startup footprint is the working set; it doesn't grow with traffic. Cap below the model size and you get a 137 before the first request completes. Size to the model, not your intuition about what feels reasonable.

The point of a memory limit is isolation between noisy neighbors and a legible failure mode — not micro-optimizing RAM. If a limit isn't buying you either, a too-tight one is pure downside.

Some teams I've worked with develop a religious attachment to tiny memory limits. They end up with constant OOM kills on legitimate batch jobs. A memory limit should serve a purpose — isolation or predictability — not become a source of self-inflicted pain.

Summary

Exit code 137 means SIGKILL (128 + 9). For a container that dies on its own, it's almost always memory.
OOMKilled: false is not “not memory.” It means the host kernel killed it (no limit set, or host pressure), not Docker's cgroup killer. Confirm with dmesg | grep -i oom.
The repeat-offender cause is a cgroup-unaware runtime. Node and Python read the host's RAM, size themselves for a machine they don't have, and blow past your limit. Raising the limit just moves the cliff.
Align the runtime to the limit: --max-old-space-size (Node), worker-sizing + MALLOC_ARENA_MAX (Python), MaxRAMPercentage (JVM), GOMEMLIMIT (Go). Keep the runtime ceiling at ~75–80% of the container limit.
A restart loop = the OOM happens during warmup. Fix the startup allocation and add restart backoff; don't let restart: always hammer it.
Measure before you cap. A limit set without watching docker stats under load is a randomly placed cliff.

My rule after shipping dozens of services is simple: if you haven't measured the working set under load and aligned the runtime to the cgroup limit, the container isn't ready to ship.

Frequently Asked Questions

What does exit code 137 mean in Docker?

It means the container's main process was terminated by SIGKILL (signal 9). Linux reports signal deaths as 128 + signal number, and 128 + 9 = 137. For a container that exits on its own, the cause is almost always an out-of-memory kill, either from Docker's cgroup limit or from the host kernel's OOM killer.
