πŸ”₯ Kubectl Prof


Profile your Kubernetes applications with zero overhead and zero modifications πŸš€

kubectl-prof is a powerful kubectl plugin that enables low-overhead profiling of applications running in Kubernetes environments. Generate FlameGraphs, JFR files, thread dumps, heap dumps, and many other diagnostic outputs without modifying your pods.

✨ Key Features:

  • 🎯 Zero modification - Profile running pods without any changes to your deployment
  • 🌐 Multi-language support - Java, Go, Python, Ruby, Node.js, Rust, Clang/Clang++, PHP, .NET
  • πŸ“Š Multiple output formats - FlameGraphs, JFR, SpeedScope, thread dumps, heap dumps, GC dumps, memory dumps, and more
  • ⚑ Low overhead - Minimal impact on running applications
  • πŸ”„ Continuous profiling - Support for both discrete and continuous profiling modes

This is an open source fork of kubectl-flame with enhanced features and bug fixes.

πŸ“‹ Requirements

Supported Languages πŸ’»

| Language | Status | Tools Available |
|----------|--------|-----------------|
| β˜• Java (JVM) | βœ… Fully Supported | async-profiler, jcmd |
| 🐹 Go | βœ… Fully Supported | eBPF profiling, pprof |
| 🐍 Python | βœ… Fully Supported | py-spy, memray |
| πŸ’Ž Ruby | βœ… Fully Supported | rbspy |
| πŸ“— Node.js | βœ… Fully Supported | eBPF profiling, perf |
| πŸ¦€ Rust | βœ… Fully Supported | cargo-flamegraph |
| βš™οΈ Clang/Clang++ | βœ… Fully Supported | eBPF profiling, perf |
| 🐘 PHP | βœ… Fully Supported | phpspy |
| 🟣 .NET (Core/5+) | βœ… Fully Supported | dotnet-trace, dotnet-gcdump, dotnet-counters, dotnet-dump |

Container Runtimes 🐳

  • Containerd - --runtime=containerd (default)
  • CRI-O - --runtime=crio

eBPF Profiling Tools πŸ”§

For eBPF profiling (Go, Node.js, Clang/Clang++), two tools are available:

BPF (default) - BCC-based profiler

  • Requirements: Kernel headers or kheaders module (/lib/modules)
  • Usage: Automatically used by default (no --tool flag needed)
  • Compatibility: Works on most systems with kernel headers installed

BTF - CO-RE eBPF profiler (NEW - Experimental)

  • Requirements:
    • Linux kernel 5.2+ with BTF enabled (check /sys/kernel/btf/vmlinux)
    • BPF CPU v2 support (kernel 5.2+)
  • Usage: Add --tool btf flag to your command
  • Benefits:
    • βœ… No kernel headers required - works on DigitalOcean and other cloud providers without kheaders
    • βœ… Uses CO-RE (Compile Once - Run Everywhere) technology
    • βœ… Portable across different kernel versions without recompilation
    • βœ… Smaller Docker image size
  • Note: Most modern distributions (Ubuntu 20.04+, RHEL 8+, etc.) include BTF by default and meet the kernel requirements

Example using BTF:

kubectl prof my-pod -t 1m -l go --tool btf
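Before reaching for --tool btf, you can check whether a node's kernel actually exposes BTF. A minimal local sketch (on a cluster node you would run the same check via kubectl debug with a throwaway image; the node name is a placeholder):

```shell
# Look for the kernel's BTF blob; CO-RE eBPF tools need it.
# On a cluster node, the same check can be run via e.g.:
#   kubectl debug node/<node-name> -it --image=busybox
if [ -e /sys/kernel/btf/vmlinux ]; then
  echo "BTF available: --tool btf should work"
else
  echo "BTF not available: use the default BCC-based profiler"
fi
```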

πŸš€ Quick Start

Profile a Java application for 1 minute and save the FlameGraph:

kubectl prof my-pod -t 1m -l java

Profile a Python application and save to a specific location:

kubectl prof my-pod -t 1m -l python --local-path=/tmp

Profile a Rust application with cargo-flamegraph:

kubectl prof my-pod -t 1m -l rust

Profile a PHP application and generate a FlameGraph:

kubectl prof my-pod -t 1m -l php

Profile multiple pods using a label selector:

kubectl prof --selector app=myapp -t 5m -l java -o jfr

πŸ“– Usage

β˜• Java Profiling

Basic FlameGraph Generation

Profile a Java application for 5 minutes and generate a FlameGraph:

kubectl prof my-pod -t 5m -l java -o flamegraph --local-path=/tmp

πŸ’‘ Tip: If --local-path is omitted, the FlameGraph will be saved to the current directory.

Alpine-based Containers

For Java applications running in Alpine-based containers, use the --alpine flag:

kubectl prof mypod -t 1m -l java -o flamegraph --alpine

⚠️ Note: The --alpine flag is only required for Java applications.

JFR Output Generation

Using jcmd (default for JFR):

kubectl prof mypod -t 5m -l java -o jfr

Using async-profiler:

kubectl prof mypod -t 5m -l java -o jfr --tool async-profiler

Thread Dump

Generate a thread dump using jcmd:

kubectl prof mypod -l java -o threaddump

Heap Dump

Generate a heap dump in hprof format:

kubectl prof mypod -l java -o heapdump --tool jcmd

Heap dumps can be large files. Use --output-split-size to split the result into smaller chunks for easier transfer (default: 50M):

# Split into 100 MB chunks
kubectl prof mypod -l java -o heapdump --tool jcmd --output-split-size=100M

# Split into 1 GB chunks
kubectl prof mypod -l java -o heapdump --tool jcmd --output-split-size=1G

πŸ’‘ Tip: The value follows the format accepted by the split Unix command (e.g. 50M, 200M, 1G).
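Chunks produced this way can be reassembled with plain cat before analysis. A local sketch using the same split utility the tip refers to (the file and chunk names here are illustrative, not the agent's exact naming):

```shell
# Create a sample file and split it into fixed-size chunks with
# numeric suffixes (GNU split):
printf 'sample heap dump payload' > heapdump.hprof
split -b 8 -d heapdump.hprof heapdump.hprof.part-

# Reassemble: numeric suffixes sort lexically, so a glob keeps the order.
cat heapdump.hprof.part-* > heapdump-restored.hprof
cmp -s heapdump.hprof heapdump-restored.hprof && echo "reassembly OK"
```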

Heap Histogram

Generate a heap histogram:

kubectl prof mypod -l java -o heaphistogram --tool jcmd

Available Event Types for Java

When using async-profiler, you can specify different event types:

# CPU profiling (default: ctimer)
kubectl prof mypod -t 5m -l java -e cpu

# Memory allocation profiling
kubectl prof mypod -t 5m -l java -e alloc

# Lock contention profiling
kubectl prof mypod -t 5m -l java -e lock

Supported events: cpu, alloc, lock, cache-misses, wall, itimer, ctimer

Additional Arguments for async-profiler

You can pass additional command-line arguments to async-profiler using the --async-profiler-args flag. This is useful for enabling specific profiling modes or customizing profiler behavior:

# Wall-clock profiling in per-thread mode (most useful for wall-clock profiling)
kubectl prof mypod -t 5m -l java -e wall --async-profiler-args -t

# Multiple additional arguments
kubectl prof mypod -t 5m -l java -e alloc --async-profiler-args -t --async-profiler-args --alloc=2m

# Combine with other options
kubectl prof mypod -t 5m -l java -e wall -o flamegraph --async-profiler-args -t

Common use cases:

  • -t - Per-thread mode (recommended for wall-clock profiling)
  • --alloc=SIZE - Set allocation profiling interval
  • --lock=DURATION - Set lock profiling threshold
  • --cstack=MODE - Control how native frames are captured

πŸ’‘ Tip: Refer to the async-profiler documentation for a complete list of available arguments and their descriptions.


🐍 Python Profiling

FlameGraph Generation

kubectl prof mypod -t 1m -l python -o flamegraph --local-path=/tmp

Thread Dump

kubectl prof mypod -l python -o threaddump --local-path=/tmp

SpeedScope Format

Generate a SpeedScope compatible file:

kubectl prof mypod -t 1m -l python -o speedscope --local-path=/tmp

Memory Profiling with Memray

Memray is a memory profiler for Python that tracks every allocation and deallocation made by your code. Unlike py-spy (which profiles CPU usage), memray reveals where your application allocates memory, helping you find memory leaks, reduce peak memory usage, and understand allocation patterns.

Memray attaches to running Python processes via GDB injection; your application keeps running with zero downtime. No restart, no code changes, no instrumentation required.

Note: You must specify --tool memray explicitly. The default Python profiling tool remains py-spy.

Requirements:

  • Capabilities: SYS_PTRACE and SYS_ADMIN are required (for ptrace-based attach and nsenter into the target container's namespaces). Both are added automatically when --tool memray is used; no extra flags needed.
  • Python versions: 3.10, 3.11, 3.12, 3.13 (glibc-based images only)
  • Not supported: Alpine/musl-based target containers, statically-linked Python builds

Output types:

| Output | Flag | Format | Description |
|--------|------|--------|-------------|
| Memory flamegraph | -o flamegraph | HTML | Interactive flamegraph showing allocation call stacks and sizes |
| Allocation summary | -o summary | Text | Tabular summary of the largest allocators by function |

Memory flamegraph (HTML):

kubectl prof mypod -t 1m -l python --tool memray -o flamegraph --local-path=/tmp

The output is a self-contained HTML file you can open in any browser. Wider frames indicate functions responsible for more memory allocations.

Allocation summary (text):

kubectl prof mypod -t 1m -l python --tool memray -o summary --local-path=/tmp

The output is a text file listing the top allocators by total bytes allocated.

Long profiling sessions and the heartbeat interval:

When profiling for longer durations (e.g. 5-10 minutes), network proxies or load balancers in front of your Kubernetes API server may terminate idle connections. Memray emits periodic heartbeat events to keep the log stream alive. The default interval is 30 seconds. You can adjust it with --heartbeat-interval:

kubectl prof mypod -t 10m -l python --tool memray -o flamegraph --heartbeat-interval=15s

Targeting a specific process:

If your pod runs multiple Python processes, use --pid or --pgrep to target a specific one:

kubectl prof mypod -t 2m -l python --tool memray -o flamegraph --pid 1234
kubectl prof mypod -t 2m -l python --tool memray -o flamegraph --pgrep my-worker

🐹 Go Profiling


pprof Profiling (no privileges required) β€” recommended

If your Go application exposes the standard net/http/pprof endpoint, you can profile it directly without eBPF or any elevated privileges (HostPID, SYS_ADMIN, or privileged containers are not needed):

πŸ’‘ How it works: The agent pod connects to the target pod's net/http/pprof HTTP endpoint over the network, downloads the binary profile (.pb.gz) and delivers it to your machine. No kernel-level access is required. Visualization is done locally with go tool pprof.

CPU profiling
# CPU profile β€” raw protobuf (.pb.gz), default
kubectl prof mypod -t 30s -l go --tool pprof

# Same result, explicit flag (-o raw and -o pprof are aliases)
kubectl prof mypod -t 30s -l go --tool pprof -o raw
kubectl prof mypod -t 30s -l go --tool pprof -o pprof

# Custom pprof port (default: 6060)
kubectl prof mypod -t 30s -l go --tool pprof --pprof-port 8080

Open the result locally β€” the -http=: flag starts a browser UI with all views (flamegraph, graph, top, source…):

go tool pprof -http=: golang-deployment-86f57ddb4-h9fvz-agent-raw-pprof-1-2026-04-21T08_48_33Z.pb.gz

Or use the interactive CLI shell:

go tool pprof cpu.pb.gz
# then inside the pprof shell:
(pprof) top
(pprof) list MyFunc  # annotated source

Memory heap dump

Capture a snapshot of the heap allocations from /debug/pprof/heap:

kubectl prof mypod -l go --tool pprof -o heapdump
go tool pprof -http=: golang-deployment-86f57ddb4-h9fvz-agent-heapdump-pprof-1-2026-04-21T08_48_33Z.out

Allocation profile

Capture the cumulative allocation profile from /debug/pprof/allocs (all allocations since the process started, not just live objects):

kubectl prof mypod -l go --tool pprof -o allocsdump
go tool pprof -http=: golang-deployment-86f57ddb4-h9fvz-agent-allocsdump-pprof-1-2026-04-21T08_48_33Z.out

πŸ’‘ Heap vs Allocs: /heap shows only live objects (useful for finding memory leaks), while /allocs shows all objects ever allocated (useful for finding allocation hot-spots and GC pressure).

Goroutine dump

Capture the current state of all goroutines from /debug/pprof/goroutine:

kubectl prof mypod -l go --tool pprof -o goroutinedump
go tool pprof -http=: golang-deployment-86f57ddb4-h9fvz-agent-goroutinedump-pprof-1-2026-04-21T08_48_33Z.pb.gz

Available output formats (pprof):

| Format | Flag | Extension | Endpoint | Notes |
|--------|------|-----------|----------|-------|
| Raw protobuf | -o raw | .pb.gz | /debug/pprof/profile | default |
| Pprof (alias) | -o pprof | .pb.gz | /debug/pprof/profile | same as raw |
| Heap dump | -o heapdump | .out | /debug/pprof/heap | |
| Allocs dump | -o allocsdump | .out | /debug/pprof/allocs | |
| Goroutine dump | -o goroutinedump | .pb.gz | /debug/pprof/goroutine | |

πŸ’‘ Visualize locally with a single command: open the downloaded .pb.gz file in your browser with all visualization options (flamegraph, top, source, graph…):

go tool pprof -http=: golang-deployment-86f57ddb4-h9fvz-agent-raw-pprof-1-2026-04-21T08_48_33Z.pb.gz

This starts a local HTTP server and opens the browser automatically. Navigate to View β†’ Flame Graph for an interactive flamegraph.


Cross-namespace profiling and NetworkPolicy

The pprof profiler does not require any kernel privileges, but it does require network connectivity between the agent pod and the target pod. When both pods run in different namespaces (e.g. the target app in my-app and the profiling agent in profiling), any default-deny NetworkPolicy will block the connection.

Apply a NetworkPolicy in the target application's namespace to allow ingress from the profiling namespace:

# Allow ingress on the pprof port from the profiling namespace
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-pprof-from-profiling
  namespace: my-app          # namespace where the target pod runs
spec:
  podSelector: {}            # applies to all pods in the namespace
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: profiling   # the profiling agent namespace
      ports:
        - protocol: TCP
          port: 6060         # default pprof port (adjust if using --pprof-port)

If you want to restrict it to specific pods in the target namespace:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-pprof-from-profiling
  namespace: my-app
spec:
  podSelector:
    matchLabels:
      app: my-go-service     # only allow profiling of pods with this label
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: profiling
      ports:
        - protocol: TCP
          port: 6060

⚠️ Note: The namespace label kubernetes.io/metadata.name is set automatically by Kubernetes 1.21+. For older clusters, add the label manually: kubectl label namespace profiling kubernetes.io/metadata.name=profiling.


eBPF Profiling (default β€” requires privileges)

Profile a Go application for 1 minute using eBPF (requires SYS_ADMIN or a privileged pod):

kubectl prof mypod -t 1m -l go -o flamegraph

Output formats (eBPF):

  • flamegraph - FlameGraph visualization (SVG)
  • raw - Collapsed stack traces (.txt)

πŸ“— Node.js Profiling

FlameGraph Generation

kubectl prof mypod -t 1m -l node -o flamegraph

πŸ’‘ Tip: For JavaScript symbols to be resolved, run your Node.js process with the --perf-basic-prof flag.

Heap Snapshot

Generate a heap snapshot:

kubectl prof mypod -l node -o heapsnapshot

⚠️ Requirements: Your Node.js app must be run with --heapsnapshot-signal=SIGUSR2 (default) or --heapsnapshot-signal=SIGUSR1.

If using SIGUSR1:

kubectl prof mypod -l node -o heapsnapshot --node-heap-snapshot-signal=10

Heap snapshots can grow large for memory-heavy applications. Use --output-split-size to split the result into smaller chunks (default: 50M):

kubectl prof mypod -l node -o heapsnapshot --output-split-size=200M

πŸ“š Learn more: Node.js Heap Snapshots


πŸ’Ž Ruby Profiling

Profile a Ruby application:

kubectl prof mypod -t 1m -l ruby -o flamegraph

Available output formats:

  • flamegraph - FlameGraph visualization
  • speedscope - SpeedScope format
  • callgrind - Callgrind format

πŸ¦€ Rust Profiling

Profile a Rust application using cargo-flamegraph (default and recommended):

kubectl prof mypod -t 1m -l rust -o flamegraph

πŸ”₯ cargo-flamegraph Benefits

kubectl-prof uses cargo-flamegraph as the default profiling tool for Rust applications, offering several advantages:

  • πŸ“Š Rust-optimized profiling - Specifically designed for Rust applications with excellent symbol resolution
  • 🎨 Beautiful visualizations - Generates clean, colorized FlameGraphs with Rust-specific color palette
  • ⚑ Low overhead - Minimal performance impact during profiling
  • πŸ” Deep insights - Captures detailed stack traces including inline functions and generics
  • πŸ› οΈ Built on perf - Leverages the powerful Linux perf tool under the hood

Available output format:

  • flamegraph - Interactive FlameGraph visualization (SVG format)

βš™οΈ Clang/Clang++ Profiling

Clang:

kubectl prof mypod -t 1m -l clang -o flamegraph

Clang++:

kubectl prof mypod -t 1m -l clang++ -o flamegraph

🐘 PHP Profiling

Profile a PHP 7+ application using phpspy, a low-overhead sampling profiler:

FlameGraph Generation

kubectl prof mypod -t 1m -l php -o flamegraph --local-path=/tmp

Raw Output

Generate raw stack-trace data that can be post-processed into a FlameGraph:

kubectl prof mypod -t 1m -l php -o raw --local-path=/tmp

Available output formats:

  • flamegraph - Interactive FlameGraph visualization (SVG format)
  • raw - Raw stack traces in folded format

⚠️ Requirements: The SYS_PTRACE capability is required. It is added automatically by kubectl-prof.

πŸ’‘ Tip: phpspy works with PHP 7+ processes and requires no modifications to your application or PHP configuration.


🟣 .NET Profiling

kubectl-prof supports four specialised tools from the .NET diagnostics suite for profiling .NET Core / .NET 5+ applications running in Kubernetes.

⚠️ Requirements: The target container must be running a .NET Core / .NET 5+ application with the .NET diagnostic socket enabled (default behaviour).


πŸ”₯ CPU Traces β€” dotnet-trace

dotnet-trace captures CPU samples and runtime events through the EventPipe mechanism. It is the default tool for .NET when no --tool flag is specified.

SpeedScope format (default):

kubectl prof mypod -t 30s -l dotnet -o speedscope --local-path=/tmp

The output is a .speedscope.json file that can be loaded directly at speedscope.app for interactive flame-graph analysis.

Raw nettrace format:

kubectl prof mypod -t 1m -l dotnet -o raw --local-path=/tmp

The output is a .nettrace binary file that can be opened with PerfView or Visual Studio, or converted with dotnet-trace convert.

Using --tool flag explicitly:

kubectl prof mypod -t 30s -l dotnet --tool dotnet-trace -o speedscope

| Flag | Output file | Visualiser |
|------|-------------|------------|
| -o speedscope | .speedscope.json | speedscope.app |
| -o raw | .nettrace | PerfView, Visual Studio, dotnet-trace convert |

πŸ—‘οΈ GC Heap Dump β€” dotnet-gcdump

dotnet-gcdump captures a snapshot of the managed (GC) heap. It is a lightweight alternative to a full memory dump β€” only managed objects are captured, so the file is much smaller than a .dmp.

kubectl prof mypod -l dotnet --tool dotnet-gcdump -o gcdump --local-path=/tmp

For large heaps, use --output-split-size to split the result into smaller chunks (default: 50M):

kubectl prof mypod -l dotnet --tool dotnet-gcdump -o gcdump --output-split-size=200M --local-path=/tmp

πŸ’‘ Tip: dotnet-gcdump is the recommended starting point for memory analysis. Use dotnet-dump only when you need native frames or a complete memory picture.

The output is a .gcdump file that can be opened with Visual Studio or PerfView.

Quick CLI report from the dump file:

dotnet-gcdump report ./agent-gcdump-<pid>-1.gcdump

πŸ“Š Performance Counters β€” dotnet-counters

dotnet-counters collects runtime and application performance metrics (CPU usage, GC collections, exception rates, thread-pool queue length, etc.) over a configurable duration and writes them to a JSON file.

kubectl prof mypod -t 30s -l dotnet --tool dotnet-counters -o counters --local-path=/tmp

The output is a .json file structured as a time series of counter values. It can be:

  • Inspected directly β€” plain JSON, human-readable
  • Visualised with PerfView β€” open the JSON report
  • Post-processed with any standard JSON tooling (jq, Python, etc.)

Example: print a quick summary with jq:

jq '.events[] | {name: .name, value: .value}' ./agent-counters-<pid>-1.json

Counters captured by default (from the dotnet-common + dotnet-sampled-thread-time profiles):

| Counter | Description |
|---------|-------------|
| cpu-usage | Total CPU usage (%) |
| working-set | Working set memory (MB) |
| gc-heap-size | GC heap size (MB) |
| gen-0-gc-count | Gen 0 GC collections / interval |
| gen-1-gc-count | Gen 1 GC collections / interval |
| gen-2-gc-count | Gen 2 GC collections / interval |
| exception-count | Exceptions thrown / interval |
| threadpool-queue-length | Thread-pool work-item queue length |
| active-timer-count | Active System.Threading.Timer instances |

πŸ’Ύ Full Memory Dump β€” dotnet-dump

dotnet-dump captures a point-in-time full memory dump (.dmp) of the process, including both managed and native frames. This is the most comprehensive diagnostic artefact β€” use it for crash analysis, deadlock investigation, or when dotnet-gcdump does not capture enough context.

⚠️ Note: dotnet-dump does not accept a --duration flag β€” it captures the dump immediately when invoked. The -t flag is ignored for this tool.

kubectl prof mypod -l dotnet --tool dotnet-dump -o dump --local-path=/tmp

Full memory dumps can be very large (several GB for production processes). Use --output-split-size to split the result into smaller chunks for easier transfer (default: 50M):

kubectl prof mypod -l dotnet --tool dotnet-dump -o dump --output-split-size=500M --local-path=/tmp

The output is a .dmp file (ELF core dump format on Linux) that can be analysed with:

  • dotnet-dump analyze β€” cross-platform interactive SOS shell:

    dotnet-dump analyze ./agent-dump-<pid>-1.dmp

    Useful SOS commands inside the session:

    > clrstack          # managed call stacks for all threads
    > dumpheap -stat    # managed heap statistics
    > gcroot <address>  # find GC roots for an object
    > threads           # list all threads
    > pe                # print last exception on each thread
    
  • Visual Studio on Windows β€” open the .dmp file for mixed managed/native debugging

  • WinDbg with the SOS extension on Windows

  • LLDB with the SOS plugin on Linux/macOS:

    lldb --core ./agent-dump-<pid>-1.dmp

πŸ—‚οΈ .NET Tools Summary

| Tool | -o flag | Output file | Default? | Visualiser / Tool |
|------|---------|-------------|----------|-------------------|
| dotnet-trace | speedscope | .speedscope.json | βœ… | speedscope.app |
| dotnet-trace | raw | .nettrace | | PerfView, Visual Studio, dotnet-trace convert |
| dotnet-gcdump | gcdump | .gcdump | | Visual Studio, PerfView, dotnet-gcdump report |
| dotnet-counters | counters | .json | | PerfView, jq, Python |
| dotnet-dump | dump | .dmp | | dotnet-dump analyze, Visual Studio, WinDbg, LLDB |



🎯 Advanced Usage

Specify Container Runtime

kubectl prof mypod -t 1m -l java --runtime crio

Supported runtimes: containerd (default), crio

Continuous Profiling

Profile continuously at 60-second intervals for 5 minutes:

kubectl prof mypod -l java -t 5m --interval 60s

πŸ“ Note: In continuous mode, a new result is produced every interval. Only the last result is available by default.

Custom Resource Limits

Set CPU and memory limits for the profiling agent pod:

kubectl prof mypod -l java -t 5m \
  --cpu-limits=1 \
  --cpu-requests=100m \
  --mem-limits=200Mi \
  --mem-requests=100Mi

Cross-Namespace Profiling

Profile a pod in a different namespace:

kubectl prof mypod -n profiling \
  --service-account=profiler \
  --target-namespace=my-apps \
  -l go

Custom Agent Image

Use a custom profiling agent image:

kubectl prof mypod -l java -t 5m \
  --image=localhost/my-agent-image-jvm:latest \
  --image-pull-policy=IfNotPresent \
  --runtime containerd

Profile Multiple Pods with Label Selector

Profile all pods matching a label selector:

kubectl prof --selector app=myapp -t 5m -l java -o jfr

⚠️ ATTENTION: Use this option with caution as it will profile ALL pods matching the selector.

Control concurrent profiling jobs:

kubectl prof --selector app=myapp -t 5m -l java -o jfr --pool-size-profiling-jobs 5

Target Specific Process

By default, kubectl-prof attempts to profile all processes in the container. To target a specific process:

Using PID:

kubectl prof mypod -l java --pid 1234

Using process name:

kubectl prof mypod -l java --pgrep java-app-process

Capabilities Configuration

For Java profiling, kubectl-prof uses PERFMON and SYSLOG capabilities by default. To use SYS_ADMIN:

kubectl prof my-pod -t 5m -l java --capabilities=SYS_ADMIN

Add multiple capabilities:

kubectl prof my-pod -t 5m -l java \
  --capabilities=SYS_ADMIN \
  --capabilities=PERFMON

Node Tolerations

Profile pods on nodes with taints by specifying tolerations:

Tolerate specific taint:

kubectl prof my-pod -t 5m -l java \
  --tolerations=node.kubernetes.io/disk-pressure=true:NoSchedule

Multiple tolerations:

kubectl prof my-pod -t 5m -l java \
  --tolerations=node.kubernetes.io/disk-pressure=true:NoSchedule \
  --tolerations=node.kubernetes.io/memory-pressure:NoExecute \
  --tolerations=dedicated=profiling:PreferNoSchedule

Toleration formats:

  • key=value:effect - Full specification
  • key:effect - Any value
  • key - Defaults to NoSchedule
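As a reference for what these formats mean, here is a sketch of the toleration objects the three forms correspond to in the agent pod's spec (standard Kubernetes toleration semantics; the keys are taken from the examples above, and example-taint is a placeholder):

```yaml
tolerations:
  # key=value:effect  ->  Equal match on key and value
  - key: dedicated
    operator: Equal
    value: profiling
    effect: PreferNoSchedule
  # key:effect  ->  Exists match (any value)
  - key: node.kubernetes.io/memory-pressure
    operator: Exists
    effect: NoExecute
  # key  ->  Exists match, effect defaults to NoSchedule
  - key: example-taint
    operator: Exists
    effect: NoSchedule
```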

πŸ“š Get Help

For a complete list of options:

kubectl prof --help

πŸ“¦ Installation

Using Krew (Recommended) πŸ”Œ

Krew is the plugin manager for kubectl.

  1. Install Krew (if not already installed)

  2. Add kubectl-prof repository and install:

kubectl krew index add kubectl-prof https://github.com/josepdcs/kubectl-prof
kubectl krew search kubectl-prof
kubectl krew install kubectl-prof/prof
kubectl prof --help

Pre-built Binaries πŸ“₯

Download pre-built binaries from the releases page.

Linux x86_64

wget https://github.com/josepdcs/kubectl-prof/releases/download/2.2.0/kubectl-prof_2.2.0_linux_amd64.tar.gz
tar xvfz kubectl-prof_2.2.0_linux_amd64.tar.gz
sudo install kubectl-prof /usr/local/bin/

macOS

wget https://github.com/josepdcs/kubectl-prof/releases/download/2.2.0/kubectl-prof_2.2.0_darwin_amd64.tar.gz
tar xvfz kubectl-prof_2.2.0_darwin_amd64.tar.gz
sudo install kubectl-prof /usr/local/bin/

Windows

Download the Windows binary from the releases page and add it to your PATH.

πŸ”¨ Building from Source

Prerequisites

  • Go 1.26 or higher
  • Make
  • Docker (for building agent containers)

Build Steps

  1. Clone and install dependencies:
git clone https://github.com/josepdcs/kubectl-prof.git
cd kubectl-prof
make install-deps
  2. Build the binary:
make build

The binary will be available at ./bin/kubectl-prof.

  3. Build agent containers (optional):

Modify the DOCKER_BASE_IMAGE property in Makefile, then run:

make build-docker-agents

πŸ”§ How It Works

kubectl-prof launches a Kubernetes Job on the same node as the target pod. The profiling is performed using specialized tools based on the programming language:

Profiling Tools by Language

β˜• Java (JVM)

async-profiler - For FlameGraphs and JFR files

  • FlameGraphs: --tool async-profiler -o flamegraph (default)
  • JFR files: --tool async-profiler -o jfr
  • Collapsed/Raw: --tool async-profiler -o collapsed or -o raw
  • Event types: cpu, alloc, lock, cache-misses, wall, itimer, ctimer (default)

jcmd - For JFR, thread dumps, heap dumps

  • JFR files: --tool jcmd -o jfr (default for jcmd)
  • Thread dumps: --tool jcmd -o threaddump
  • Heap dumps: --tool jcmd -o heapdump
  • Heap histogram: --tool jcmd -o heaphistogram

🐍 Python

py-spy - Low-overhead Python profiler

  • FlameGraphs: -o flamegraph (default)
  • Thread dumps: -o threaddump
  • SpeedScope: -o speedscope
  • Raw output: -o raw

memray - Python memory profiler (--tool memray)

  • Memory flamegraph (HTML): -o flamegraph
  • Allocation summary (text): -o summary
  • Attaches to running processes via GDB injection (zero downtime)
  • Requires SYS_PTRACE + SYS_ADMIN capabilities (added automatically)
  • Supported target Python versions: 3.10, 3.11, 3.12, 3.13 (glibc-based only)

🐹 Go

pprof - Native Go HTTP profiling (no privileges required)

  • Connects directly to the application's net/http/pprof endpoint over HTTP
  • No HostPID, SYS_ADMIN, or privileged access β€” only needs network connectivity to the target pod
  • The binary profile (.pb.gz) is delivered to your machine; visualization is done locally with go tool pprof
  • Cross-namespace use requires a NetworkPolicy allowing ingress on the pprof port from the profiling namespace
  • Usage: --tool pprof
  • Custom port: --pprof-port <port> (default: 6060)

Output formats (pprof) β€” all compatible with go tool pprof:

| Format | Flag | Endpoint queried | Notes |
|--------|------|------------------|-------|
| Raw protobuf | -o raw | /debug/pprof/profile | default, CPU profile |
| Pprof (alias) | -o pprof | /debug/pprof/profile | same as raw |
| Heap dump | -o heapdump | /debug/pprof/heap | memory allocations, .out |
| Allocs dump | -o allocsdump | /debug/pprof/allocs | cumulative allocations, .out |
| Goroutine dump | -o goroutinedump | /debug/pprof/goroutine | goroutine state |

πŸ’‘ Visualize locally with a single command: open the downloaded .pb.gz file in your browser with all visualization options (flamegraph, top, source, graph…):

go tool pprof -http=: golang-deployment-86f57ddb4-h9fvz-agent-raw-pprof-1-2026-04-21T08_48_33Z.pb.gz

This starts a local HTTP server and opens the browser automatically. Navigate to View β†’ Flame Graph for an interactive flamegraph.

Examples:

kubectl prof my-pod -t 30s -l go --tool pprof                          # CPU profile (default)
kubectl prof my-pod -t 30s -l go --tool pprof -o raw                   # CPU profile, explicit
kubectl prof my-pod -t 30s -l go --tool pprof -o pprof                 # CPU profile alias
kubectl prof my-pod       -l go --tool pprof -o heapdump               # heap snapshot (live objects)
kubectl prof my-pod       -l go --tool pprof -o allocsdump             # cumulative allocation profile
kubectl prof my-pod       -l go --tool pprof -o goroutinedump          # goroutine dump
kubectl prof my-pod -t 30s -l go --tool pprof --pprof-port 8080        # custom port

eBPF Profiling - Two options available (require SYS_ADMIN / privileged pod):

  1. BPF (default) - BCC-based profiler

    • Uses BCC tools with runtime compilation
    • Requires kernel headers (/lib/modules)
    • Usage: No --tool flag needed (default)
  2. BTF - CO-RE eBPF profiler

    • Uses libbpf-tools with CO-RE support
    • No kernel headers required - only needs BTF (available on modern kernels)
    • Usage: Add --tool btf flag
    • Example: kubectl prof my-pod -t 1m -l go --tool btf

Output formats (eBPF tools):

  • FlameGraphs: -o flamegraph (default)
  • Raw output: -o raw

πŸ¦€ Rust

cargo-flamegraph - Rust-optimized profiling tool (default)

  • FlameGraphs: --tool cargo-flamegraph -o flamegraph (default)
  • Rust-specific color palette and symbol resolution
  • Low overhead, built on perf

πŸ’Ž Ruby

rbspy - Ruby sampling profiler

  • FlameGraphs: -o flamegraph (default)
  • SpeedScope: -o speedscope
  • Callgrind: -o callgrind

🐘 PHP

phpspy - Low-overhead sampling profiler for PHP 7+

  • FlameGraphs: -o flamegraph (default)
  • Raw output: -o raw

Output formats:

  • flamegraph - Interactive FlameGraph visualization (SVG format)
  • raw - Raw stack traces in folded format

🟣 .NET (Core / .NET 5+)

Four tools from the .NET diagnostics suite are available, each targeting a different diagnostic scenario:

dotnet-trace β€” CPU and runtime event tracing (default tool for .NET)

  • SpeedScope: --tool dotnet-trace -o speedscope (default) β†’ .speedscope.json
  • Raw nettrace: --tool dotnet-trace -o raw β†’ .nettrace
  • Uses EventPipe; no in-process agent or application restart required

dotnet-gcdump β€” Lightweight GC heap snapshot

  • GC heap dump: --tool dotnet-gcdump -o gcdump β†’ .gcdump
  • Captures managed objects only; much smaller than a full dump

dotnet-counters β€” Real-time performance counter collection

  • Counters: --tool dotnet-counters -o counters β†’ .json
  • Captures CPU, GC, thread-pool, exception rate and other runtime metrics

dotnet-dump β€” Full process memory dump

  • Full dump: --tool dotnet-dump -o dump β†’ .dmp
  • Point-in-time; includes both managed and native frames
  • Analysable with dotnet-dump analyze, Visual Studio, WinDbg, LLDB+SOS

πŸ“— Node.js

eBPF Profiling - Two options available (recommended):

  1. BPF (default) - BCC-based profiler

    • Requires kernel headers (/lib/modules)
    • Usage: No --tool flag needed (default)
  2. BTF - CO-RE eBPF profiler

    • No kernel headers required - only needs BTF
    • Usage: Add --tool btf flag
    • Example: kubectl prof my-pod -t 1m -l node --tool btf

Alternative: perf

  • Available as a fallback if eBPF profiling is unavailable

Output formats:

  • FlameGraphs: -o flamegraph (default)
  • Raw output: -o raw
  • Heap snapshot: -o heapsnapshot

πŸ’‘ Tip: For JavaScript symbol resolution, run Node.js with --perf-basic-prof flag
πŸ’‘ Tip: For heap snapshots, run Node.js with --heapsnapshot-signal flag

βš™οΈ Clang/Clang++

eBPF Profiling - Two options available (recommended):

  1. BPF (default) - BCC-based profiler

    • Requires kernel headers (/lib/modules)
    • Usage: No --tool flag needed (default)
  2. BTF - CO-RE eBPF profiler

    • No kernel headers required - only needs BTF
    • Usage: Add --tool btf flag
    • Example: kubectl prof my-pod -t 1m -l clang --tool btf

Alternative: perf

  • Available as a fallback if eBPF profiling is unavailable

Output formats:

  • FlameGraphs: -o flamegraph
  • Raw output: -o raw

πŸ“Š Raw Output Format

The raw output is a text file in collapsed ("folded") stack format: each line is a semicolon-separated call stack followed by a sample count. It can be post-processed with standard FlameGraph tooling (for example Brendan Gregg's flamegraph.pl) or with ordinary text-processing tools.
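
A quick local illustration of the folded format and one way to post-process it (the stack names and counts are made up for the example):

```shell
# A tiny folded-format sample: each line is a semicolon-separated stack
# followed by a sample count.
cat > raw-stacks.txt <<'EOF'
main;handleRequest;parseJSON 120
main;handleRequest;writeResponse 45
main;backgroundWorker;compress 80
EOF

# Example post-processing: total samples per leaf function, hottest first.
awk '{ n = split($1, f, ";"); leaf[f[n]] += $2 }
     END { for (k in leaf) print k, leaf[k] }' raw-stacks.txt | sort -k2 -rn

# The same file can be rendered as an SVG with Brendan Gregg's FlameGraph:
#   ./flamegraph.pl raw-stacks.txt > flamegraph.svg
```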


πŸ”„ Profiling Modes

Discrete Mode (default)

  • Single profiling session
  • Result available when profiling completes
  • Usage: -t 5m

Continuous Mode

  • Multiple results at regular intervals
  • Only the last result is available by default
  • Client responsible for storing all results
  • Usage: -t 5m --interval 60s

🎯 Process Targeting

By default, kubectl-prof profiles all processes in the target container matching the specified language.

Warning example:

⚠ Detected more than one PID to profile: [2508 2509]. 
  It will attempt to profile all of them. 
  Use the --pid flag to profile a specific PID.

Target a specific process:

  • By PID: --pid 1234
  • By name: --pgrep process-name
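The --pgrep flag selects a process by name, in the spirit of the standard pgrep utility. A local illustration of name-based matching, using a throwaway sleep process:

```shell
# Start a throwaway process, then match it by executable name the way a
# name-based selector would inside the container.
sleep 30 &
bg_pid=$!
pgrep -x sleep | grep -qx "$bg_pid" && echo "matched by name: $bg_pid"
kill "$bg_pid"
```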

πŸ” Capabilities

For Java profiling, kubectl-prof uses PERFMON and SYSLOG capabilities by default.

According to the Kernel documentation, these capabilities should be sufficient for collecting performance samples.

To use SYS_ADMIN instead:

kubectl prof my-pod -t 5m -l java --capabilities=SYS_ADMIN

Add multiple capabilities:

kubectl prof my-pod -t 5m -l java \
  --capabilities=SYS_ADMIN \
  --capabilities=PERFMON

🏷️ Node Tolerations

By default, the profiling agent pod is scheduled only on nodes without taints. For nodes with taints, specify tolerations:

Toleration formats:

  • key=value:effect - Full specification
  • key:effect - Any value
  • key - Defaults to NoSchedule

Examples:

# Single toleration
kubectl prof my-pod -t 5m -l java \
  --tolerations=node.kubernetes.io/disk-pressure=true:NoSchedule

# Multiple tolerations
kubectl prof my-pod -t 5m -l java \
  --tolerations=node.kubernetes.io/disk-pressure=true:NoSchedule \
  --tolerations=node.kubernetes.io/memory-pressure:NoExecute \
  --tolerations=dedicated=profiling:PreferNoSchedule

🀝 Contributing

We welcome contributions! Please refer to Contributing.md for information about how to get involved.

Contributions of all kinds are appreciated:

  • πŸ› Bug reports
  • πŸ’‘ Feature requests
  • πŸ“ Documentation improvements
  • πŸ”§ Pull requests

πŸ‘₯ Maintainers

Special Thanks πŸ™

Original author of kubectl-flame:


πŸ“„ License

This project is licensed under the terms of the Apache 2.0 open source license. Please refer to LICENSE for the full terms.
