eBPF BTF (BPF Type Format) Programming Guide

1. Introduction to BTF

What is BTF?

BTF (BPF Type Format) is a type metadata format provided by the Linux kernel, used to describe type information for eBPF programs and kernel data structures.

Core Advantages of BTF

  • Compile Once, Run Everywhere (CO-RE): No need to recompile on target machines
  • Kernel Structure Access: Safely read kernel data structures
  • Type Safety: Compile-time type compatibility checking
  • Debug Friendly: Provides rich type information

Problems Solved by BTF

Before BTF, eBPF programs faced the following problems:

task_struct Structure Example (Simplified)

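task_struct is the core data structure in the Linux kernel that describes processes. Its size and layout differ across kernel versions and build configurations. The offsets in the two examples below are illustrative and not taken from a specific kernel build.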

Example 1: task_struct in Linux 5.10 Kernel (Simplified)

c
struct task_struct {
    struct thread_info thread_info;    // Offset: 0    (Size: 16 bytes)
    volatile long state;                // Offset: 16   (Size: 8 bytes; renamed to __state in 5.14)
    void *stack;                        // Offset: 24   (Size: 8 bytes)
    refcount_t usage;                   // Offset: 32   (Size: 4 bytes)
    unsigned int flags;                 // Offset: 36   (Size: 4 bytes)
    // ... hundreds of bytes of other fields omitted ...

    pid_t pid;                          // Offset: 1232 (Size: 4 bytes) ⬅️ Here!
    pid_t tgid;                         // Offset: 1236 (Size: 4 bytes)

    struct task_struct *real_parent;   // Offset: 1256 (Size: 8 bytes)
    struct task_struct *parent;         // Offset: 1264 (Size: 8 bytes)

    char comm[16];                      // Offset: 1784 (Size: 16 bytes)
    struct mm_struct *mm;               // Offset: 1848 (Size: 8 bytes)
    // ... more fields ...
};

Example 2: task_struct in Linux 6.1 Kernel (Simplified)

c
struct task_struct {
    struct thread_info thread_info;    // Offset: 0    (Size: 16 bytes)
    unsigned int __state;               // Offset: 16   (Size: 4 bytes)
    void *stack;                        // Offset: 24   (Size: 8 bytes)
    refcount_t usage;                   // Offset: 32   (Size: 4 bytes)
    unsigned int flags;                 // Offset: 36   (Size: 4 bytes)

    // ⚠️ Fields added, removed, or resized between releases shift everything after them
    unsigned int ptrace;                // Offset: 40
    int on_rq;                          // Offset: 44
    // ... other fields omitted ...

    pid_t pid;                          // Offset: 1368 (Size: 4 bytes) ⬅️ Offset changed!
    pid_t tgid;                         // Offset: 1372 (Size: 4 bytes)

    struct task_struct *real_parent;   // Offset: 1392 (Size: 8 bytes) ⬅️ Also changed!
    struct task_struct *parent;         // Offset: 1400 (Size: 8 bytes)

    char comm[16];                      // Offset: 1920 (Size: 16 bytes) ⬅️ Also changed!
    struct mm_struct *mm;               // Offset: 1984 (Size: 8 bytes)
    // ... more fields ...
};
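
The shift itself can be reproduced in ordinary user-space C. The sketch below uses two hypothetical, heavily reduced layouts (`task_v1` and `task_v2`, which are not real kernel structs) and `offsetof` to show how a single extra field moves every field after it.

```c
#include <stddef.h>

/* Hypothetical, reduced layouts for illustration only; not real kernel structs. */
struct task_v1 {
    unsigned int flags;
    int pid;
    int tgid;
};

struct task_v2 {
    unsigned int flags;
    unsigned long extra;   /* field that exists only in the "newer" layout */
    int pid;
    int tgid;
};

/* Each offset is fixed at compile time by the layout the compiler saw. */
static size_t pid_offset_v1(void) { return offsetof(struct task_v1, pid); }
static size_t pid_offset_v2(void) { return offsetof(struct task_v2, pid); }
```

Code compiled against `task_v1` that is then pointed at a `task_v2` object would read `pid` from the wrong place, which is exactly the situation a hard-coded offset creates.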

Offset Calculation Example

Suppose we want to read the pid field:

c
// ❌ Wrong way: Hard-coded offsets
struct task_struct *task = (struct task_struct *)bpf_get_current_task();
int pid;

// On Linux 5.10
bpf_probe_read(&pid, sizeof(pid), (void *)task + 1232);  // pid at offset 1232

// But on Linux 6.1, the same code reads the wrong location!
bpf_probe_read(&pid, sizeof(pid), (void *)task + 1232);  // ❌ Should be 1368!

BTF's Solution

c
// BTF + CO-RE approach - Automatically handles offsets
struct task_struct *task = (struct task_struct *)bpf_get_current_task();
pid_t pid = BPF_CORE_READ(task, pid);  // ✅ Automatic adaptation!

Advantages:

  • ✅ The compiler records a relocation for each field access, and the loader (libbpf) patches in the correct offset
  • ✅ Adaptation happens at load time, using the running kernel's BTF
  • ✅ Type-safe access method
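
To make the mechanism concrete, here is a minimal user-space simulation; it is not BPF code. It assumes a toy per-kernel table (`struct field_info` and `find_offset`, both hypothetical names) that maps field names to offsets, filled with the illustrative pid and tgid offsets from the layouts above. A CO-RE loader does something conceptually similar: it looks up each accessed field in the running kernel's type information and uses that offset instead of the one seen at compile time.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for per-kernel type information (hypothetical, not a libbpf API). */
struct field_info {
    const char *name;
    size_t offset;
};

/* Illustrative offsets copied from the simplified layouts above. */
static const struct field_info layout_5_10[] = { { "pid", 1232 }, { "tgid", 1236 } };
static const struct field_info layout_6_1[]  = { { "pid", 1368 }, { "tgid", 1372 } };

/* Resolve a field by name, the way a loader resolves a relocation. */
static long find_offset(const struct field_info *tbl, size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(tbl[i].name, name) == 0)
            return (long)tbl[i].offset;
    return -1; /* field does not exist in this layout */
}

/* Read an int field from a raw "task" blob at the resolved offset. */
static int read_int_field(const unsigned char *task, size_t task_size,
                          const struct field_info *tbl, size_t n,
                          const char *name)
{
    long off = find_offset(tbl, n, name);
    int value = 0;

    assert(off >= 0 && (size_t)off + sizeof(value) <= task_size);
    memcpy(&value, task + off, sizeof(value));
    return value;
}
```

Reading a blob laid out like the 6.1 example with the `layout_6_1` table returns the stored pid, while reusing `layout_5_10` on the same blob returns whatever bytes happen to sit at offset 1232.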

2. BTF Core Concepts

2.1 vmlinux.h

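vmlinux.h is a header file containing the kernel's type definitions (structs, unions, enums, typedefs), generated by bpftool from BTF information. It does not contain preprocessor macros such as #define constants, because BTF does not record them.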

Generating vmlinux.h

bash
# Generate from current kernel
bpftool btf dump file /sys/kernel/btf/vmlinux format c > vmlinux.h

# Check if kernel supports BTF
ls /sys/kernel/btf/vmlinux

Advantages of vmlinux.h

c
// Traditional way - Need to include multiple header files
#include <linux/sched.h>
#include <linux/fs.h>
#include <linux/mm.h>
// ... potentially dozens of header files

// BTF way - Only one header file needed
#include "vmlinux.h"  // ✅ Contains all kernel type definitions

2.2 BPF_CORE_READ Macro

BPF_CORE_READ is the core macro of CO-RE, used to safely read kernel structure fields.

Syntax Format

c
// Basic usage
BPF_CORE_READ(ptr, field)
// Single-level access equivalent to traditional pointer access
ptr->field

// Multi-level nested access
BPF_CORE_READ(ptr, field1, field2, field3)

// Multi-level nested access equivalent to traditional pointer access
ptr->field1->field2->field3

Usage Examples

c
struct task_struct *task = (struct task_struct *)bpf_get_current_task();

// Read single field
pid_t pid = BPF_CORE_READ(task, pid);

// Read nested field
pid_t ppid = BPF_CORE_READ(task, real_parent, pid);

// Equivalent to
// task->real_parent->pid
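
Conceptually, each extra field in the argument list adds one pointer-chasing read. The user-space sketch below is not BPF code: `toy_task` and `toy_probe_read` are hypothetical stand-ins for task_struct and the kernel read helper. It spells out the two reads that a call like BPF_CORE_READ(task, real_parent, pid) performs.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for task_struct. */
struct toy_task {
    int pid;
    struct toy_task *real_parent;
};

/* Stand-in for a checked read: copies sz bytes, or zeroes dst and fails on NULL. */
static int toy_probe_read(void *dst, size_t sz, const void *src)
{
    if (src == NULL) {
        memset(dst, 0, sz);
        return -1;
    }
    memcpy(dst, src, sz);
    return 0;
}

/* Two reads: first the parent pointer, then the pid behind it. */
static int toy_read_ppid(const struct toy_task *task)
{
    struct toy_task *parent = NULL;
    int ppid = 0;

    toy_probe_read(&parent, sizeof(parent), &task->real_parent);
    /* A NULL parent makes the second read fail and leaves ppid at 0. */
    toy_probe_read(&ppid, sizeof(ppid), parent ? &parent->pid : NULL);
    return ppid;
}
```

In a real BPF program the macro generates this chain for you and adds the CO-RE relocations, which this sketch leaves out.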

2.3 BPF_CORE_READ_INTO() Macro

BPF_CORE_READ_INTO (Read to Variable)

c
struct task_struct *task = (struct task_struct *)bpf_get_current_task();
pid_t ppid;

// Read value into specified variable
BPF_CORE_READ_INTO(&ppid, task, real_parent, pid);

2.4 BPF_CORE_READ_STR_INTO() Macro

BPF_CORE_READ_STR_INTO (Read String)

c
struct task_struct *task = (struct task_struct *)bpf_get_current_task();
char comm[16];

// Read process name; pass &comm so the macro sizes the read as the full 16-byte array
BPF_CORE_READ_STR_INTO(&comm, task, comm);
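
One detail is worth checking against your libbpf version: in current libbpf headers the `*_INTO` macros compute the read size as `sizeof(*(dst))`, so the destination expression decides how many bytes are copied. The plain C sketch below involves no BPF; `SIZE_SEEN_BY_MACRO` is a hypothetical helper that mirrors only that size expression, to show the difference between passing an array and passing its address.

```c
#include <stddef.h>

/* Mirrors only the size expression assumed above: sizeof(*(dst)).
 * Hypothetical name; not part of libbpf. */
#define SIZE_SEEN_BY_MACRO(dst) sizeof(*(dst))

static size_t size_with_array(void)
{
    char comm[16];
    return SIZE_SEEN_BY_MACRO(comm);   /* array decays to char *, so sizeof(char) */
}

static size_t size_with_address(void)
{
    char comm[16];
    return SIZE_SEEN_BY_MACRO(&comm);  /* pointer to array, so sizeof(char[16]) */
}
```

With a `char comm[16]` destination, passing `&comm` lets the macro copy up to 16 bytes, while passing `comm` would limit the read to a single byte.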

2.5 bpf_probe_read vs bpf_core_read vs BPF_CORE_READ Detailed Explanation

These three are different ways to read memory data in eBPF and are easily confused. Let's compare them in detail:

Core Differences Overview

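| Feature | bpf_probe_read | bpf_core_read | BPF_CORE_READ |
|---|---|---|---|
| Type | Helper function | Macro (wraps the bpf_probe_read_kernel helper) | Macro |
| Definition Location | Kernel | libbpf header file (bpf_core_read.h) | libbpf header file (bpf_core_read.h) |
| CO-RE Support | ❌ No | ✅ Yes | ✅ Yes |
| Type Safety | ❌ Weak (void *) | ✅ Strong | ✅ Strong |
| Use Case | Read arbitrary memory | Read single field | Read nested fields |
| Recommendation | ⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐⭐⭐ |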

1. bpf_probe_read - Traditional Memory Read Function

Function Prototype:

c
long bpf_probe_read(void *dst, u32 size, const void *unsafe_ptr);

Characteristics:

  • Lowest-level memory read function
  • Requires manual size specification
  • No type checking
  • Does not support CO-RE

Usage Example:

c
struct task_struct *task = (struct task_struct *)bpf_get_current_task();
struct task_struct *parent;
pid_t ppid;

// Read real_parent pointer
bpf_probe_read(&parent, sizeof(parent), &task->real_parent);

// Read parent->pid
bpf_probe_read(&ppid, sizeof(ppid), &parent->pid);

Problems:

  • ❌ Need to know exact field offsets
  • ❌ Nested access requires multiple calls
  • ❌ No CO-RE, cannot work across kernel versions
  • ❌ Verbose code

Applicable Scenarios:

  • Reading arbitrary memory addresses (on kernels >= 5.5, prefer the explicit bpf_probe_read_kernel / bpf_probe_read_user variants)
  • Scenarios unrelated to BTF/CO-RE
  • Need precise control over read behavior

2. bpf_core_read - CO-RE Read Macro

Definition (from libbpf's bpf_core_read.h):

c
#define bpf_core_read(dst, sz, src) \
    bpf_probe_read_kernel(dst, sz, (const void *)__builtin_preserve_access_index(src))

Characteristics:

  • Macro provided by libbpf that wraps the bpf_probe_read_kernel helper
  • Supports CO-RE relocation
  • Requires manual size specification
  • Reads one field per call

Usage Example:

c
struct task_struct *task = (struct task_struct *)bpf_get_current_task();
pid_t pid;

// Read single field - Correct usage
bpf_core_read(&pid, sizeof(pid), &task->pid);  // ✅

// Read nested field - Wrong usage!
// bpf_core_read(&ppid, sizeof(ppid), &task->real_parent->pid);  // ❌ Dereferences task->real_parent directly; typically rejected by the verifier

Limitations:

  • ⚠️ Cannot directly access nested fields (like task->real_parent->pid)
  • ⚠️ Need to manually specify size
  • ⚠️ Still relatively verbose

Correct Nested Access Method:

c
// Need to read in two steps
struct task_struct *task = (struct task_struct *)bpf_get_current_task();
struct task_struct *parent;
pid_t ppid;

// Step 1: Read parent pointer
bpf_core_read(&parent, sizeof(parent), &task->real_parent);

// Step 2: Read parent->pid
bpf_core_read(&ppid, sizeof(ppid), &parent->pid);

Applicable Scenarios:

  • Reading single simple fields
  • Need CO-RE with explicit control over the destination buffer and read size
  • Scenarios with extreme performance requirements

3. BPF_CORE_READ - CO-RE Macro (Recommended)

Macro Definition (Simplified):

c
#define BPF_CORE_READ(src, a, ...)  \
({  \
    /* Record access path at compile time */  \
    /* Generate CO-RE relocation information */  \
    /* Return read value */  \
})

Characteristics:

  • Macro provided by libbpf
  • Full CO-RE support
  • Supports nested field access
  • Automatically infers type and size
  • Most concise code
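
The "automatically infers type and size" point can be illustrated with a toy single-level macro in user-space C. This is only a sketch: it relies on GNU C extensions (`__typeof__` and statement expressions, accepted by both GCC and Clang), `TOY_READ` and `toy_read` are hypothetical names, and it does not emit any CO-RE relocation the way the real macro does.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a kernel structure. */
struct toy_task {
    int pid;
    long start_time;
};

/* Stand-in for the underlying read helper: a plain copy. */
static int toy_read(void *dst, size_t sz, const void *src)
{
    memcpy(dst, src, sz);
    return 0;
}

/* The temporary takes its type from the field, so the caller never
 * spells out a type or a size. Intended for scalar fields only. */
#define TOY_READ(src, field) ({                          \
    __typeof__((src)->field) val_;                       \
    toy_read(&val_, sizeof(val_), &(src)->field);        \
    val_;                                                \
})
```

A call such as `TOY_READ(&t, start_time)` yields a `long` and copies `sizeof(long)` bytes without either being written at the call site.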

Usage Example:

c
struct task_struct *task = (struct task_struct *)bpf_get_current_task();

// Read single field
pid_t pid = BPF_CORE_READ(task, pid);

// Read nested field - One line! ✅
pid_t ppid = BPF_CORE_READ(task, real_parent, pid);

Advantages:

  • ✅ Most concise code (nested access in one line)
  • ✅ Type safe (compile-time checking)
  • ✅ Automatic offset handling
  • ✅ Full CO-RE support

Applicable Scenarios:

  • Reading kernel structure fields (Recommended!)
  • Need CO-RE support
  • Want concise, readable code

Practical Comparison: Reading Parent Process PID

Scenario: Read task->real_parent->pid

Method 1: bpf_probe_read (Not Recommended)

c
struct task_struct *task = (struct task_struct *)bpf_get_current_task();
struct task_struct *parent;
pid_t ppid;

// Two reads plus manual offset arithmetic
bpf_probe_read(&parent, sizeof(parent),
               (void *)task + offsetof(struct task_struct, real_parent));
bpf_probe_read(&ppid, sizeof(ppid),
               (void *)parent + offsetof(struct task_struct, pid));

// ❌ Problems:
// 1. offsetof is fixed at compile time from the build headers, so it can be wrong on the target kernel
// 2. No CO-RE, cannot work across kernel versions
// 3. Verbose code, error-prone

Method 2: bpf_core_read (Usable, but Verbose)

c
struct task_struct *task = (struct task_struct *)bpf_get_current_task();
struct task_struct *parent;
pid_t ppid;

// Need 2 steps, 5 lines of code
bpf_core_read(&parent, sizeof(parent), &task->real_parent);  // ✅ CO-RE
bpf_core_read(&ppid, sizeof(ppid), &parent->pid);            // ✅ CO-RE

// ⚠️ Drawbacks:
// 1. Need intermediate variable parent
// 2. Need two function calls
// 3. Manual size specification

Method 3: BPF_CORE_READ (Recommended)

c
struct task_struct *task = (struct task_struct *)bpf_get_current_task();

// Only need 1 line! ✅
pid_t ppid = BPF_CORE_READ(task, real_parent, pid);

// ✅ Advantages:
// 1. Concise and clear code
// 2. Full CO-RE support
// 3. Automatic type and size handling
// 4. Nested access in one call

Common Misconceptions

Misconception 1: Confusing bpf_core_read and the BPF_CORE_READ Macro

c
// ❌ Wrong: passes the field's value where bpf_core_read expects the field's address
bpf_core_read(&ppid, sizeof(ppid), task->real_parent->pid);

// ✅ Correct: Use macro
pid_t ppid = BPF_CORE_READ(task, real_parent, pid);

Misconception 2: Directly Accessing Nested Fields in bpf_core_read

c
// ❌ Wrong: bpf_core_read reads one field per call; task->real_parent is dereferenced directly here
pid_t ppid;
bpf_core_read(&ppid, sizeof(ppid), &task->real_parent->pid);  // ❌

// ✅ Correct: Use BPF_CORE_READ macro
ppid = BPF_CORE_READ(task, real_parent, pid);  // ✅

Misconception 3: Using BPF_CORE_READ Where bpf_probe_read_user Should Be Used

c
// ❌ Wrong: BPF_CORE_READ is for kernel structures, cannot read user-space memory
const char *user_str;  // assume this holds a user-space address, e.g. a syscall argument
char buf[64];
// BPF_CORE_READ(buf, user_str);  // ❌ Wrong!

// ✅ Correct: Use bpf_probe_read_user_str to read user-space strings
bpf_probe_read_user_str(buf, sizeof(buf), user_str);  // ✅

Selection Guide

Decision Tree:

Need to read memory data

  ├─ Reading user-space memory?
  │   └─ Yes → Use bpf_probe_read_user / bpf_probe_read_user_str

  └─ Reading kernel structures?

      ├─ Need CO-RE support?
      │   ├─ No → Use bpf_probe_read (not recommended unless special reason)
      │   └─ Yes ↓

      ├─ Accessing nested fields?
      │   ├─ Yes → Use BPF_CORE_READ macro ⭐⭐⭐⭐⭐ (Recommended!)
      │   └─ No  → Use bpf_core_read or BPF_CORE_READ

      └─ Conclusion: Default to BPF_CORE_READ macro!

Best Practice Recommendations

  1. Prefer BPF_CORE_READ macro
  2. Avoid using bpf_probe_read to read kernel structures
    • For user-space memory, use bpf_probe_read_user / bpf_probe_read_user_str
    • Reserve bpf_probe_read for scenarios that do not need CO-RE at all
  3. bpf_core_read has few direct use cases
    • Only use when special control is needed
    • In most cases, BPF_CORE_READ macro is sufficient

4. Common Incorrect Usage Comparison

Error Example 1: Direct Pointer Access

c
// ❌ Wrong: Direct access (will cause verifier failure)
struct task_struct *task = (struct task_struct *)bpf_get_current_task();
pid_t ppid = task->real_parent->pid;  // Verifier error!

Error Reason:

  • eBPF verifier cannot verify pointer validity
  • Different kernel versions have different offsets

Error Example 2: Using bpf_probe_read

c
// ❌ Not recommended: Using bpf_probe_read (can work, but not best practice)
struct task_struct *task = (struct task_struct *)bpf_get_current_task();
struct task_struct *parent;
pid_t ppid;

bpf_probe_read(&parent, sizeof(parent), &task->real_parent);
bpf_probe_read(&ppid, sizeof(ppid), &parent->pid);

Problems:

  • Verbose code
  • No CO-RE portability
  • Need to manually handle each level of pointer

Correct Example

c
// ✅ Correct: Use BPF_CORE_READ
struct task_struct *task = (struct task_struct *)bpf_get_current_task();
pid_t ppid = BPF_CORE_READ(task, real_parent, pid);

// ✅ Better: Use bpf_get_current_task_btf(), which already returns struct task_struct *
struct task_struct *task = bpf_get_current_task_btf();
pid_t ppid = BPF_CORE_READ(task, real_parent, pid);

2.6 bpf_get_current_task_btf() Function

This is a helper function that returns a BTF-typed pointer, safer than bpf_get_current_task().

Comparison of Two Ways to Get task_struct

| Method | Function | Return Type | Type Safety | Recommendation |
|---|---|---|---|---|
| Traditional way | bpf_get_current_task() | u64 (requires casting) | ❌ Weak | Not recommended |
| BTF way | bpf_get_current_task_btf() | struct task_struct * | ✅ Strong | Recommended |

Usage Example

c
// Method 1: Traditional way
struct task_struct *task = (struct task_struct *)bpf_get_current_task();
pid_t ppid = BPF_CORE_READ(task, real_parent, pid);

// Method 2: BTF way (Recommended); no cast needed
struct task_struct *task = bpf_get_current_task_btf();
pid_t ppid = BPF_CORE_READ(task, real_parent, pid);

Key Differences:

  • bpf_get_current_task_btf() returns a pointer carrying BTF type information
  • eBPF verifier can perform stricter type checking
  • Better error messages and debugging experience

3. Practical Example: Monitoring open System Call

Complete eBPF Kernel Program

File: btf.bpf.c

c
// Example code would go here

User-Space Program

File: btf.c

c
// Example code would go here

4. Common Questions

Q1: What's the relationship between BTF and CO-RE?

Answer:

  • BTF: Type metadata format (data format)
  • CO-RE: Compile Once, Run Everywhere technology (application using BTF)
  • Relationship: CO-RE depends on type information provided by BTF

Q2: Do all kernels support BTF?

Answer: No, the following conditions must be met:

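  • Linux kernel >= 5.2 for CONFIG_DEBUG_INFO_BTF; /sys/kernel/btf/vmlinux is exposed from 5.4 (some distributions backport BTF to older kernels)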
  • Kernel compiled with CONFIG_DEBUG_INFO_BTF=y enabled
  • Check method: ls /sys/kernel/btf/vmlinux
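
The same check can be done programmatically before loading a CO-RE object. The sketch below is user-space C, and `btf_file_present` is a hypothetical helper name. It tests that the file is readable and starts with the BTF magic number 0xeB9F (the value of BTF_MAGIC in the kernel's UAPI header linux/btf.h), assuming the file uses the host's byte order, as the running kernel's own BTF does.

```c
#include <stdint.h>
#include <stdio.h>

#define TOY_BTF_MAGIC 0xeB9F  /* same value as BTF_MAGIC in <linux/btf.h> */

/* Returns 1 if path is readable and begins with the BTF magic, else 0. */
static int btf_file_present(const char *path)
{
    uint16_t magic = 0;
    FILE *f = fopen(path, "rb");
    int ok;

    if (!f)
        return 0;
    /* The BTF header starts with a 16-bit magic field. */
    ok = fread(&magic, sizeof(magic), 1, f) == 1 && magic == TOY_BTF_MAGIC;
    fclose(f);
    return ok;
}
```

A loader could call `btf_file_present("/sys/kernel/btf/vmlinux")` and fall back to an externally supplied BTF file, or refuse to load, when it returns 0.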

Q3: What's the difference between bpf_probe_read, bpf_core_read, and BPF_CORE_READ?

Answer: These three are different ways to read memory data in eBPF. For detailed comparison, please refer to Section 2.5.

Quick Summary:

| Feature | bpf_probe_read | bpf_core_read | BPF_CORE_READ |
|---|---|---|---|
| Type | Helper function | Macro | Macro |
| CO-RE Support | ❌ No | ✅ Yes | ✅ Yes |
| Nested Access | ❌ Need multiple calls | ❌ Need multiple calls | ✅ One line |
| Type Safety | ❌ Weak | ✅ Strong | ✅ Strong |
| Recommendation | ⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐⭐⭐ |

Decision Guide:

  • 🥇 Prefer BPF_CORE_READ macro: Read kernel structure fields (especially nested fields)
  • 🥈 Occasionally use bpf_core_read: Single field and need special control
  • 🥉 Avoid bpf_probe_read: Only for scenarios that do not need CO-RE at all; for user-space memory use bpf_probe_read_user

Example:

c
// ⭐⭐⭐⭐⭐ Recommended: BPF_CORE_READ macro
pid_t ppid = BPF_CORE_READ(task, real_parent, pid);  // One line!

// ⭐⭐⭐ Usable: bpf_core_read
bpf_core_read(&parent, sizeof(parent), &task->real_parent);
bpf_core_read(&ppid, sizeof(ppid), &parent->pid);  // Need two steps

// ⭐⭐ Not recommended: bpf_probe_read
bpf_probe_read(&parent, sizeof(parent), &task->real_parent);
bpf_probe_read(&ppid, sizeof(ppid), &parent->pid);  // No CO-RE

Q4: Why use bpf_get_current_task() in some cases and bpf_get_current_task_btf() in others?

Answer:

| Function | Return Type | Kernel Requirement | Recommendation |
|---|---|---|---|
| bpf_get_current_task() | u64 (requires casting) | >= 4.8 | High compatibility |
| bpf_get_current_task_btf() | struct task_struct * | >= 5.11 | Type safe |

Recommendation:

  • If only supporting new kernels (>= 5.11): Use bpf_get_current_task_btf()
  • If you need to support older kernels: Use bpf_get_current_task() + casting

Released under the MIT License.