eBPF BTF (BPF Type Format) Programming Guide

1. Introduction to BTF

What is BTF?

BTF (BPF Type Format) is a type metadata format provided by the Linux kernel, used to describe type information for eBPF programs and kernel data structures.

Core Advantages of BTF

✅ Compile Once, Run Everywhere (CO-RE): No need to recompile on target machines
✅ Kernel Structure Access: Safely read kernel data structures
✅ Type Safety: Compile-time type compatibility checking
✅ Debug Friendly: Provides rich type information

Problems Solved by BTF

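Before BTF, an eBPF program that read kernel structures had to hard-code field offsets taken from the headers it was compiled against, so it could silently read the wrong memory whenever the structure layout on the target kernel differed. The task_struct example below shows the problem.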

task_struct Structure Example (Simplified)

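task_struct is the core data structure in the Linux kernel that describes processes. Its size and layout can differ across kernel versions and build configurations. The offsets shown below are illustrative rather than taken from a specific build.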

Example 1: task_struct in Linux 5.10 Kernel (Simplified)

struct task_struct {
    struct thread_info thread_info;    // Offset: 0    (Size: 16 bytes)
    volatile long state;                // Offset: 16   (Size: 8 bytes; renamed to __state in 5.14)
    void *stack;                        // Offset: 24   (Size: 8 bytes)
    refcount_t usage;                   // Offset: 32   (Size: 4 bytes)
    unsigned int flags;                 // Offset: 36   (Size: 4 bytes)
    // ... hundreds of bytes of other fields omitted ...

    pid_t pid;                          // Offset: 1232 (Size: 4 bytes) ⬅️ Here!
    pid_t tgid;                         // Offset: 1236 (Size: 4 bytes)

    struct task_struct *real_parent;   // Offset: 1256 (Size: 8 bytes)
    struct task_struct *parent;         // Offset: 1264 (Size: 8 bytes)

    char comm[16];                      // Offset: 1784 (Size: 16 bytes)
    struct mm_struct *mm;               // Offset: 1848 (Size: 8 bytes)
    // ... more fields ...
};

Example 2: task_struct in Linux 6.1 Kernel (Simplified)

struct task_struct {
    struct thread_info thread_info;    // Offset: 0    (Size: 16 bytes)
    unsigned int __state;               // Offset: 16   (Size: 4 bytes)
    void *stack;                        // Offset: 24   (Size: 8 bytes)
    refcount_t usage;                   // Offset: 32   (Size: 4 bytes)
    unsigned int flags;                 // Offset: 36   (Size: 4 bytes)

    // ⚠️ Fields added, removed, or resized in the omitted regions shift later offsets
    unsigned int ptrace;                // Offset: 40   (also exists in 5.10, omitted above)
    int on_rq;                          // Offset: 44   (also exists in 5.10, omitted above)
    // ... other fields omitted ...

    pid_t pid;                          // Offset: 1368 (Size: 4 bytes) ⬅️ Offset changed!
    pid_t tgid;                         // Offset: 1372 (Size: 4 bytes)

    struct task_struct *real_parent;   // Offset: 1392 (Size: 8 bytes) ⬅️ Also changed!
    struct task_struct *parent;         // Offset: 1400 (Size: 8 bytes)

    char comm[16];                      // Offset: 1920 (Size: 16 bytes) ⬅️ Also changed!
    struct mm_struct *mm;               // Offset: 1984 (Size: 8 bytes)
    // ... more fields ...
};

Offset Calculation Example

Suppose we want to read the pid field:

// ❌ Wrong way: Hard-coded offsets
struct task_struct *task = (struct task_struct *)bpf_get_current_task();
int pid;

// On Linux 5.10
bpf_probe_read(&pid, sizeof(pid), (void *)task + 1232);  // pid at offset 1232

// But on Linux 6.1, the same code reads the wrong location!
bpf_probe_read(&pid, sizeof(pid), (void *)task + 1232);  // ❌ Should be 1368!
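
The same failure can be reproduced outside the kernel. The following user-space sketch models two hypothetical layouts of one structure; the names `task_v1`, `task_v2`, and `read_int_at` are invented for illustration and do not correspond to real kernel definitions. Only the mechanism, a byte offset taken from one layout and applied to another, mirrors the example above.

```c
#include <stddef.h>
#include <string.h>

/* Two hypothetical layouts of "the same" struct in different versions. */
struct task_v1 {
    unsigned int flags;
    int pid;
    int tgid;
};

struct task_v2 {
    unsigned int flags;
    unsigned int extra;   /* field that exists only in the newer layout */
    int pid;
    int tgid;
};

/* Read an int at a raw byte offset, the way a hard-coded probe read does. */
static int read_int_at(const void *base, size_t offset)
{
    int value;

    memcpy(&value, (const char *)base + offset, sizeof(value));
    return value;
}
```

Calling `read_int_at(&t, offsetof(struct task_v1, pid))` on a `struct task_v2` object returns the contents of `extra`, not the PID, and nothing in the code signals the mistake.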

BTF's Solution

// BTF + CO-RE approach - Automatically handles offsets
struct task_struct *task = (struct task_struct *)bpf_get_current_task();
pid_t pid = BPF_CORE_READ(task, pid);  // ✅ Automatic adaptation!

Advantages:

✅ Compiler records a relocation for each field access instead of baking in a fixed offset
✅ libbpf rewrites those offsets at load time to match the running kernel's BTF
✅ Type-safe access method
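
As a rough mental model of this adaptation, the user-space sketch below resolves a field by name against a small description table that stands in for the target kernel's BTF. Everything here (`field_desc`, `resolve_offset`, `read_int_field`) is hypothetical and greatly simplified: real CO-RE resolves each access once at load time and patches the result into the program's instructions, whereas this toy performs the lookup on every call to stay short.

```c
#include <stddef.h>
#include <string.h>

/* One entry of a toy type description: field name and byte offset. */
struct field_desc {
    const char *name;
    size_t offset;
};

/* Look up a field's offset in the "running kernel's" description.
 * Returns 0 on success, -1 if the field does not exist there. */
static int resolve_offset(const struct field_desc *fields, size_t count,
                          const char *name, size_t *offset_out)
{
    for (size_t i = 0; i < count; i++) {
        if (strcmp(fields[i].name, name) == 0) {
            *offset_out = fields[i].offset;
            return 0;
        }
    }
    return -1;
}

/* Read an int field by name, using the resolved offset instead of a constant. */
static int read_int_field(const void *base, const struct field_desc *fields,
                          size_t count, const char *name, int *value_out)
{
    size_t offset;

    if (resolve_offset(fields, count, name, &offset) != 0)
        return -1;
    memcpy(value_out, (const char *)base + offset, sizeof(*value_out));
    return 0;
}
```

The caller's code stays identical for every layout; only the description table changes, which is the property CO-RE gets from BTF.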

2. BTF Core Concepts

2.1 vmlinux.h

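vmlinux.h is a header file containing the kernel's type definitions (structs, unions, enums, and typedefs), generated by bpftool from BTF information. Because BTF records only types, vmlinux.h does not contain #define macros.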

Generating vmlinux.h

# Generate from current kernel
bpftool btf dump file /sys/kernel/btf/vmlinux format c > vmlinux.h

# Check if kernel supports BTF
ls /sys/kernel/btf/vmlinux

Advantages of vmlinux.h

// Traditional way - Need to include multiple header files
#include <linux/sched.h>
#include <linux/fs.h>
#include <linux/mm.h>
// ... potentially dozens of header files

// BTF way - Only one header file needed
#include "vmlinux.h"  // ✅ Contains all kernel type definitions (but no #define macros)

2.2 BPF_CORE_READ Macro

BPF_CORE_READ is the core macro of CO-RE, used to safely read kernel structure fields.

Syntax Format

// Basic usage
BPF_CORE_READ(ptr, field)
// Single-level access equivalent to traditional pointer access
ptr->field

// Multi-level nested access
BPF_CORE_READ(ptr, field1, field2, field3)

// Multi-level nested access equivalent to traditional pointer access
ptr->field1->field2->field3

Usage Examples

struct task_struct *task = (struct task_struct *)bpf_get_current_task();

// Read single field
pid_t pid = BPF_CORE_READ(task, pid);

// Read nested field
pid_t ppid = BPF_CORE_READ(task, real_parent, pid);

// Equivalent to
// task->real_parent->pid
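
To make the nested form less magical, here is a user-space sketch of roughly what a two-level read does: one read fetches the intermediate pointer, a second read fetches the final field. The `struct task` type and the `read_bytes` and `read_parent_pid` helpers are stand-ins invented for this example; in a BPF program each `read_bytes` call would be a checked kernel read with a relocatable offset.

```c
#include <stddef.h>
#include <string.h>

/* Minimal stand-in for the two fields used in the example above. */
struct task {
    int pid;
    struct task *real_parent;
};

/* Copy `size` bytes from `src` into `dst`; stands in for one probe read. */
static void read_bytes(void *dst, size_t size, const void *src)
{
    memcpy(dst, src, size);
}

/* Rough equivalent of BPF_CORE_READ(task, real_parent, pid):
 * one read for the intermediate pointer, one for the final field. */
static int read_parent_pid(const struct task *task)
{
    struct task *parent;
    int ppid;

    read_bytes(&parent, sizeof(parent), &task->real_parent);
    read_bytes(&ppid, sizeof(ppid), &parent->pid);
    return ppid;
}
```

Each additional field in the macro's argument list adds one more pointer read to the chain.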

2.3 BPF_CORE_READ_INTO() Macro

BPF_CORE_READ_INTO (Read to Variable)

struct task_struct *task = (struct task_struct *)bpf_get_current_task();
pid_t ppid;

// Read value into specified variable
BPF_CORE_READ_INTO(&ppid, task, real_parent, pid);

2.4 BPF_CORE_READ_STR_INTO() Macro

BPF_CORE_READ_STR_INTO (Read String)

struct task_struct *task = (struct task_struct *)bpf_get_current_task();
char comm[16];

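// Read process name. Pass a pointer to the array: the macro sizes the read
// with sizeof(*dst), which is 16 for &comm but only 1 for a bare comm
BPF_CORE_READ_STR_INTO(&comm, task, comm);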
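
The destination argument deserves care. The user-space sketch below assumes, based on libbpf's bpf_core_read.h, that the macro derives the buffer size from `sizeof(*(dst))`; `read_str` and `READ_STR_INTO` are hypothetical stand-ins written for this example, not libbpf code. The stand-in follows the string helper's contract of always NUL-terminating and returning the number of bytes written including the NUL.

```c
#include <stddef.h>

/* Copy at most size-1 bytes, always NUL-terminate, and return the number
 * of bytes written including the terminating NUL. */
static long read_str(void *dst, size_t size, const char *src)
{
    char *out = dst;
    size_t i = 0;

    if (size == 0)
        return 0;
    while (i + 1 < size && src[i] != '\0') {
        out[i] = src[i];
        i++;
    }
    out[i] = '\0';
    return (long)(i + 1);
}

/* Size is taken from what `dst` points to, as the real macro is assumed to do. */
#define READ_STR_INTO(dst, src) \
    read_str((void *)(dst), sizeof(*(dst)), (src))
```

With `char comm[16]`, `READ_STR_INTO(&comm, name)` copies up to 15 characters, while `READ_STR_INTO(comm, name)` sees a one-byte destination and writes only the terminating NUL.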

2.5 bpf_probe_read vs bpf_core_read vs BPF_CORE_READ Detailed Explanation

These three are different ways to read memory data in eBPF and are easily confused. Let's compare them in detail:

Core Differences Overview

| Feature | bpf_probe_read | bpf_core_read | BPF_CORE_READ |
|---|---|---|---|
| Type | Kernel helper function | Function-like macro | Macro |
| Definition Location | Kernel | libbpf header file (bpf_core_read.h) | libbpf header file (bpf_core_read.h) |
| CO-RE Support | ❌ No | ✅ Yes | ✅ Yes |
| Type Safety | ❌ Weak (void *) | ✅ Strong | ✅ Strong |
| Use Case | Read arbitrary memory | Read single field | Read nested fields |
| Recommendation | ⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐⭐⭐ |

1. bpf_probe_read - Traditional Memory Read Function

Function Prototype:

long bpf_probe_read(void *dst, u32 size, const void *unsafe_ptr);

Characteristics:

Lowest-level memory read function
Requires manual size specification
No type checking
Does not support CO-RE

Usage Example:

struct task_struct *task = (struct task_struct *)bpf_get_current_task();
struct task_struct *parent;
pid_t ppid;

// Read real_parent pointer
bpf_probe_read(&parent, sizeof(parent), &task->real_parent);

// Read parent->pid
bpf_probe_read(&ppid, sizeof(ppid), &parent->pid);

Problems:

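❌ Field offsets are fixed at compile time, from whatever headers the program was built against
❌ Nested access requires multiple calls
❌ No CO-RE relocation, so the program may read the wrong location on a different kernel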
❌ Verbose code

Applicable Scenarios:

Reading arbitrary memory addresses (for user-space addresses, prefer bpf_probe_read_user, available since Linux 5.5)
Scenarios unrelated to BTF/CO-RE
Need precise control over read behavior

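2. bpf_core_read - CO-RE Read Macro

Macro Definition (from libbpf's bpf_core_read.h):

#define bpf_core_read(dst, sz, src) \
    bpf_probe_read_kernel(dst, sz, (const void *)__builtin_preserve_access_index(src))

Characteristics:

Function-like macro provided by libbpf, not by the kernel
Supports CO-RE relocation through __builtin_preserve_access_index
Requires manual size specification
Performs one read per call, so each pointer hop needs its own call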

Usage Example:

struct task_struct *task = (struct task_struct *)bpf_get_current_task();
pid_t pid;

// Read single field - Correct usage
bpf_core_read(&pid, sizeof(pid), &task->pid);  // ✅

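// Read nested field - Wrong usage!
// bpf_core_read(&ppid, sizeof(ppid), &task->real_parent->pid);
// ❌ This compiles, but computing the address loads task->real_parent directly,
//    which the verifier typically rejects for a pointer from bpf_get_current_task()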

Limitations:

⚠️ Cannot directly access nested fields (like task->real_parent->pid)
⚠️ Need to manually specify size
⚠️ Still relatively verbose

Correct Nested Access Method:

// Need to read in two steps
struct task_struct *task = (struct task_struct *)bpf_get_current_task();
struct task_struct *parent;
pid_t ppid;

// Step 1: Read parent pointer
bpf_core_read(&parent, sizeof(parent), &task->real_parent);

// Step 2: Read parent->pid
bpf_core_read(&ppid, sizeof(ppid), &parent->pid);

Applicable Scenarios:

Reading single simple fields
Need CO-RE with an explicit destination buffer and size
Need to check the return code of each read
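
One practical reason to call bpf_core_read directly is that every call returns an error code, whereas BPF_CORE_READ returns the value itself and leaves no room for one. The user-space sketch below illustrates the two-step pattern with a check after each step. `checked_read` is a hypothetical stand-in that fails on a NULL source and, like the kernel read helpers, zeroes the destination on failure; `struct task` is a two-field stand-in for the real structure.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

struct task {
    int pid;
    struct task *real_parent;
};

/* Stand-in for one checked read: fails on a NULL source and zeroes the
 * destination when it fails. */
static int checked_read(void *dst, size_t size, const void *src)
{
    if (src == NULL) {
        memset(dst, 0, size);
        return -EFAULT;
    }
    memcpy(dst, src, size);
    return 0;
}

/* Two explicit steps, each with its own error check. */
static int read_parent_pid(const struct task *task, int *ppid_out)
{
    struct task *parent;
    const void *pid_addr;
    int err;

    /* Step 1: read the parent pointer. */
    err = checked_read(&parent, sizeof(parent), &task->real_parent);
    if (err)
        return err;

    /* Step 2: read parent->pid; a NULL parent makes this read fail. */
    pid_addr = parent ? (const void *)&parent->pid : NULL;
    return checked_read(ppid_out, sizeof(*ppid_out), pid_addr);
}
```

A caller can then distinguish "parent PID is 0" from "the read failed", which the value-returning macro cannot express.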

3. BPF_CORE_READ - Recommended CO-RE Macro ⭐⭐⭐⭐⭐

Macro Definition (Simplified):

#define BPF_CORE_READ(src, a, ...)  \
({  \
    /* Record access path at compile time */  \
    /* Generate CO-RE relocation information */  \
    /* Return read value */  \
})

Characteristics:

Macro provided by libbpf
Full CO-RE support
Supports nested field access
Automatically infers type and size
Most concise code

Usage Example:

struct task_struct *task = (struct task_struct *)bpf_get_current_task();

// Read single field
pid_t pid = BPF_CORE_READ(task, pid);

// Read nested field - One line! ✅
pid_t ppid = BPF_CORE_READ(task, real_parent, pid);

Advantages:

✅ Most concise code (nested access in one line)
✅ Type safe (compile-time checking)
✅ Automatic offset handling
✅ Full CO-RE support

Applicable Scenarios:

Reading kernel structure fields (Recommended!)
Need CO-RE support
Want concise, readable code

Practical Comparison: Reading Parent Process PID

Scenario: Read task->real_parent->pid

Method 1: bpf_probe_read (Not Recommended)

struct task_struct *task = (struct task_struct *)bpf_get_current_task();
struct task_struct *parent;
pid_t ppid;

// Two reads, each with a manually computed offset
bpf_probe_read(&parent, sizeof(parent),
               (void *)task + offsetof(struct task_struct, real_parent));
bpf_probe_read(&ppid, sizeof(ppid),
               (void *)parent + offsetof(struct task_struct, pid));

// ❌ Problems:
// 1. offsetof is resolved at compile time against the build headers, so it can be wrong on the running kernel
// 2. No CO-RE, cannot work across kernel versions
// 3. Verbose code, error-prone

Method 2: bpf_core_read (Usable, but Verbose)

struct task_struct *task = (struct task_struct *)bpf_get_current_task();
struct task_struct *parent;
pid_t ppid;

// Two steps: read the pointer, then read the field
bpf_core_read(&parent, sizeof(parent), &task->real_parent);  // ✅ CO-RE
bpf_core_read(&ppid, sizeof(ppid), &parent->pid);            // ✅ CO-RE

// ⚠️ Drawbacks:
// 1. Need intermediate variable parent
// 2. Need two function calls
// 3. Manual size specification

Method 3: BPF_CORE_READ (Recommended!) ⭐⭐⭐⭐⭐

struct task_struct *task = (struct task_struct *)bpf_get_current_task();

// Only need 1 line! ✅
pid_t ppid = BPF_CORE_READ(task, real_parent, pid);

// ✅ Advantages:
// 1. Concise and clear code
// 2. Full CO-RE support
// 3. Automatic type and size handling
// 4. Nested access in one call

Common Misconceptions

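Misconception 1: Confusing the Calling Conventions of bpf_core_read and BPF_CORE_READ

// ❌ Wrong: bpf_core_read expects the address of the field as its third argument,
//    not the field's value, and this expression also dereferences real_parent directly
bpf_core_read(&ppid, sizeof(ppid), task->real_parent->pid);

// ✅ Correct: let BPF_CORE_READ walk the pointer chain
pid_t ppid = BPF_CORE_READ(task, real_parent, pid);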

Misconception 2: Directly Accessing Nested Fields in bpf_core_read

// ❌ Wrong: bpf_core_read performs a single read; it does not follow real_parent for you
pid_t ppid;
bpf_core_read(&ppid, sizeof(ppid), &task->real_parent->pid);  // ❌

// ✅ Correct: Use BPF_CORE_READ macro
ppid = BPF_CORE_READ(task, real_parent, pid);  // ✅

Misconception 3: Using BPF_CORE_READ Where bpf_probe_read_user Should Be Used

// ❌ Wrong: BPF_CORE_READ is for kernel structures, cannot read user-space memory
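const char *user_str = (const char *)PT_REGS_PARM1(ctx);  // user-space pointer from the probed function's first argument (ctx assumed to be struct pt_regs *)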
char buf[64];
// BPF_CORE_READ(buf, user_str);  // ❌ Wrong!

// ✅ Correct: Use bpf_probe_read_user_str to read user-space strings
bpf_probe_read_user_str(buf, sizeof(buf), user_str);  // ✅

Selection Guide

Decision Tree:

Need to read memory data
  │
  ├─ Reading user-space memory?
  │   └─ Yes → Use bpf_probe_read_user / bpf_probe_read_user_str
  │
  └─ Reading kernel structures?
      │
      ├─ Need CO-RE support?
      │   ├─ No → Use bpf_probe_read (not recommended unless special reason)
      │   └─ Yes ↓
      │
      ├─ Accessing nested fields?
      │   ├─ Yes → Use BPF_CORE_READ macro ⭐⭐⭐⭐⭐ (Recommended!)
      │   └─ No  → Use bpf_core_read or BPF_CORE_READ
      │
      └─ Conclusion: Default to BPF_CORE_READ macro!
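
The tree can also be written down as a small function, which makes the order of the questions explicit. This is only a restatement of the guide's own decision tree in plain C; the `enum reader` values and `choose_reader` are names made up for this sketch.

```c
#include <stdbool.h>

enum reader {
    USE_PROBE_READ_USER,  /* bpf_probe_read_user / bpf_probe_read_user_str */
    USE_PROBE_READ,       /* bpf_probe_read */
    USE_CORE_READ_EITHER, /* bpf_core_read or BPF_CORE_READ */
    USE_CORE_READ_MACRO   /* BPF_CORE_READ */
};

/* Walk the decision tree top to bottom: the user-space question comes first,
 * then the CO-RE question, then whether the access is nested. */
static enum reader choose_reader(bool user_space, bool need_core, bool nested)
{
    if (user_space)
        return USE_PROBE_READ_USER;
    if (!need_core)
        return USE_PROBE_READ;
    return nested ? USE_CORE_READ_MACRO : USE_CORE_READ_EITHER;
}
```

For the common case of a kernel structure read that needs CO-RE, both remaining outcomes include BPF_CORE_READ, which is why it is the default.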

Best Practice Recommendations

Prefer BPF_CORE_READ macro
Avoid using bpf_probe_read to read kernel structures
- Use it only in scenarios that do not need CO-RE at all
- For user-space memory, use bpf_probe_read_user / bpf_probe_read_user_str instead
bpf_core_read has few use cases
- Only use when special control is needed, such as checking each read's return code
- In most cases, BPF_CORE_READ macro is sufficient

4. Common Incorrect Usage Comparison

Error Example 1: Direct Pointer Access

// ❌ Wrong: Direct access (will cause verifier failure)
struct task_struct *task = (struct task_struct *)bpf_get_current_task();
pid_t ppid = task->real_parent->pid;  // Verifier error!

Error Reason:

The verifier sees the value returned by bpf_get_current_task() as a plain integer, so it cannot verify that dereferencing it is safe
Different kernel versions have different offsets

Error Example 2: Using bpf_probe_read

// ❌ Not recommended: Using bpf_probe_read (can work, but not best practice)
struct task_struct *task = (struct task_struct *)bpf_get_current_task();
struct task_struct *parent;
pid_t ppid;

bpf_probe_read(&parent, sizeof(parent), &task->real_parent);
bpf_probe_read(&ppid, sizeof(ppid), &parent->pid);

Problems:

Verbose code
No CO-RE portability
Need to manually handle each level of pointer

Correct Example

// ✅ Correct: Use BPF_CORE_READ
struct task_struct *task = (struct task_struct *)bpf_get_current_task();
pid_t ppid = BPF_CORE_READ(task, real_parent, pid);

// ✅ Better: bpf_get_current_task_btf() already returns a typed pointer, so no cast is needed
struct task_struct *btf_task = bpf_get_current_task_btf();
pid_t btf_ppid = BPF_CORE_READ(btf_task, real_parent, pid);

2.6 bpf_get_current_task_btf() Function

This is a helper function that returns a BTF-typed pointer, safer than bpf_get_current_task().

Comparison of Two Ways to Get task_struct

| Method | Function | Return Type | Type Safety | Recommendation |
|---|---|---|---|---|
| Traditional way | `bpf_get_current_task()` | `__u64` (requires casting) | ❌ Weak | Not recommended |
| BTF way | `bpf_get_current_task_btf()` | `struct task_struct *` | ✅ Strong | Recommended |

Usage Example

// Method 1: Traditional way (returns an integer, so a cast is required)
struct task_struct *task = (struct task_struct *)bpf_get_current_task();
pid_t ppid = BPF_CORE_READ(task, real_parent, pid);

// Method 2: BTF way (Recommended; the pointer is already typed)
struct task_struct *btf_task = bpf_get_current_task_btf();
pid_t btf_ppid = BPF_CORE_READ(btf_task, real_parent, pid);

Key Differences:

bpf_get_current_task_btf() returns a pointer carrying BTF type information
eBPF verifier can perform stricter type checking
Better error messages and debugging experience

3. Practical Example: Monitoring open System Call

Complete eBPF Kernel Program

File: btf.bpf.c

// Example code would go here

User-Space Program

File: btf.c

// Example code would go here

4. Common Questions

Q1: What's the relationship between BTF and CO-RE?

Answer:

BTF: Type metadata format (data format)
CO-RE: Compile Once, Run Everywhere technology (application using BTF)
Relationship: CO-RE depends on type information provided by BTF

Q2: Do all kernels support BTF?

Answer: No, the following conditions must be met:

Linux kernel >= 5.2 for kernel BTF support (CONFIG_DEBUG_INFO_BTF); /sys/kernel/btf/vmlinux is exposed from 5.4
Kernel compiled with CONFIG_DEBUG_INFO_BTF=y enabled
Check method: ls /sys/kernel/btf/vmlinux
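
The same check can be done programmatically before attempting to load a CO-RE program. The sketch below is a minimal user-space version using the POSIX `access` call; the function name `btf_available` is made up for this example, and real loaders may fall back to other BTF sources that this check does not cover.

```c
#include <unistd.h>

/* Returns 1 if the BTF blob at `path` exists and is readable, else 0. */
static int btf_available(const char *path)
{
    return access(path, R_OK) == 0;
}
```

A loader could call `btf_available("/sys/kernel/btf/vmlinux")` and print a clear message instead of failing later with a less obvious load error.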

Q3: What's the difference between bpf_probe_read, bpf_core_read, and BPF_CORE_READ?

Answer: These three are different ways to read memory data in eBPF. For detailed comparison, please refer to Section 2.5.

Quick Summary:

| Feature | bpf_probe_read | bpf_core_read | BPF_CORE_READ |
|---|---|---|---|
| Type | Kernel helper function | Function-like macro | Macro |
| CO-RE Support | ❌ No | ✅ Yes | ✅ Yes |
| Nested Access | ❌ Need multiple calls | ❌ Need multiple calls | ✅ One line |
| Type Safety | ❌ Weak | ✅ Strong | ✅ Strong |
| Recommendation | ⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐⭐⭐ |

Decision Guide:

🥇 Prefer BPF_CORE_READ macro: Read kernel structure fields (especially nested fields)
🥈 Occasionally use bpf_core_read: Single field and need special control
🥉 Avoid bpf_probe_read: Only for scenarios that do not need CO-RE at all; for user-space memory, use bpf_probe_read_user

Example:

// ⭐⭐⭐⭐⭐ Recommended: BPF_CORE_READ macro
pid_t ppid = BPF_CORE_READ(task, real_parent, pid);  // One line!

// ⭐⭐⭐ Usable: bpf_core_read (single-read macro)
bpf_core_read(&parent, sizeof(parent), &task->real_parent);
bpf_core_read(&ppid, sizeof(ppid), &parent->pid);  // Need two steps

// ⭐⭐ Not recommended: bpf_probe_read
bpf_probe_read(&parent, sizeof(parent), &task->real_parent);
bpf_probe_read(&ppid, sizeof(ppid), &parent->pid);  // No CO-RE

Q4: Why use bpf_get_current_task() in some cases and bpf_get_current_task_btf() in others?

Answer:

| Function | Return Type | Kernel Requirement | Recommendation |
|---|---|---|---|
| `bpf_get_current_task()` | `__u64` (requires casting) | >= 4.8 | High compatibility |
| `bpf_get_current_task_btf()` | `struct task_struct *` | >= 5.11 | Type safe |

Recommendation:

If only supporting new kernels (>= 5.11): Use bpf_get_current_task_btf()
If you need to support older kernels: Use bpf_get_current_task() + casting
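
If user-space code needs to make this choice at run time, one simple approach is to compare the kernel release string (for example the `release` field filled in by `uname(2)`) against a minimum version. The sketch below is a hypothetical helper, `kernel_at_least`, that takes the minimum version as parameters. Treat it as a rough heuristic: distribution kernels often backport features, so probing for the helper itself is generally more reliable than comparing version numbers.

```c
#include <stdio.h>

/* Parse "major.minor" from a release string such as "5.15.0-91-generic".
 * Returns 1 if the release is at least major.minor, 0 if it is older,
 * and -1 if the string cannot be parsed. */
static int kernel_at_least(const char *release, int major, int minor)
{
    int rel_major, rel_minor;

    if (sscanf(release, "%d.%d", &rel_major, &rel_minor) != 2)
        return -1;
    if (rel_major != major)
        return rel_major > major;
    return rel_minor >= minor;
}
```

The loader can then pick which variant of the BPF program to load based on the result.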
