234 lines
6.8 KiB
Markdown
234 lines
6.8 KiB
Markdown
# eBPF Integration for Linux Diagnostic Agent
|
|
|
|
The Linux Diagnostic Agent now includes comprehensive eBPF (Extended Berkeley Packet Filter) capabilities for advanced system monitoring and investigation during diagnostic sessions.
|
|
|
|
## eBPF Capabilities
|
|
|
|
### Available Monitoring Types
|
|
|
|
1. **System Call Tracing** (`syscall_trace`)
|
|
- Monitors all system calls made by processes
|
|
- Useful for debugging process behavior and API usage
|
|
- Can filter by process ID or name
|
|
|
|
2. **Network Activity Tracing** (`network_trace`)
|
|
- Tracks TCP/UDP send/receive operations
|
|
- Monitors network connections and data flow
|
|
- Identifies network-related bottlenecks
|
|
|
|
3. **Process Monitoring** (`process_trace`)
|
|
- Tracks process creation, execution, and termination
|
|
- Monitors process lifecycle events
|
|
- Useful for debugging startup issues
|
|
|
|
4. **File System Monitoring** (`file_trace`)
|
|
- Monitors file open, create, delete operations
|
|
- Tracks file access patterns
|
|
- Can filter by specific paths
|
|
|
|
5. **Performance Monitoring** (`performance`)
|
|
- Collects CPU, memory, and I/O metrics
|
|
- Provides detailed performance profiling
|
|
- Uses perf integration when available
|
|
|
|
6. **Security Event Monitoring** (`security_event`)
|
|
- Detects privilege escalation attempts
|
|
- Monitors security-relevant system calls
|
|
- Tracks suspicious activities
|
|
|
|
## How eBPF Integration Works
|
|
|
|
### AI-Driven eBPF Selection
|
|
|
|
The AI agent can automatically request eBPF monitoring by including specific fields in its diagnostic response:
|
|
|
|
```json
|
|
{
|
|
"response_type": "diagnostic",
|
|
"reasoning": "Need to trace network activity to diagnose connection timeout issues",
|
|
"commands": [
|
|
{"id": "basic_net", "command": "ss -tulpn", "description": "Current network connections"},
|
|
{"id": "net_config", "command": "ip route show", "description": "Network configuration"}
|
|
],
|
|
"ebpf_capabilities": ["network_trace", "syscall_trace"],
|
|
"ebpf_duration_seconds": 15,
|
|
"ebpf_filters": {
|
|
"comm": "nginx",
|
|
"path": "/etc"
|
|
}
|
|
}
|
|
```
|
|
|
|
### eBPF Trace Execution
|
|
|
|
1. eBPF traces run in parallel with regular diagnostic commands
|
|
2. Multiple eBPF capabilities can be activated simultaneously
|
|
3. Traces collect structured JSON events in real-time
|
|
4. Results are automatically parsed and included in the diagnostic data
|
|
|
|
### Event Data Structure
|
|
|
|
eBPF events follow a consistent structure:
|
|
|
|
```json
|
|
{
|
|
"timestamp": 1634567890000000000,
|
|
"event_type": "syscall_enter",
|
|
"process_id": 1234,
|
|
"process_name": "nginx",
|
|
"user_id": 1000,
|
|
"data": {
|
|
"syscall": "openat",
|
|
"filename": "/etc/nginx/nginx.conf"
|
|
}
|
|
}
|
|
```
|
|
|
|
## Installation and Setup
|
|
|
|
### Prerequisites
|
|
|
|
The agent automatically detects available eBPF tools and capabilities. For full functionality, install:
|
|
|
|
**Ubuntu/Debian:**
|
|
```bash
|
|
sudo apt update
|
|
sudo apt install bpftrace linux-tools-generic linux-tools-$(uname -r)
|
|
sudo apt install bcc-tools python3-bcc # Optional, for additional tools
|
|
```
|
|
|
|
**RHEL/CentOS/Fedora:**
|
|
```bash
|
|
sudo dnf install bpftrace perf bcc-tools python3-bcc
|
|
```
|
|
|
|
**openSUSE:**
|
|
```bash
|
|
sudo zypper install bpftrace perf
|
|
```
|
|
|
|
### Automated Setup
|
|
|
|
Use the included helper script:
|
|
|
|
```bash
|
|
# Check current eBPF capabilities
|
|
./ebpf_helper.sh check
|
|
|
|
# Install eBPF tools (requires root)
|
|
sudo ./ebpf_helper.sh install
|
|
|
|
# Create monitoring scripts
|
|
./ebpf_helper.sh setup
|
|
|
|
# Test eBPF functionality
|
|
sudo ./ebpf_helper.sh test
|
|
```
|
|
|
|
## Usage Examples
|
|
|
|
### Network Issue Diagnosis
|
|
|
|
When describing network problems, the AI may automatically request network tracing:
|
|
|
|
```
|
|
User: "Web server is experiencing intermittent connection timeouts"
|
|
|
|
AI Response: Includes network_trace and syscall_trace capabilities
|
|
eBPF Output: Real-time network send/receive events, connection attempts, and related system calls
|
|
```
|
|
|
|
### Performance Issue Investigation
|
|
|
|
For performance problems, the AI can request comprehensive monitoring:
|
|
|
|
```
|
|
User: "System is running slowly, high CPU usage"
|
|
|
|
AI Response: Includes process_trace, performance, and syscall_trace
|
|
eBPF Output: Process execution patterns, performance metrics, and system call analysis
|
|
```
|
|
|
|
### Security Incident Analysis
|
|
|
|
For security concerns, specialized monitoring is available:
|
|
|
|
```
|
|
User: "Suspicious activity detected, possible privilege escalation"
|
|
|
|
AI Response: Includes security_event, process_trace, and file_trace
|
|
eBPF Output: Security-relevant events, process behavior, and file access patterns
|
|
```
|
|
|
|
## Filtering Options
|
|
|
|
eBPF traces can be filtered for focused monitoring:
|
|
|
|
- **Process ID**: `{"pid": "1234"}` - Monitor specific process
|
|
- **Process Name**: `{"comm": "nginx"}` - Monitor processes by name
|
|
- **File Path**: `{"path": "/etc"}` - Monitor specific path (file tracing)
|
|
|
|
## Integration with Existing Workflow
|
|
|
|
eBPF monitoring integrates seamlessly with the existing diagnostic workflow:
|
|
|
|
1. **Automatic Detection**: Agent detects available eBPF capabilities at startup
|
|
2. **AI Decision Making**: AI decides when eBPF monitoring would be helpful
|
|
3. **Parallel Execution**: eBPF traces run alongside regular diagnostic commands
|
|
4. **Structured Results**: eBPF data is included in command results for AI analysis
|
|
5. **Contextual Analysis**: AI correlates eBPF events with other diagnostic data
|
|
|
|
## Troubleshooting
|
|
|
|
### Common Issues
|
|
|
|
**Permission Errors:**
|
|
- Most eBPF operations require root privileges
|
|
- Run the agent with `sudo` for full eBPF functionality
|
|
|
|
**Tool Not Available:**
|
|
- Use `./ebpf_helper.sh check` to verify available tools
|
|
- Install missing tools with `./ebpf_helper.sh install`
|
|
|
|
**Kernel Compatibility:**
|
|
- eBPF requires Linux kernel 4.4+ (5.0+ recommended)
|
|
- Some features may require newer kernel versions
|
|
|
|
**Debugging eBPF Issues:**
|
|
```bash
|
|
# Check kernel eBPF support
|
|
sudo ./ebpf_helper.sh check
|
|
|
|
# Test basic eBPF functionality
|
|
sudo bpftrace -e 'BEGIN { print("eBPF works!"); exit(); }'
|
|
|
|
# Verify debugfs mount (required for ftrace)
|
|
sudo mount -t debugfs none /sys/kernel/debug
|
|
```
|
|
|
|
## Security Considerations
|
|
|
|
- eBPF monitoring provides deep system visibility
|
|
- Traces may contain sensitive information (file paths, process arguments)
|
|
- Traces are stored temporarily in `/tmp/nannyagent/ebpf/`
|
|
- Old traces are automatically cleaned up after 1 hour
|
|
- Consider the security implications of detailed system monitoring
|
|
|
|
## Performance Impact
|
|
|
|
- eBPF monitoring has minimal performance overhead
|
|
- Traces are time-limited (typically 10-30 seconds)
|
|
- Event collection is optimized for efficiency
|
|
- Heavy tracing may impact system performance on resource-constrained systems
|
|
|
|
## Contributing
|
|
|
|
To add new eBPF capabilities:
|
|
|
|
1. Extend the `EBPFCapability` enum in `ebpf_manager.go`
|
|
2. Add detection logic in `detectCapabilities()`
|
|
3. Implement trace command generation in `buildXXXTraceCommand()`
|
|
4. Update capability descriptions in `FormatSystemInfoWithEBPFForPrompt()`
|
|
|
|
The eBPF integration is designed to be extensible and can accommodate additional monitoring capabilities as needed.
|