Linux Diagnostic Agent
A Go-based AI agent that diagnoses Linux system issues through the NannyAPI gateway, using an OpenAI-compatible SDK.
Features
- Interactive command-line interface for submitting system issues
- Automatic system information gathering (OS, kernel, CPU, memory, and network details)
- Integrates with NannyAPI using OpenAI-compatible Go SDK
- Executes diagnostic commands safely and collects output
- Provides step-by-step resolution plans
- Comprehensive integration tests with realistic Linux problem scenarios
Setup
- Clone this repository
- Copy .env.example to .env and configure your NannyAPI endpoint:
    cp .env.example .env
- Install dependencies:
    go mod tidy
- Build and run:
    make build
    ./nanny-agent
Configuration
The agent can be configured using environment variables:
- NANNYAPI_ENDPOINT: The NannyAPI endpoint (default: http://nannyapi.local:3000/openai/v1)
- NANNYAPI_MODEL: The model identifier (default: nannyapi::function_name::diagnose_and_heal)
Installation on Linux VM
Direct Installation
- Install Go (if not already installed):
    # For Ubuntu/Debian
    sudo apt update
    sudo apt install golang-go
    # For RHEL/CentOS/Fedora
    sudo dnf install golang
    # or
    sudo yum install golang
- Clone and build the agent:
    git clone <your-repo-url>
    cd nannyagentv2
    go mod tidy
    make build
- Install the binary system-wide (optional):
    sudo cp nanny-agent /usr/local/bin/
    sudo chmod +x /usr/local/bin/nanny-agent
- Set environment variables:
    export NANNYAPI_ENDPOINT="http://your-nannyapi-endpoint:3000/openai/v1"
    export NANNYAPI_MODEL="your-model-identifier"
Usage
- Start the agent:
    ./nanny-agent
- Enter a system issue description when prompted:
    > On /var filesystem I cannot create any file but df -h shows 30% free space available.
- The agent will:
  - Send the issue to the AI via NannyAPI using OpenAI SDK
  - Execute diagnostic commands as suggested by the AI
  - Provide command outputs back to the AI
  - Display the final diagnosis and resolution plan
- Type quit or exit to stop the agent
How It Works
- System Information Gathering: Agent automatically collects system details (OS, kernel, CPU, memory, network, etc.)
- Initial Issue: User describes a Linux system problem
- Enhanced Prompt: AI receives both the issue description and comprehensive system information
- Diagnostic Phase: AI responds with diagnostic commands to run
- Command Execution: Agent safely executes read-only commands
- Iterative Analysis: AI analyzes command outputs and may request more commands
- Resolution Phase: AI provides root cause analysis and step-by-step resolution plan
Testing & Integration Tests
The agent includes comprehensive integration tests that simulate realistic Linux problems:
Available Test Scenarios:
- Disk Space Issues - Inode exhaustion scenarios
- Memory Problems - OOM killer and memory pressure
- Network Issues - DNS resolution problems
- Performance Issues - High load averages and I/O bottlenecks
- Web Server Problems - Permission and configuration issues
- Hardware/Boot Issues - Kernel module and device problems
- Database Performance - Slow queries and I/O contention
- Service Failures - Startup and configuration problems
Run Integration Tests:
# Interactive test scenarios
./test-examples.sh
# Automated integration tests
./integration-tests.sh
# Function discovery (find valid NannyAPI functions)
./discover-functions.sh
Safety
- Only read-only commands are executed automatically
- Commands that modify the system (rm, mv, dd, redirection) are blocked by validation
- The resolution plan is provided for manual execution by the operator
- All commands have execution timeouts to prevent hanging
API Integration
The agent uses the github.com/sashabaranov/go-openai SDK to communicate with NannyAPI's OpenAI-compatible API endpoint. This provides:
- Robust HTTP client with retries and timeouts
- Structured request/response handling
- Automatic JSON marshaling/unmarshaling
- Error handling and validation
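For readers unfamiliar with the wire format, the sketch below uses only the Go standard library to build the kind of chat completion request an OpenAI-compatible endpoint expects and to extract the first reply from a response body. It is not the agent's code, which relies on the go-openai SDK; the type and function names are hypothetical, and any authentication header NannyAPI may require is left out because this README does not specify one.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
	"strings"
)

// chatMessage mirrors one entry of the "messages" array in the chat completions format.
type chatMessage struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

// chatRequest is the minimal request body: a model name and the conversation so far.
type chatRequest struct {
	Model    string        `json:"model"`
	Messages []chatMessage `json:"messages"`
}

// chatResponse keeps only the fields this sketch needs from the response.
type chatResponse struct {
	Choices []struct {
		Message chatMessage `json:"message"`
	} `json:"choices"`
}

// newChatRequest builds a POST to <endpoint>/chat/completions with a single user message.
func newChatRequest(endpoint, model, prompt string) (*http.Request, error) {
	body, err := json.Marshal(chatRequest{
		Model:    model,
		Messages: []chatMessage{{Role: "user", Content: prompt}},
	})
	if err != nil {
		return nil, err
	}
	url := strings.TrimRight(endpoint, "/") + "/chat/completions"
	req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

// firstChoice decodes a response body and returns the content of the first choice.
func firstChoice(raw []byte) (string, error) {
	var resp chatResponse
	if err := json.Unmarshal(raw, &resp); err != nil {
		return "", err
	}
	if len(resp.Choices) == 0 {
		return "", errors.New("response contains no choices")
	}
	return resp.Choices[0].Message.Content, nil
}

func main() {
	// Builds the request only; nothing is sent over the network here.
	req, err := newChatRequest(
		"http://nannyapi.local:3000/openai/v1",
		"nannyapi::function_name::diagnose_and_heal",
		"Cannot create files in /var but df shows space available",
	)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(req.Method, req.URL)
}
```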
Example Session
Linux Diagnostic Agent Started
Enter a system issue description (or 'quit' to exit):
> Cannot create files in /var but df shows space available
Diagnosing issue: Cannot create files in /var but df shows space available
Gathering system information...
AI Response:
{
"response_type": "diagnostic",
"reasoning": "The 'No space left on device' error despite available disk space suggests inode exhaustion...",
"commands": [
{"id": "check_inodes", "command": "df -i /var", "description": "Check inode usage..."}
]
}
Executing command 'check_inodes': df -i /var
Output:
Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/sda1 1000000 999999 1 100% /var
=== DIAGNOSIS COMPLETE ===
Root Cause: The /var filesystem has exhausted all available inodes
Resolution Plan: 1. Find and remove unnecessary files...
Confidence: High
Note: The AI receives comprehensive system information including:
- Hostname, OS version, kernel version
- CPU cores, memory, system uptime
- Network interfaces and private IPs
- Current load average and disk usage
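The diagnostic response shown in the example session can be decoded into Go structs along these lines. The JSON field names come from that example; the struct and function names are hypothetical, and the real response may carry additional fields not shown in this README.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// DiagnosticCommand is one command the AI asks the agent to run.
type DiagnosticCommand struct {
	ID          string `json:"id"`
	Command     string `json:"command"`
	Description string `json:"description"`
}

// DiagnosticResponse models the "diagnostic" response from the example session.
type DiagnosticResponse struct {
	ResponseType string              `json:"response_type"`
	Reasoning    string              `json:"reasoning"`
	Commands     []DiagnosticCommand `json:"commands"`
}

// parseDiagnosticResponse decodes the AI's JSON reply into a DiagnosticResponse.
func parseDiagnosticResponse(raw string) (DiagnosticResponse, error) {
	var resp DiagnosticResponse
	err := json.Unmarshal([]byte(raw), &resp)
	return resp, err
}

func main() {
	// Sample payload abbreviated from the example session above.
	raw := `{
	  "response_type": "diagnostic",
	  "reasoning": "Possible inode exhaustion",
	  "commands": [
	    {"id": "check_inodes", "command": "df -i /var", "description": "Check inode usage"}
	  ]
	}`
	resp, err := parseDiagnosticResponse(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, c := range resp.Commands {
		fmt.Printf("Executing command '%s': %s\n", c.ID, c.Command)
	}
}
```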
Available Make Commands
- make build - Build the application
- make run - Build and run the application
- make clean - Clean build artifacts
- make test - Run unit tests
- make install - Install dependencies
- make build-prod - Build for production
- make install-system - Install system-wide (requires sudo)
- make fmt - Format code
- make help - Show available commands
Testing Commands
- ./test-examples.sh - Show interactive test scenarios
- ./integration-tests.sh - Run automated integration tests
- ./discover-functions.sh - Find available NannyAPI functions
- ./install.sh - Installation script for Linux VMs