Harshavardhan Musanalli 1f01c38881 Initial Commit
2025-09-27 17:35:24 +02:00

# Linux Diagnostic Agent
A Go-based AI agent that diagnoses Linux system issues through the NannyAPI gateway, using an OpenAI-compatible Go SDK.
## Features
- Interactive command-line interface for submitting system issues
- **Automatic system information gathering** - Includes OS, kernel, CPU, memory, network info
- Integrates with NannyAPI using an OpenAI-compatible Go SDK
- Executes read-only diagnostic commands, with validation and timeouts, and collects their output
- Provides step-by-step resolution plans
- **Comprehensive integration tests** with realistic Linux problem scenarios
## Setup
1. Clone this repository
2. Copy `.env.example` to `.env` and configure your NannyAPI endpoint:
```bash
cp .env.example .env
```
3. Install dependencies:
```bash
go mod tidy
```
4. Build and run:
```bash
make build
./nanny-agent
```
## Configuration
The agent can be configured using environment variables:
- `NANNYAPI_ENDPOINT`: The NannyAPI endpoint (default: `http://nannyapi.local:3000/openai/v1`)
- `NANNYAPI_MODEL`: The model identifier (default: `nannyapi::function_name::diagnose_and_heal`)
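As an illustration of how these two variables and their defaults might be resolved, here is a minimal sketch. The `Config` type and `loadConfig` function are hypothetical names chosen for this example and are not taken from the agent's source; only the variable names and default values come from the list above.

```go
package main

import (
	"fmt"
	"os"
)

// Defaults as documented in the Configuration section.
const (
	defaultEndpoint = "http://nannyapi.local:3000/openai/v1"
	defaultModel    = "nannyapi::function_name::diagnose_and_heal"
)

// Config is a hypothetical holder for the two documented settings.
type Config struct {
	Endpoint string
	Model    string
}

// loadConfig resolves each setting through lookup and falls back to the
// documented default when the variable is unset or empty. Passing the
// lookup function in keeps the logic easy to exercise without touching
// the real environment.
func loadConfig(lookup func(string) (string, bool)) Config {
	cfg := Config{Endpoint: defaultEndpoint, Model: defaultModel}
	if v, ok := lookup("NANNYAPI_ENDPOINT"); ok && v != "" {
		cfg.Endpoint = v
	}
	if v, ok := lookup("NANNYAPI_MODEL"); ok && v != "" {
		cfg.Model = v
	}
	return cfg
}

func main() {
	cfg := loadConfig(os.LookupEnv)
	fmt.Printf("endpoint=%s\nmodel=%s\n", cfg.Endpoint, cfg.Model)
}
```

Treating an empty value the same as an unset one is a choice made for this sketch; the agent itself may behave differently.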
## Installation on Linux VM
### Direct Installation
1. **Install Go** (if not already installed):
```bash
# For Ubuntu/Debian
sudo apt update
sudo apt install golang-go
# For RHEL/CentOS/Fedora
sudo dnf install golang
# or
sudo yum install golang
```
2. **Clone and build the agent**:
```bash
git clone <your-repo-url>
cd nannyagent
go mod tidy
make build
```
3. **Install the binary system-wide** (optional):
```bash
sudo cp nanny-agent /usr/local/bin/
sudo chmod +x /usr/local/bin/nanny-agent
```
4. **Set environment variables**:
```bash
export NANNYAPI_ENDPOINT="http://your-nannyapi-endpoint:3000/openai/v1"
export NANNYAPI_MODEL="your-model-identifier"
```
## Usage
1. Start the agent:
```bash
./nanny-agent
```
2. Enter a system issue description when prompted:
```
> On /var filesystem I cannot create any file but df -h shows 30% free space available.
```
3. The agent will:
- Send the issue to the AI via NannyAPI using the OpenAI-compatible SDK
- Execute the diagnostic commands suggested by the AI
- Send the command outputs back to the AI
- Display the final diagnosis and resolution plan
4. Type `quit` or `exit` to stop the agent
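The prompt loop described in these steps could look roughly like the sketch below. It is not the agent's actual code: `isQuit` and `promptLoop` are hypothetical names, and the `handle` callback merely stands in for the diagnosis step. The example feeds a fixed script instead of standard input so that it terminates on its own.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"os"
	"strings"
)

// isQuit reports whether the input line asks the agent to stop.
func isQuit(line string) bool {
	s := strings.TrimSpace(line)
	return s == "quit" || s == "exit"
}

// promptLoop prints a prompt, reads one issue per line from in, and
// passes it to handle until the user types quit/exit or input ends.
func promptLoop(in io.Reader, out io.Writer, handle func(issue string)) {
	scanner := bufio.NewScanner(in)
	for {
		fmt.Fprint(out, "> ")
		if !scanner.Scan() {
			return // end of input or read error
		}
		line := strings.TrimSpace(scanner.Text())
		if line == "" {
			continue // ignore blank lines
		}
		if isQuit(line) {
			return
		}
		handle(line)
	}
}

func main() {
	// A real agent would pass os.Stdin here; a fixed script keeps the
	// example self-terminating.
	script := "Cannot create files in /var but df shows space available\nquit\n"
	promptLoop(strings.NewReader(script), os.Stdout, func(issue string) {
		fmt.Println("Diagnosing issue:", issue)
	})
}
```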
## How It Works
1. **System Information Gathering**: Agent automatically collects system details (OS, kernel, CPU, memory, network, etc.)
2. **Initial Issue**: User describes a Linux system problem
3. **Enhanced Prompt**: AI receives both the issue description and comprehensive system information
4. **Diagnostic Phase**: AI responds with diagnostic commands to run
5. **Command Execution**: Agent safely executes read-only commands
6. **Iterative Analysis**: AI analyzes command outputs and may request more commands
7. **Resolution Phase**: AI provides root cause analysis and step-by-step resolution plan
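The seven steps above amount to a bounded request/execute loop. The following sketch shows one way to structure it, with the AI call and the command runner passed in as functions so the control flow is visible on its own. All type and function names here are hypothetical; `"diagnostic"` is the response type shown in the example session, while `"resolution"` and the round limit are assumptions made for this illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// Command is one diagnostic command requested by the AI.
type Command struct {
	ID      string
	Command string
}

// Reply is a simplified AI reply. "diagnostic" appears in the example
// session; "resolution" is an assumed name for the final reply.
type Reply struct {
	ResponseType string
	Commands     []Command
	RootCause    string
}

// askFunc sends the conversation so far to the AI (hypothetical).
type askFunc func(history []string) (Reply, error)

// runFunc executes one command and returns its output (hypothetical).
type runFunc func(command string) string

// diagnose loops until the AI stops asking for commands or maxRounds
// is reached. Each command output is appended to the history so the
// next request carries it back to the AI.
func diagnose(issue, sysInfo string, ask askFunc, run runFunc, maxRounds int) (Reply, error) {
	history := []string{"System information:\n" + sysInfo, "Issue: " + issue}
	for round := 0; round < maxRounds; round++ {
		reply, err := ask(history)
		if err != nil {
			return Reply{}, err
		}
		if reply.ResponseType != "diagnostic" {
			return reply, nil // resolution phase reached
		}
		for _, c := range reply.Commands {
			entry := fmt.Sprintf("[%s] $ %s\n%s", c.ID, c.Command, run(c.Command))
			history = append(history, entry)
		}
	}
	return Reply{}, errors.New("no resolution within the round limit")
}

func main() {
	// Fake AI: requests one command, then resolves once it has seen output.
	ask := func(history []string) (Reply, error) {
		if len(history) == 2 {
			return Reply{
				ResponseType: "diagnostic",
				Commands:     []Command{{ID: "check_inodes", Command: "df -i /var"}},
			}, nil
		}
		return Reply{ResponseType: "resolution", RootCause: "inode exhaustion on /var"}, nil
	}
	run := func(command string) string { return "(output of " + command + ")" }

	final, err := diagnose("Cannot create files in /var", "OS, kernel, CPU, memory", ask, run, 5)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("Root Cause:", final.RootCause)
}
```

Capping the number of rounds is a design choice of this sketch: it keeps a misbehaving model from requesting commands indefinitely.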
## Testing & Integration Tests
The agent includes comprehensive integration tests that simulate realistic Linux problems:
### Available Test Scenarios:
1. **Disk Space Issues** - Inode exhaustion scenarios
2. **Memory Problems** - OOM killer and memory pressure
3. **Network Issues** - DNS resolution problems
4. **Performance Issues** - High load averages and I/O bottlenecks
5. **Web Server Problems** - Permission and configuration issues
6. **Hardware/Boot Issues** - Kernel module and device problems
7. **Database Performance** - Slow queries and I/O contention
8. **Service Failures** - Startup and configuration problems
### Run Integration Tests:
```bash
# Interactive test scenarios
./test-examples.sh
# Automated integration tests
./integration-tests.sh
# Function discovery (find valid NannyAPI functions)
./discover-functions.sh
```
## Safety
- Only read-only commands are executed automatically
- Commands that can modify the system (such as `rm`, `mv`, `dd`, or shell output redirection) are rejected by a validation step before execution
- The resolution plan is provided for manual execution by the operator
- All commands have execution timeouts to prevent hanging
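To make the validation and timeout rules concrete, here is a minimal sketch of how they could be combined. It is an assumption-laden illustration rather than the agent's real validator: `validateCommand` and `runReadOnly` are hypothetical names, the blocklist contains only the commands named above, and the 10-second timeout is arbitrary.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"os/exec"
	"strings"
	"time"
)

// blocked lists the mutating commands named in the Safety section.
var blocked = map[string]bool{"rm": true, "mv": true, "dd": true}

// validateCommand rejects commands that contain output redirection or
// any blocked command name. The check is deliberately conservative: it
// inspects every token, so "grep rm file" is rejected as well.
func validateCommand(command string) error {
	if strings.TrimSpace(command) == "" {
		return errors.New("empty command")
	}
	if strings.Contains(command, ">") {
		return errors.New("output redirection is not allowed")
	}
	for _, tok := range strings.Fields(command) {
		// Strip any leading path so /bin/rm is treated like rm.
		name := tok[strings.LastIndex(tok, "/")+1:]
		if blocked[name] {
			return fmt.Errorf("command %q is not allowed", name)
		}
	}
	return nil
}

// runReadOnly validates the command, then runs it through sh with a
// deadline so a hanging command cannot stall the agent.
func runReadOnly(command string, timeout time.Duration) (string, error) {
	if err := validateCommand(command); err != nil {
		return "", err
	}
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	out, err := exec.CommandContext(ctx, "sh", "-c", command).CombinedOutput()
	if errors.Is(ctx.Err(), context.DeadlineExceeded) {
		return string(out), fmt.Errorf("command timed out after %s", timeout)
	}
	return string(out), err
}

func main() {
	for _, c := range []string{"df -i /var", "rm -rf /var/tmp/cache", "echo x > /var/f"} {
		out, err := runReadOnly(c, 10*time.Second)
		fmt.Printf("%q -> err=%v\n%s", c, err, out)
	}
}
```

A blocklist like this is easy to bypass (for example through `find -delete` or an interpreter one-liner), so an allowlist of known read-only commands would be the stricter design.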
## API Integration
The agent uses the `github.com/sashabaranov/go-openai` SDK to communicate with NannyAPI's OpenAI-compatible API endpoint. This provides:
- A configurable HTTP client (for example, to set request timeouts)
- Structured request/response handling
- Automatic JSON marshaling/unmarshaling
- Error handling and validation
## Example Session
```
Linux Diagnostic Agent Started
Enter a system issue description (or 'quit' to exit):
> Cannot create files in /var but df shows space available
Diagnosing issue: Cannot create files in /var but df shows space available
Gathering system information...
AI Response:
{
"response_type": "diagnostic",
"reasoning": "The 'No space left on device' error despite available disk space suggests inode exhaustion...",
"commands": [
{"id": "check_inodes", "command": "df -i /var", "description": "Check inode usage..."}
]
}
Executing command 'check_inodes': df -i /var
Output:
Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/sda1 1000000 999999 1 100% /var
=== DIAGNOSIS COMPLETE ===
Root Cause: The /var filesystem has exhausted all available inodes
Resolution Plan: 1. Find and remove unnecessary files...
Confidence: High
```
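The diagnostic reply in the session above is plain JSON, so it can be decoded with the standard library. The sketch below assumes only the fields visible in that example (`response_type`, `reasoning`, `commands` with `id`, `command`, `description`); the Go type names and `parseDiagnostic` are hypothetical, and the real response may carry additional fields.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// DiagnosticCommand mirrors one entry of the "commands" array.
type DiagnosticCommand struct {
	ID          string `json:"id"`
	Command     string `json:"command"`
	Description string `json:"description"`
}

// DiagnosticResponse mirrors the fields shown in the example session.
type DiagnosticResponse struct {
	ResponseType string              `json:"response_type"`
	Reasoning    string              `json:"reasoning"`
	Commands     []DiagnosticCommand `json:"commands"`
}

// parseDiagnostic decodes the raw AI reply and rejects replies that
// lack a response_type, since the agent cannot act on them.
func parseDiagnostic(raw string) (DiagnosticResponse, error) {
	var resp DiagnosticResponse
	if err := json.Unmarshal([]byte(raw), &resp); err != nil {
		return DiagnosticResponse{}, fmt.Errorf("invalid AI response: %w", err)
	}
	if resp.ResponseType == "" {
		return DiagnosticResponse{}, fmt.Errorf("missing response_type")
	}
	return resp, nil
}

func main() {
	raw := `{
	  "response_type": "diagnostic",
	  "reasoning": "Possible inode exhaustion",
	  "commands": [
	    {"id": "check_inodes", "command": "df -i /var", "description": "Check inode usage"}
	  ]
	}`
	resp, err := parseDiagnostic(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, c := range resp.Commands {
		fmt.Printf("Executing command '%s': %s\n", c.ID, c.Command)
	}
}
```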
Note: The AI receives comprehensive system information including:
- Hostname, OS version, kernel version
- CPU cores, memory, system uptime
- Network interfaces and private IPs
- Current load average and disk usage
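A few of these items can be collected with the Go standard library and the `/proc` filesystem alone. The sketch below is a hedged illustration and not the agent's collector: `SystemInfo`, `parseLoadAvg`, and `gatherSystemInfo` are hypothetical names, and only hostname, kernel release, logical CPU count, and the 1-minute load average are covered.

```go
package main

import (
	"fmt"
	"os"
	"runtime"
	"strconv"
	"strings"
)

// SystemInfo holds a small subset of the details listed above.
type SystemInfo struct {
	Hostname string
	Kernel   string
	CPUCores int
	Load1    float64 // 1-minute load average
}

// parseLoadAvg extracts the 1-minute load average from the content of
// /proc/loadavg, whose first field is that value.
func parseLoadAvg(content string) (float64, error) {
	fields := strings.Fields(content)
	if len(fields) == 0 {
		return 0, fmt.Errorf("empty loadavg content")
	}
	return strconv.ParseFloat(fields[0], 64)
}

// gatherSystemInfo fills in whatever it can read and leaves the rest
// at zero values, so a missing /proc entry does not abort diagnosis.
func gatherSystemInfo() SystemInfo {
	info := SystemInfo{CPUCores: runtime.NumCPU()}
	if h, err := os.Hostname(); err == nil {
		info.Hostname = h
	}
	if b, err := os.ReadFile("/proc/sys/kernel/osrelease"); err == nil {
		info.Kernel = strings.TrimSpace(string(b))
	}
	if b, err := os.ReadFile("/proc/loadavg"); err == nil {
		if load, err := parseLoadAvg(string(b)); err == nil {
			info.Load1 = load
		}
	}
	return info
}

func main() {
	info := gatherSystemInfo()
	fmt.Printf("host=%s kernel=%s cpus=%d load1=%.2f\n",
		info.Hostname, info.Kernel, info.CPUCores, info.Load1)
}
```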
## Available Make Commands
- `make build` - Build the application
- `make run` - Build and run the application
- `make clean` - Clean build artifacts
- `make test` - Run unit tests
- `make install` - Install dependencies
- `make build-prod` - Build for production
- `make install-system` - Install system-wide (requires sudo)
- `make fmt` - Format code
- `make help` - Show available commands
## Testing and Helper Scripts
- `./test-examples.sh` - Show interactive test scenarios
- `./integration-tests.sh` - Run automated integration tests
- `./discover-functions.sh` - Find available NannyAPI functions
- `./install.sh` - Installation script for Linux VMs