Initial Commit

2025-09-27 17:35:24 +02:00
parent 83fa088ed2
commit 1f01c38881
14 changed files with 1277 additions and 2 deletions
--- a/0
+++ b/0
--- a/53
+++ b/53
@@ -0,0 +1,53 @@
 .PHONY: build run clean test install build-prod install-system fmt lint help
 # Build the application
 build:
 	go build -o nanny-agent .
 # Run the application
 run: build
 	./nanny-agent
 # Clean build artifacts
 clean:
 	rm -f nanny-agent
 # Run tests
 test:
 	go test ./...
 # Install dependencies
 install:
 	go mod tidy
 	go mod download
 # Build for production with optimizations
 build-prod:
 	CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -ldflags '-w -s' -o nanny-agent .
 # Install system-wide (requires sudo)
 install-system: build-prod
 	sudo cp nanny-agent /usr/local/bin/
 	sudo chmod +x /usr/local/bin/nanny-agent
 # Format code
 fmt:
 	go fmt ./...
 # Run linter (if golangci-lint is installed)
 lint:
 	golangci-lint run
 # Show help
 help:
 	@echo "Available commands:"
 	@echo "  build          - Build the application"
 	@echo "  run            - Build and run the application"
 	@echo "  clean          - Clean build artifacts"
 	@echo "  test           - Run tests"
 	@echo "  install        - Install dependencies"
 	@echo "  build-prod     - Build for production"
 	@echo "  install-system - Install system-wide (requires sudo)"
 	@echo "  fmt            - Format code"
 	@echo "  lint           - Run linter"
 	@echo "  help           - Show this help"
--- a/README.md
+++ b/README.md
@@ -1,3 +1,199 @@
-# nannyagent
+# Linux Diagnostic Agent
-nannyagent is a Linux AI diagnostic agent built on OpenAPI specifications relying on Tensorzero gateway
+A Go-based AI agent that diagnoses Linux system issues using the NannyAPI gateway with OpenAI-compatible SDK.
 ## Features
 - Interactive command-line interface for submitting system issues
 - **Automatic system information gathering** - Includes OS, kernel, CPU, memory, network info
 - Integrates with NannyAPI using OpenAI-compatible Go SDK
 - Executes diagnostic commands safely and collects output
 - Provides step-by-step resolution plans
 - **Comprehensive integration tests** with realistic Linux problem scenarios
 ## Setup
 1. Clone this repository
 2. Copy `.env.example` to `.env` and configure your NannyAPI endpoint:
   ```bash
   cp .env.example .env
   ```
 3. Install dependencies:
   ```bash
   go mod tidy
   ```
 4. Build and run:
   ```bash
   make build
   ./nanny-agent
   ```
 ## Configuration
 The agent can be configured using environment variables:
 - `NANNYAPI_ENDPOINT`: The NannyAPI endpoint (default: `http://nannyapi.local:3000/openai/v1`)
 - `NANNYAPI_MODEL`: The model identifier (default: `nannyapi::function_name::diagnose_and_heal`)
 ## Installation on Linux VM
 ### Direct Installation
 1. **Install Go** (if not already installed):
   ```bash
   # For Ubuntu/Debian
   sudo apt update
   sudo apt install golang-go
   # For RHEL/CentOS/Fedora
   sudo dnf install golang
   # or
   sudo yum install golang
   ```
 2. **Clone and build the agent**:
   ```bash
   git clone <your-repo-url>
   cd nannyagentv2
   go mod tidy
   make build
   ```
 3. **Install system-wide** (optional):
   ```bash
   sudo cp nanny-agent /usr/local/bin/
   sudo chmod +x /usr/local/bin/nanny-agent
   ```
 4. **Set environment variables**:
   ```bash
   export NANNYAPI_ENDPOINT="http://your-nannyapi-endpoint:3000/openai/v1"
   export NANNYAPI_MODEL="your-model-identifier"
   ```
 ## Usage
 1. Start the agent:
   ```bash
   ./nanny-agent
   ```
 2. Enter a system issue description when prompted:
   ```
   > On /var filesystem I cannot create any file but df -h shows 30% free space available.
   ```
 3. The agent will:
   - Send the issue to the AI via NannyAPI using OpenAI SDK
   - Execute diagnostic commands as suggested by the AI
   - Provide command outputs back to the AI
   - Display the final diagnosis and resolution plan
 4. Type `quit` or `exit` to stop the agent
 ## How It Works
 1. **System Information Gathering**: Agent automatically collects system details (OS, kernel, CPU, memory, network, etc.)
 2. **Initial Issue**: User describes a Linux system problem
 3. **Enhanced Prompt**: AI receives both the issue description and comprehensive system information
 4. **Diagnostic Phase**: AI responds with diagnostic commands to run
 5. **Command Execution**: Agent safely executes read-only commands
 6. **Iterative Analysis**: AI analyzes command outputs and may request more commands
 7. **Resolution Phase**: AI provides root cause analysis and step-by-step resolution plan
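The switch between the diagnostic and resolution phases hinges on the `response_type` field of the AI's JSON reply. The sketch below shows one way to read that field once and dispatch on it; the `envelope` type and `classify` function are hypothetical names for illustration, whereas the agent currently tries to unmarshal each response shape in turn.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// envelope reads only the discriminator field shared by both response shapes.
type envelope struct {
	ResponseType string `json:"response_type"`
}

// classify returns "diagnostic" or "resolution", or an error when the content
// is not valid JSON or carries an unexpected response_type.
func classify(content string) (string, error) {
	var env envelope
	if err := json.Unmarshal([]byte(content), &env); err != nil {
		return "", fmt.Errorf("response is not valid JSON: %w", err)
	}
	switch env.ResponseType {
	case "diagnostic", "resolution":
		return env.ResponseType, nil
	default:
		return "", fmt.Errorf("unexpected response_type %q", env.ResponseType)
	}
}

func main() {
	replies := []string{
		`{"response_type":"diagnostic","commands":[{"id":"check_inodes","command":"df -i /var"}]}`,
		`{"response_type":"resolution","root_cause":"inode exhaustion"}`,
		`plain text instead of JSON`,
	}
	for _, reply := range replies {
		kind, err := classify(reply)
		fmt.Printf("kind=%q err=%v\n", kind, err)
	}
}
```

Decoding the discriminator first makes the "unexpected format" case explicit instead of relying on two failed parses.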
 ## Testing & Integration Tests
 The agent includes comprehensive integration tests that simulate realistic Linux problems:
 ### Available Test Scenarios:
 1. **Disk Space Issues** - Inode exhaustion scenarios
 2. **Memory Problems** - OOM killer and memory pressure
 3. **Network Issues** - DNS resolution problems
 4. **Performance Issues** - High load averages and I/O bottlenecks
 5. **Web Server Problems** - Permission and configuration issues
 6. **Hardware/Boot Issues** - Kernel module and device problems
 7. **Database Performance** - Slow queries and I/O contention
 8. **Service Failures** - Startup and configuration problems
 ### Run Integration Tests:
 ```bash
 # Interactive test scenarios
 ./test-examples.sh
 # Automated integration tests
 ./integration-tests.sh
 # Function discovery (find valid NannyAPI functions)
 ./discover-functions.sh
 ```
 ## Safety
 - Only read-only commands are executed automatically
 - Commands that modify the system (rm, mv, dd, redirection) are blocked by validation
 - The resolution plan is provided for manual execution by the operator
 - All commands have execution timeouts to prevent hanging
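To illustrate the redirection rule, here is a hedged sketch of a check that first strips redirections to `/dev/null` and then rejects any remaining `>`. The `checkRedirection` function and the `devNullRedirect` pattern are assumptions made for this example, not the agent's actual validator, and the check is deliberately conservative: it also rejects harmless forms such as `2>&1` or a quoted `>` inside an `awk` expression.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// devNullRedirect matches redirections whose target is exactly /dev/null,
// for example ">/dev/null", "2> /dev/null", ">>/dev/null" or "&>/dev/null".
// The target must be followed by whitespace, end of input, or one of | ; ).
var devNullRedirect = regexp.MustCompile(`(?:&|\d*)>{1,2}\s*/dev/null(?:\s|$|[|;)])`)

// checkRedirection rejects a command if any output redirection remains after
// the allowed /dev/null redirections have been removed.
func checkRedirection(command string) error {
	stripped := devNullRedirect.ReplaceAllString(command, " ")
	if strings.Contains(stripped, ">") {
		return fmt.Errorf("file redirection not allowed except to /dev/null")
	}
	return nil
}

func main() {
	commands := []string{
		"df -i /var 2>/dev/null",
		"ps aux | grep nginx",
		"echo test > /etc/passwd",
		"echo hi 2>/dev/null > /etc/passwd",
	}
	for _, c := range commands {
		fmt.Printf("%-40s -> %v\n", c, checkRedirection(c))
	}
}
```

Stripping the allowed forms before checking means a `/dev/null` redirection cannot be used to smuggle a second redirection to a real file past the check.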
 ## API Integration
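 The agent builds on the `github.com/sashabaranov/go-openai` SDK to communicate with NannyAPI's OpenAI-compatible API endpoint. Requests are posted to `/chat/completions` with Go's `net/http` client, using the SDK's message and response types, so that the TensorZero `tensorzero::episode_id` field can be attached. This provides:
 - HTTP requests with a 30-second timeout
 - Structured request/response handling
 - Automatic JSON marshaling/unmarshaling
 - Error handling and validation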
 ## Example Session
 ```
 Linux Diagnostic Agent Started
 Enter a system issue description (or 'quit' to exit):
 > Cannot create files in /var but df shows space available
 Diagnosing issue: Cannot create files in /var but df shows space available
 Gathering system information...
 AI Response:
 {
  "response_type": "diagnostic",
  "reasoning": "The 'No space left on device' error despite available disk space suggests inode exhaustion...",
  "commands": [
    {"id": "check_inodes", "command": "df -i /var", "description": "Check inode usage..."}
  ]
 }
 Executing command 'check_inodes': df -i /var
 Output:
 Filesystem      Inodes   IUsed   IFree IUse% Mounted on
 /dev/sda1      1000000  999999       1  100% /var
 === DIAGNOSIS COMPLETE ===
 Root Cause: The /var filesystem has exhausted all available inodes
 Resolution Plan: 1. Find and remove unnecessary files...
 Confidence: High
 ```
 Note: The AI receives comprehensive system information including:
 - Hostname, OS version, kernel version
 - CPU cores, memory, system uptime
 - Network interfaces and private IPs
 - Current load average and disk usage
 ## Available Make Commands
 - `make build` - Build the application
 - `make run` - Build and run the application  
 - `make clean` - Clean build artifacts
 - `make test` - Run unit tests
 - `make install` - Install dependencies
 - `make build-prod` - Build for production
 - `make install-system` - Install system-wide (requires sudo)
 - `make fmt` - Format code
 - `make lint` - Run linter (if golangci-lint is installed)
 - `make help` - Show available commands
 ## Testing Commands
 - `./test-examples.sh` - Show interactive test scenarios
 - `./integration-tests.sh` - Run automated integration tests
 - `./discover-functions.sh` - Find available NannyAPI functions
 - `./install.sh` - Installation script for Linux VMs
--- a/agent.go
+++ b/agent.go
@@ -0,0 +1,270 @@
 package main
 import (
 	"bytes"
 	"context"
 	"encoding/json"
 	"fmt"
 	"io"
 	"net/http"
 	"os"
 	"time"
 	"github.com/sashabaranov/go-openai"
 )
 // DiagnosticResponse represents the diagnostic phase response from AI
 type DiagnosticResponse struct {
 	ResponseType string    `json:"response_type"`
 	Reasoning    string    `json:"reasoning"`
 	Commands     []Command `json:"commands"`
 }
 // ResolutionResponse represents the resolution phase response from AI
 type ResolutionResponse struct {
 	ResponseType   string `json:"response_type"`
 	RootCause      string `json:"root_cause"`
 	ResolutionPlan string `json:"resolution_plan"`
 	Confidence     string `json:"confidence"`
 }
 // Command represents a command to be executed
 type Command struct {
 	ID          string `json:"id"`
 	Command     string `json:"command"`
 	Description string `json:"description"`
 }
 // CommandResult represents the result of executing a command
 type CommandResult struct {
 	ID       string `json:"id"`
 	Command  string `json:"command"`
 	Output   string `json:"output"`
 	ExitCode int    `json:"exit_code"`
 	Error    string `json:"error,omitempty"`
 }
 // LinuxDiagnosticAgent represents the main agent
 type LinuxDiagnosticAgent struct {
 	client    *openai.Client
 	model     string
 	executor  *CommandExecutor
 	episodeID string // TensorZero episode ID for conversation continuity
 }
 // NewLinuxDiagnosticAgent creates a new diagnostic agent
 func NewLinuxDiagnosticAgent() *LinuxDiagnosticAgent {
 	endpoint := os.Getenv("NANNYAPI_ENDPOINT")
 	if endpoint == "" {
 		// Default endpoint - OpenAI SDK will append /chat/completions automatically
 		endpoint = "http://nannyapi.local:3000/openai/v1"
 	}
 	model := os.Getenv("NANNYAPI_MODEL")
 	if model == "" {
 		model = "nannyapi::function_name::diagnose_and_heal"
 		fmt.Printf("Warning: Using default model '%s'. Set NANNYAPI_MODEL environment variable for your specific function.\n", model)
 	}
 	// Create OpenAI client with custom base URL
 	// Note: The OpenAI SDK automatically appends "/chat/completions" to the base URL
 	config := openai.DefaultConfig("")
 	config.BaseURL = endpoint
 	client := openai.NewClientWithConfig(config)
 	return &LinuxDiagnosticAgent{
 		client:   client,
 		model:    model,
 		executor: NewCommandExecutor(10 * time.Second), // 10 second timeout for commands
 	}
 }
 // DiagnoseIssue starts the diagnostic process for a given issue
 func (a *LinuxDiagnosticAgent) DiagnoseIssue(issue string) error {
 	fmt.Printf("Diagnosing issue: %s\n", issue)
 	fmt.Println("Gathering system information...")
 	// Gather system information
 	systemInfo := GatherSystemInfo()
 	// Format the initial prompt with system information
 	initialPrompt := FormatSystemInfoForPrompt(systemInfo) + "\n" + issue
 	// Start conversation with initial issue including system info
 	messages := []openai.ChatCompletionMessage{
 		{
 			Role:    openai.ChatMessageRoleUser,
 			Content: initialPrompt,
 		},
 	}
 	for {
 		// Send request to TensorZero API via OpenAI SDK
 		response, err := a.sendRequest(messages)
 		if err != nil {
 			return fmt.Errorf("failed to send request: %w", err)
 		}
 		if len(response.Choices) == 0 {
 			return fmt.Errorf("no choices in response")
 		}
 		content := response.Choices[0].Message.Content
 		fmt.Printf("\nAI Response:\n%s\n", content)
 		// Parse the response to determine next action
 		var diagnosticResp DiagnosticResponse
 		var resolutionResp ResolutionResponse
 		// Try to parse as diagnostic response first
 		if err := json.Unmarshal([]byte(content), &diagnosticResp); err == nil && diagnosticResp.ResponseType == "diagnostic" {
 			// Handle diagnostic phase
 			fmt.Printf("\nReasoning: %s\n", diagnosticResp.Reasoning)
 			if len(diagnosticResp.Commands) == 0 {
 				fmt.Println("No commands to execute in diagnostic phase")
 				break
 			}
 			// Execute commands and collect results
 			commandResults := make([]CommandResult, 0, len(diagnosticResp.Commands))
 			for _, cmd := range diagnosticResp.Commands {
 				fmt.Printf("\nExecuting command '%s': %s\n", cmd.ID, cmd.Command)
 				result := a.executor.Execute(cmd)
 				commandResults = append(commandResults, result)
 				fmt.Printf("Output:\n%s\n", result.Output)
 				if result.Error != "" {
 					fmt.Printf("Error: %s\n", result.Error)
 				}
 			}
 			// Prepare command results as user message
 			resultsJSON, err := json.MarshalIndent(commandResults, "", "  ")
 			if err != nil {
 				return fmt.Errorf("failed to marshal command results: %w", err)
 			}
 			// Add AI response and command results to conversation
 			messages = append(messages, openai.ChatCompletionMessage{
 				Role:    openai.ChatMessageRoleAssistant,
 				Content: content,
 			})
 			messages = append(messages, openai.ChatCompletionMessage{
 				Role:    openai.ChatMessageRoleUser,
 				Content: string(resultsJSON),
 			})
 			continue
 		}
 		// Try to parse as resolution response
 		if err := json.Unmarshal([]byte(content), &resolutionResp); err == nil && resolutionResp.ResponseType == "resolution" {
 			// Handle resolution phase
 			fmt.Printf("\n=== DIAGNOSIS COMPLETE ===\n")
 			fmt.Printf("Root Cause: %s\n", resolutionResp.RootCause)
 			fmt.Printf("Resolution Plan: %s\n", resolutionResp.ResolutionPlan)
 			fmt.Printf("Confidence: %s\n", resolutionResp.Confidence)
 			break
 		}
 		// If we can't parse the response, treat it as an error or unexpected format
 		fmt.Printf("Unexpected response format or error from AI:\n%s\n", content)
 		break
 	}
 	return nil
 }
 // TensorZeroRequest represents a request structure compatible with TensorZero's episode_id
 type TensorZeroRequest struct {
 	Model     string                         `json:"model"`
 	Messages  []openai.ChatCompletionMessage `json:"messages"`
 	EpisodeID string                         `json:"tensorzero::episode_id,omitempty"`
 }
 // TensorZeroResponse represents TensorZero's response with episode_id
 type TensorZeroResponse struct {
 	openai.ChatCompletionResponse
 	EpisodeID string `json:"episode_id"`
 }
 // sendRequest sends a request to the TensorZero API with tensorzero::episode_id support
 func (a *LinuxDiagnosticAgent) sendRequest(messages []openai.ChatCompletionMessage) (*openai.ChatCompletionResponse, error) {
 	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
 	defer cancel()
 	// Create TensorZero-compatible request
 	tzRequest := TensorZeroRequest{
 		Model:    a.model,
 		Messages: messages,
 	}
 	// Include tensorzero::episode_id for conversation continuity (if we have one)
 	if a.episodeID != "" {
 		tzRequest.EpisodeID = a.episodeID
 	}
 	fmt.Printf("Debug: Sending request to model: %s", a.model)
 	if a.episodeID != "" {
 		fmt.Printf(" (episode: %s)", a.episodeID)
 	}
 	fmt.Println()
 	// Marshal the request
 	requestBody, err := json.Marshal(tzRequest)
 	if err != nil {
 		return nil, fmt.Errorf("failed to marshal request: %w", err)
 	}
 	// Create HTTP request
 	endpoint := os.Getenv("NANNYAPI_ENDPOINT")
 	if endpoint == "" {
 		endpoint = "http://nannyapi.local:3000/openai/v1"
 	}
 	// Ensure the endpoint ends with /chat/completions
 	if endpoint[len(endpoint)-1] != '/' {
 		endpoint += "/"
 	}
 	endpoint += "chat/completions"
 	req, err := http.NewRequestWithContext(ctx, "POST", endpoint, bytes.NewBuffer(requestBody))
 	if err != nil {
 		return nil, fmt.Errorf("failed to create request: %w", err)
 	}
 	req.Header.Set("Content-Type", "application/json")
 	// Make the request
 	client := &http.Client{Timeout: 30 * time.Second}
 	resp, err := client.Do(req)
 	if err != nil {
 		return nil, fmt.Errorf("failed to send request: %w", err)
 	}
 	defer resp.Body.Close()
 	// Read response body
 	body, err := io.ReadAll(resp.Body)
 	if err != nil {
 		return nil, fmt.Errorf("failed to read response: %w", err)
 	}
 	if resp.StatusCode != http.StatusOK {
 		return nil, fmt.Errorf("API request failed with status %d: %s", resp.StatusCode, string(body))
 	}
 	// Parse TensorZero response
 	var tzResponse TensorZeroResponse
 	if err := json.Unmarshal(body, &tzResponse); err != nil {
 		return nil, fmt.Errorf("failed to unmarshal response: %w", err)
 	}
 	// Extract episode_id from first response
 	if a.episodeID == "" && tzResponse.EpisodeID != "" {
 		a.episodeID = tzResponse.EpisodeID
 		fmt.Printf("Debug: Extracted episode ID: %s\n", a.episodeID)
 	}
 	return &tzResponse.ChatCompletionResponse, nil
 }
--- a/agent_test.go
+++ b/agent_test.go
@@ -0,0 +1,107 @@
 package main
 import (
 	"testing"
 	"time"
 )
 func TestCommandExecutor_ValidateCommand(t *testing.T) {
 	executor := NewCommandExecutor(5 * time.Second)
 	tests := []struct {
 		name    string
 		command string
 		wantErr bool
 	}{
 		{
 			name:    "safe command - ls",
 			command: "ls -la /var",
 			wantErr: false,
 		},
 		{
 			name:    "safe command - df",
 			command: "df -h",
 			wantErr: false,
 		},
 		{
 			name:    "safe command - ps",
 			command: "ps aux | grep nginx",
 			wantErr: false,
 		},
 		{
 			name:    "dangerous command - rm",
 			command: "rm -rf /tmp/*",
 			wantErr: true,
 		},
 		{
 			name:    "dangerous command - dd",
 			command: "dd if=/dev/zero of=/dev/sda",
 			wantErr: true,
 		},
 		{
 			name:    "dangerous command - sudo",
 			command: "sudo systemctl stop nginx",
 			wantErr: true,
 		},
 		{
 			name:    "dangerous command - redirection",
 			command: "echo 'test' > /etc/passwd",
 			wantErr: true,
 		},
 	}
 	for _, tt := range tests {
 		t.Run(tt.name, func(t *testing.T) {
 			err := executor.validateCommand(tt.command)
 			if (err != nil) != tt.wantErr {
 				t.Errorf("validateCommand() error = %v, wantErr %v", err, tt.wantErr)
 			}
 		})
 	}
 }
 func TestCommandExecutor_Execute(t *testing.T) {
 	executor := NewCommandExecutor(5 * time.Second)
 	// Test safe command execution
 	cmd := Command{
 		ID:          "test_echo",
 		Command:     "echo 'Hello, World!'",
 		Description: "Test echo command",
 	}
 	result := executor.Execute(cmd)
 	if result.ExitCode != 0 {
 		t.Errorf("Expected exit code 0, got %d", result.ExitCode)
 	}
 	if result.Output != "Hello, World!\n" {
 		t.Errorf("Expected 'Hello, World!\\n', got '%s'", result.Output)
 	}
 	if result.Error != "" {
 		t.Errorf("Expected no error, got '%s'", result.Error)
 	}
 }
 func TestCommandExecutor_ExecuteUnsafeCommand(t *testing.T) {
 	executor := NewCommandExecutor(5 * time.Second)
 	// Test unsafe command rejection
 	cmd := Command{
 		ID:          "test_rm",
 		Command:     "rm -rf /tmp/test",
 		Description: "Dangerous rm command",
 	}
 	result := executor.Execute(cmd)
 	if result.ExitCode != 1 {
 		t.Errorf("Expected exit code 1 for unsafe command, got %d", result.ExitCode)
 	}
 	if result.Error == "" {
 		t.Error("Expected error for unsafe command, got none")
 	}
 }
--- a/discover-functions.sh
+++ b/discover-functions.sh
@@ -0,0 +1,51 @@
 #!/bin/bash
 # NannyAPI Function Discovery Script
 # This script helps you find the correct function name for your NannyAPI setup
 echo "🔍 NannyAPI Function Discovery"
 echo "=============================="
 echo ""
 ENDPOINT="${NANNYAPI_ENDPOINT:-http://nannyapi.local:3000/openai/v1}"
 echo "Testing endpoint: $ENDPOINT/chat/completions"
 echo ""
 # Test common function name patterns
 test_functions=(
    "nannyapi::function_name::diagnose"
    "nannyapi::function_name::diagnose_and_heal"
    "nannyapi::function_name::linux_diagnostic"
    "nannyapi::function_name::system_diagnostic"
    "nannyapi::model_name::gpt-4"
    "nannyapi::model_name::claude"
 )
 for func in "${test_functions[@]}"; do
    echo "Testing function: $func"
    response=$(curl -s -X POST "$ENDPOINT/chat/completions" \
        -H "Content-Type: application/json" \
        -d "{\"model\":\"$func\",\"messages\":[{\"role\":\"user\",\"content\":\"test\"}]}")
    if echo "$response" | grep -q "Unknown function"; then
        echo "  ❌ Function not found"
    elif echo "$response" | grep -q "error"; then
        echo "  ⚠️  Error: $(echo "$response" | jq -r '.error' 2>/dev/null || echo "$response")"
    else
        echo "  ✅ Function exists and responding!"
        echo "     Use this in your environment: export NANNYAPI_MODEL=\"$func\""
    fi
    echo ""
 done
 echo "💡 If none of the above work, check your NannyAPI configuration file"
 echo "   for the correct function names and update NANNYAPI_MODEL accordingly."
 echo ""
 echo "Example NannyAPI config snippet:"
 echo '```yaml'
 echo "functions:"
 echo "  diagnose_and_heal:  # This becomes 'nannyapi::function_name::diagnose_and_heal'"
 echo "    # function definition"
 echo '```'
--- a/executor.go
+++ b/executor.go
@@ -0,0 +1,108 @@
 package main
 import (
 	"context"
 	"fmt"
 	"os/exec"
 	"strings"
 	"time"
 )
 // CommandExecutor handles safe execution of diagnostic commands
 type CommandExecutor struct {
 	timeout time.Duration
 }
 // NewCommandExecutor creates a new command executor with specified timeout
 func NewCommandExecutor(timeout time.Duration) *CommandExecutor {
 	return &CommandExecutor{
 		timeout: timeout,
 	}
 }
 // Execute executes a command safely with timeout and validation
 func (ce *CommandExecutor) Execute(cmd Command) CommandResult {
 	result := CommandResult{
 		ID:      cmd.ID,
 		Command: cmd.Command,
 	}
 	// Validate command safety
 	if err := ce.validateCommand(cmd.Command); err != nil {
 		result.Error = fmt.Sprintf("unsafe command: %s", err.Error())
 		result.ExitCode = 1
 		return result
 	}
 	// Create context with timeout
 	ctx, cancel := context.WithTimeout(context.Background(), ce.timeout)
 	defer cancel()
 	// Execute command using shell for proper handling of pipes, redirects, etc.
 	execCmd := exec.CommandContext(ctx, "/bin/bash", "-c", cmd.Command)
 	output, err := execCmd.CombinedOutput()
 	result.Output = string(output)
 	if err != nil {
 		result.Error = err.Error()
 		if exitError, ok := err.(*exec.ExitError); ok {
 			result.ExitCode = exitError.ExitCode()
 		} else {
 			result.ExitCode = 1
 		}
 	} else {
 		result.ExitCode = 0
 	}
 	return result
 }
 // validateCommand checks if a command is safe to execute
 func (ce *CommandExecutor) validateCommand(command string) error {
 	// Convert to lowercase for case-insensitive checking
 	cmd := strings.ToLower(strings.TrimSpace(command))
 	// List of dangerous commands/patterns.
 	// Patterns must be lowercase because the command is lowercased before matching.
 	dangerousPatterns := []string{
 		"rm ", "rm\t", "rm\n",
 		"mv ", "mv\t", "mv\n",
 		"dd ", "dd\t", "dd\n",
 		"mkfs", "fdisk", "parted",
 		"shutdown", "reboot", "halt", "poweroff",
 		"passwd", "userdel", "usermod",
 		"chmod", "chown", "chgrp",
 		"systemctl stop", "systemctl disable", "systemctl mask",
 		"service stop", "service disable",
 		"kill ", "killall", "pkill",
 		"crontab -r", "crontab -e",
 		"iptables -f", "iptables -d", "iptables -i", // flush, delete, insert (lowercased)
 		"umount ", "unmount ", // Allow mount but not umount
 		"wget ", "curl ", // Prevent network operations
 		"| dd", "| rm", "| mv", // Prevent piping to dangerous commands
 	}
 	// Check for dangerous patterns
 	for _, pattern := range dangerousPatterns {
 		if strings.Contains(cmd, pattern) {
 			return fmt.Errorf("command contains dangerous pattern: %s", pattern)
 		}
 	}
 	// Additional checks for commands that start with dangerous operations
 	if strings.HasPrefix(cmd, "rm ") || strings.HasPrefix(cmd, "rm\t") {
 		return fmt.Errorf("rm command not allowed")
 	}
 	// Check for sudo usage (we want to avoid automated sudo commands)
 	if strings.HasPrefix(cmd, "sudo ") {
 		return fmt.Errorf("sudo commands not allowed for automated execution")
 	}
 	// Check for dangerous redirections: strip the allowed /dev/null forms first,
 	// so that an allowed redirection cannot mask a write to a real file
 	stripped := strings.NewReplacer("2>/dev/null", "", ">/dev/null", "").Replace(cmd)
 	if strings.Contains(stripped, ">") {
 		return fmt.Errorf("file redirection not allowed except to /dev/null")
 	}
 	return nil
 }
--- a/go.mod
+++ b/go.mod
@@ -0,0 +1,5 @@
 module nannyagentv2
 go 1.23
 require github.com/sashabaranov/go-openai v1.32.0
--- a/go.sum
+++ b/go.sum
@@ -0,0 +1,2 @@
 github.com/sashabaranov/go-openai v1.32.0 h1:Yk3iE9moX3RBXxrof3OBtUBrE7qZR0zF9ebsoO4zVzI=
 github.com/sashabaranov/go-openai v1.32.0/go.mod h1:lj5b/K+zjTSFxVLijLSTDZuP7adOgerWeFyZLUhAKRg=
--- a/install.sh
+++ b/install.sh
@@ -0,0 +1,85 @@
 #!/bin/bash
 # Linux Diagnostic Agent Installation Script
 # This script installs the nanny-agent on a Linux system
 set -e
 echo "🔧 Linux Diagnostic Agent Installation Script"
 echo "=============================================="
 # Check if Go is installed
 if ! command -v go &> /dev/null; then
    echo "❌ Go is not installed. Please install Go first:"
    echo ""
    echo "For Ubuntu/Debian:"
    echo "  sudo apt update && sudo apt install golang-go"
    echo ""
    echo "For RHEL/CentOS/Fedora:"
    echo "  sudo dnf install golang"
    echo "  # or"
    echo "  sudo yum install golang"
    echo ""
    exit 1
 fi
 echo "✅ Go is installed: $(go version)"
 # Build the application
 echo "🔨 Building the application..."
 go mod tidy
 make build
 # Check if build was successful
 if [ ! -f "./nanny-agent" ]; then
    echo "❌ Build failed! nanny-agent binary not found."
    exit 1
 fi
 echo "✅ Build successful!"
 # Ask for installation preference
 echo ""
 echo "Installation options:"
 echo "1. Install system-wide (/usr/local/bin) - requires sudo"
 echo "2. Keep in current directory"
 echo ""
 read -p "Choose option (1 or 2): " choice
 case $choice in
    1)
        echo "📦 Installing system-wide..."
        sudo cp nanny-agent /usr/local/bin/
        sudo chmod +x /usr/local/bin/nanny-agent
        echo "✅ Agent installed to /usr/local/bin/nanny-agent"
        echo ""
        echo "You can now run the agent from anywhere with:"
        echo "  nanny-agent"
        ;;
    2)
        echo "✅ Agent ready in current directory"
        echo ""
        echo "Run the agent with:"
        echo "  ./nanny-agent"
        ;;
    *)
        echo "❌ Invalid choice. Agent is available in current directory."
        echo "Run with: ./nanny-agent"
        ;;
 esac
 # Configuration
 echo ""
 echo "📝 Configuration:"
 echo "Set these environment variables to configure the agent:"
 echo ""
 echo "export NANNYAPI_ENDPOINT=\"http://your-nannyapi-host:3000/openai/v1\""
 echo "export NANNYAPI_MODEL=\"your-model-identifier\""
 echo ""
 echo "Or create a .env file in the working directory."
 echo ""
 echo "🎉 Installation complete!"
 echo ""
 echo "Example usage:"
 echo "  ./nanny-agent"
 echo "  > On /var filesystem I cannot create any file but df -h shows 30% free space available."
--- a/integration-tests.sh
+++ b/integration-tests.sh
@@ -0,0 +1,116 @@
 #!/bin/bash
 # Linux Diagnostic Agent - Integration Tests
 # This script creates realistic Linux problem scenarios for testing
 set -e
 AGENT_BINARY="./nanny-agent"
 TEST_DIR="/tmp/nanny-agent-tests"
 TEST_LOG="$TEST_DIR/integration_test.log"
 # Color codes for output
 RED='\033[0;31m'
 GREEN='\033[0;32m'
 YELLOW='\033[1;33m'
 BLUE='\033[0;34m'
 NC='\033[0m' # No Color
 # Ensure test directory exists
 mkdir -p "$TEST_DIR"
 echo -e "${BLUE}🧪 Linux Diagnostic Agent - Integration Tests${NC}"
 echo "================================================="
 echo ""
 # Check if agent binary exists
 if [[ ! -f "$AGENT_BINARY" ]]; then
    echo -e "${RED}❌ Agent binary not found at $AGENT_BINARY${NC}"
    echo "Please run: make build"
    exit 1
 fi
 # Function to run a test scenario
 run_test() {
    local test_name="$1"
    local scenario="$2"
    local expected_keywords="$3"
    echo -e "${YELLOW}📋 Test: $test_name${NC}"
    echo "Scenario: $scenario"
    echo ""
    # Run the agent with the scenario
    echo "$scenario" | timeout 120s "$AGENT_BINARY" > "$TEST_LOG" 2>&1 || true
    # Check if any expected keywords are found in the output
    local found_keywords=0
    IFS=',' read -ra KEYWORDS <<< "$expected_keywords"
    for keyword in "${KEYWORDS[@]}"; do
        keyword=$(echo "$keyword" | xargs) # trim whitespace
        if grep -qi "$keyword" "$TEST_LOG"; then
            echo -e "${GREEN}  ✅ Found expected keyword: $keyword${NC}"
             found_keywords=$((found_keywords + 1)) # ((var++)) returns non-zero when var is 0 and would abort under set -e
        else
            echo -e "${RED}  ❌ Missing keyword: $keyword${NC}"
        fi
    done
    # Show summary
    if [[ $found_keywords -gt 0 ]]; then
        echo -e "${GREEN}  ✅ Test PASSED ($found_keywords keywords found)${NC}"
    else
        echo -e "${RED}  ❌ Test FAILED (no expected keywords found)${NC}"
    fi
    echo ""
    echo "Full output saved to: $TEST_LOG"
    echo "----------------------------------------"
    echo ""
 }
 # Test Scenario 1: Disk Space Issues (Inode Exhaustion)
 run_test "Disk Space - Inode Exhaustion" \
    "I cannot create new files in /home directory even though df -h shows plenty of space available. Getting 'No space left on device' error when trying to touch new files." \
    "inode,df -i,filesystem,inodes,exhausted"
 # Test Scenario 2: Memory Issues
 run_test "Memory Issues - OOM Killer" \
    "My applications keep getting killed randomly and I see 'killed' messages in logs. The system becomes unresponsive for a few seconds before recovering. This happens especially when running memory-intensive tasks." \
    "memory,oom,killed,dmesg,free,swap"
 # Test Scenario 3: Network Connectivity Issues
 run_test "Network Connectivity - DNS Resolution" \
    "I can ping IP addresses directly (like 8.8.8.8) but cannot resolve domain names. Web browsing fails with DNS resolution errors, but ping 8.8.8.8 works fine." \
    "dns,resolv.conf,nslookup,nameserver,dig"
 # Test Scenario 4: Service/Process Issues
 run_test "Service Issues - High Load" \
    "System load average is consistently above 10.0 even when CPU usage appears normal. Applications are responding slowly and I notice high wait times. The server feels sluggish overall." \
    "load,average,cpu,iostat,vmstat,processes"
 # Test Scenario 5: File System Issues
 run_test "Filesystem Issues - Permission Problems" \
    "Web server returns 403 Forbidden errors for all pages. Files exist and seem readable, but nginx logs show permission denied errors. SELinux is disabled and file permissions look correct." \
    "permission,403,nginx,chmod,chown,selinux"
 # Test Scenario 6: Boot/System Issues
 run_test "Boot Issues - Kernel Module" \
    "System boots but some hardware devices are not working. Network interface shows as down, USB devices are not recognized, and dmesg shows module loading failures." \
    "module,lsmod,dmesg,hardware,interface,usb"
 # Test Scenario 7: Performance Issues
 run_test "Performance Issues - I/O Bottleneck" \
    "Database queries are extremely slow, taking 30+ seconds for simple SELECT statements. Disk activity LED is constantly on and system feels unresponsive during database operations." \
    "iostat,iotop,disk,database,slow,performance"
 echo -e "${BLUE}🏁 Integration Tests Complete${NC}"
 echo ""
 echo "Log of the most recent test: $TEST_LOG"
 echo ""
 echo -e "${YELLOW}💡 Tips:${NC}"
 echo "- Tests use realistic scenarios that could occur on production systems"
 echo "- Each test expects the AI to suggest relevant diagnostic commands"
 echo "- Review the full logs to see the complete diagnostic conversation"
 echo "- Tests timeout after 120 seconds to prevent hanging"
 echo "- Make sure NANNYAPI_ENDPOINT and NANNYAPI_MODEL are set correctly"
--- a/main.go
+++ b/main.go
@@ -0,0 +1,46 @@
 package main
 import (
 	"bufio"
 	"fmt"
 	"log"
 	"os"
 	"strings"
 )
 func main() {
 	// Initialize the agent
 	agent := NewLinuxDiagnosticAgent()
 	// Start the interactive session
 	fmt.Println("Linux Diagnostic Agent Started")
 	fmt.Println("Enter a system issue description (or 'quit' to exit):")
 	scanner := bufio.NewScanner(os.Stdin)
 	for {
 		fmt.Print("> ")
 		if !scanner.Scan() {
 			break
 		}
 		input := strings.TrimSpace(scanner.Text())
 		if input == "quit" || input == "exit" {
 			break
 		}
 		if input == "" {
 			continue
 		}
 		// Process the issue
 		if err := agent.DiagnoseIssue(input); err != nil {
 			fmt.Printf("Error: %v\n", err)
 		}
 	}
 	if err := scanner.Err(); err != nil {
 		log.Fatal(err)
 	}
 	fmt.Println("Goodbye!")
 }
--- a/system_info.go
+++ b/system_info.go
@@ -0,0 +1,154 @@
 package main
 import (
 	"fmt"
 	"net"
 	"runtime"
 	"strings"
 	"time"
 )
 // SystemInfo represents basic system information
 type SystemInfo struct {
 	Hostname     string `json:"hostname"`
 	OS           string `json:"os"`
 	Kernel       string `json:"kernel"`
 	Architecture string `json:"architecture"`
 	CPUCores     string `json:"cpu_cores"`
 	Memory       string `json:"memory"`
 	Uptime       string `json:"uptime"`
 	PrivateIPs   string `json:"private_ips"`
 	LoadAverage  string `json:"load_average"`
 	DiskUsage    string `json:"disk_usage"`
 }
 // GatherSystemInfo collects basic system information
 func GatherSystemInfo() *SystemInfo {
 	info := &SystemInfo{}
 	executor := NewCommandExecutor(5 * time.Second)
 	// Basic system info
 	if result := executor.Execute(Command{ID: "hostname", Command: "hostname"}); result.ExitCode == 0 {
 		info.Hostname = strings.TrimSpace(result.Output)
 	}
 	// Prefer lsb_release; fall back to /etc/os-release when it is missing or fails.
 	if result := executor.Execute(Command{ID: "os", Command: "lsb_release -ds 2>/dev/null || grep '^PRETTY_NAME=' /etc/os-release | cut -d'=' -f2 | tr -d '\"'"}); result.ExitCode == 0 {
 		info.OS = strings.TrimSpace(result.Output)
 	}
 	if result := executor.Execute(Command{ID: "kernel", Command: "uname -r"}); result.ExitCode == 0 {
 		info.Kernel = strings.TrimSpace(result.Output)
 	}
 	if result := executor.Execute(Command{ID: "arch", Command: "uname -m"}); result.ExitCode == 0 {
 		info.Architecture = strings.TrimSpace(result.Output)
 	}
 	if result := executor.Execute(Command{ID: "cores", Command: "nproc"}); result.ExitCode == 0 {
 		info.CPUCores = strings.TrimSpace(result.Output)
 	}
 	if result := executor.Execute(Command{ID: "memory", Command: "free -h | grep Mem | awk '{print $2}'"}); result.ExitCode == 0 {
 		info.Memory = strings.TrimSpace(result.Output)
 	}
 	if result := executor.Execute(Command{ID: "uptime", Command: "uptime -p"}); result.ExitCode == 0 {
 		info.Uptime = strings.TrimSpace(result.Output)
 	}
 	if result := executor.Execute(Command{ID: "load", Command: "uptime | awk -F'load average:' '{print $2}' | xargs"}); result.ExitCode == 0 {
 		info.LoadAverage = strings.TrimSpace(result.Output)
 	}
 	if result := executor.Execute(Command{ID: "disk", Command: "df -h / | tail -1 | awk '{print \"Root: \" $3 \"/\" $2 \" (\" $5 \" used)\"}'"}); result.ExitCode == 0 {
 		info.DiskUsage = strings.TrimSpace(result.Output)
 	}
 	// Get private IP addresses
 	info.PrivateIPs = getPrivateIPs()
 	return info
 }
 // getPrivateIPs returns private IP addresses
 func getPrivateIPs() string {
 	var privateIPs []string
 	interfaces, err := net.Interfaces()
 	if err != nil {
 		return "Unable to determine"
 	}
 	for _, iface := range interfaces {
 		if iface.Flags&net.FlagUp == 0 || iface.Flags&net.FlagLoopback != 0 {
 			continue // Skip down or loopback interfaces
 		}
 		addrs, err := iface.Addrs()
 		if err != nil {
 			continue
 		}
 		for _, addr := range addrs {
 			if ipnet, ok := addr.(*net.IPNet); ok && !ipnet.IP.IsLoopback() {
 				if isPrivateIP(ipnet.IP) {
 					privateIPs = append(privateIPs, fmt.Sprintf("%s (%s)", ipnet.IP.String(), iface.Name))
 				}
 			}
 		}
 	}
 	if len(privateIPs) == 0 {
 		return "No private IPs found"
 	}
 	return strings.Join(privateIPs, ", ")
 }
 // isPrivateIP checks if an IP address is private
 func isPrivateIP(ip net.IP) bool {
 	// RFC 1918 private address ranges
 	private := []string{
 		"10.0.0.0/8",
 		"172.16.0.0/12",
 		"192.168.0.0/16",
 	}
 	for _, cidr := range private {
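 		_, subnet, err := net.ParseCIDR(cidr)
 		if err != nil {
 			continue // Not expected for the constant ranges above; guards against a nil subnet
 		}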
 		if subnet.Contains(ip) {
 			return true
 		}
 	}
 	return false
 }
 // FormatSystemInfoForPrompt formats system information for inclusion in diagnostic prompts
 func FormatSystemInfoForPrompt(info *SystemInfo) string {
 	return fmt.Sprintf(`SYSTEM INFORMATION:
 - Hostname: %s
 - Operating System: %s
 - Kernel Version: %s
 - Architecture: %s
 - CPU Cores: %s
 - Total Memory: %s
 - System Uptime: %s
 - Current Load Average: %s
 - Root Disk Usage: %s
 - Private IP Addresses: %s
 - Go Runtime: %s
 ISSUE DESCRIPTION:`,
 		info.Hostname,
 		info.OS,
 		info.Kernel,
 		info.Architecture,
 		info.CPUCores,
 		info.Memory,
 		info.Uptime,
 		info.LoadAverage,
 		info.DiskUsage,
 		info.PrivateIPs,
 		runtime.Version())
 }
--- a/test-examples.sh
+++ b/test-examples.sh
@@ -0,0 +1,82 @@
 #!/bin/bash
 # Linux Diagnostic Agent - Test Scenarios
 # Realistic Linux problems for testing the diagnostic agent
 echo "🔧 Linux Diagnostic Agent - Test Scenarios"
 echo "==========================================="
 echo ""
 echo "📚 Available test scenarios (copy-paste into the agent):"
 echo ""
 echo "1. 💾 DISK SPACE ISSUES (Inode Exhaustion):"
 echo "────────────────────────────────────────────"
 echo "I cannot create new files in /home directory even though df -h shows plenty of space available. Getting 'No space left on device' error when trying to touch new files."
 echo ""
 echo "2. 🧠 MEMORY ISSUES (OOM Killer):"
 echo "─────────────────────────────────"
 echo "My applications keep getting killed randomly and I see 'killed' messages in logs. The system becomes unresponsive for a few seconds before recovering. This happens especially when running memory-intensive tasks."
 echo ""
 echo "3. 🌐 NETWORK CONNECTIVITY (DNS Resolution):"
 echo "─────────────────────────────────────────────"
 echo "I can ping IP addresses directly (like 8.8.8.8) but cannot resolve domain names. Web browsing fails with DNS resolution errors, but ping 8.8.8.8 works fine."
 echo ""
 echo "4. ⚡ PERFORMANCE ISSUES (High Load):"
 echo "───────────────────────────────────"
 echo "System load average is consistently above 10.0 even when CPU usage appears normal. Applications are responding slowly and I notice high wait times. The server feels sluggish overall."
 echo ""
 echo "5. 🚫 WEB SERVER ISSUES (Permission Problems):"
 echo "──────────────────────────────────────────────"
 echo "Web server returns 403 Forbidden errors for all pages. Files exist and seem readable, but nginx logs show permission denied errors. SELinux is disabled and file permissions look correct."
 echo ""
 echo "6. 🖥️  HARDWARE/BOOT ISSUES (Kernel Module):"
 echo "─────────────────────────────────────────────"
 echo "System boots but some hardware devices are not working. Network interface shows as down, USB devices are not recognized, and dmesg shows module loading failures."
 echo ""
 echo "7. 🐌 DATABASE PERFORMANCE (I/O Bottleneck):"
 echo "─────────────────────────────────────────────"
 echo "Database queries are extremely slow, taking 30+ seconds for simple SELECT statements. Disk activity LED is constantly on and system feels unresponsive during database operations."
 echo ""
 echo "8. 🔥 HIGH CPU USAGE (Process Analysis):"
 echo "────────────────────────────────────────"
 echo "System is running slow and CPU usage is constantly at 100%. Top shows high CPU usage but I can't identify which specific process or thread is causing the issue."
 echo ""
 echo "9. 📁 FILE SYSTEM CORRUPTION:"
 echo "────────────────────────────"
 echo "Getting 'Input/output error' when accessing certain files and directories. Some files appear corrupted and applications crash when trying to read specific data files."
 echo ""
 echo "10. 🔌 SERVICE STARTUP FAILURES:"
 echo "───────────────────────────────"
 echo "Critical services fail to start after system reboot. Systemctl shows services in failed state but error messages are unclear. System appears to boot normally otherwise."
 echo ""
 echo "🚀 Quick Start:"
 echo "──────────────"
 echo "1. Run: ./nanny-agent"
 echo "2. Copy-paste any scenario above when prompted"
 echo "3. Watch the AI diagnose the problem step by step"
 echo ""
 echo "🧪 Automated Testing:"
 echo "────────────────────"
 echo "Run integration tests: ./integration-tests.sh"
 echo "This runs the scripted integration scenarios automatically"
 echo ""
 echo "💡 Pro Tips:"
 echo "───────────"
 echo "- Each scenario is based on real-world Linux issues"
 echo "- The AI will gather system info automatically"
 echo "- Diagnostic commands are executed safely (read-only)"
 echo "- You'll get a detailed resolution plan at the end"
 echo "- Set NANNYAPI_ENDPOINT and NANNYAPI_MODEL before running"
@@ -0,0 +1,2 @@
 github.com/sashabaranov/go-openai v1.32.0 h1:Yk3iE9moX3RBXxrof3OBtUBrE7qZR0zF9ebsoO4zVzI=
 github.com/sashabaranov/go-openai v1.32.0/go.mod h1:lj5b/K+zjTSFxVLijLSTDZuP7adOgerWeFyZLUhAKRg=