Initial Commit

2025-09-27 17:35:24 +02:00
parent 83fa088ed2
commit 1f01c38881
14 changed files with 1277 additions and 2 deletions
--- a/0
+++ b/0
--- a/53
+++ b/53
@@ -0,0 +1,53 @@
+.PHONY: build run clean test install build-prod install-system fmt lint help
+
+# Build the application
+build:
+	go build -o nanny-agent .
+
+# Run the application
+run: build
+	./nanny-agent
+
+# Clean build artifacts
+clean:
+	rm -f nanny-agent
+
+# Run tests
+test:
+	go test ./...
+
+# Install dependencies
+install:
+	go mod tidy
+	go mod download
+
+# Build for production with optimizations
+build-prod:
+	CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -ldflags '-w -s' -o nanny-agent .
+
+# Install system-wide (requires sudo)
+install-system: build-prod
+	sudo cp nanny-agent /usr/local/bin/
+	sudo chmod +x /usr/local/bin/nanny-agent
+
+# Format code
+fmt:
+	go fmt ./...
+
+# Run linter (if golangci-lint is installed)
+lint:
+	golangci-lint run
+
+# Show help
+help:
+	@echo "Available commands:"
+	@echo "  build         - Build the application"
+	@echo "  run           - Build and run the application"
+	@echo "  clean         - Clean build artifacts"
+	@echo "  test          - Run tests"
+	@echo "  install       - Install dependencies"
+	@echo "  build-prod    - Build for production"
+	@echo "  install-system - Install system-wide (requires sudo)"
+	@echo "  fmt           - Format code"
+	@echo "  lint          - Run linter"
+	@echo "  help          - Show this help"
--- a/README.md
+++ b/README.md
@@ -1,3 +1,199 @@
-# nannyagent
+# Linux Diagnostic Agent

-nannyagent is a Linux AI diagnostic agent built on OpenAPI specifications relying on Tensorzero gateway
+A Go-based AI agent that diagnoses Linux system issues using the NannyAPI gateway with OpenAI-compatible SDK.
+
+## Features
+
+- Interactive command-line interface for submitting system issues
+- **Automatic system information gathering** - Includes OS, kernel, CPU, memory, network info
+- Integrates with NannyAPI using OpenAI-compatible Go SDK
+- Executes diagnostic commands safely and collects output
+- Provides step-by-step resolution plans
+- **Comprehensive integration tests** with realistic Linux problem scenarios
+
+## Setup
+
+1. Clone this repository
+2. Copy `.env.example` to `.env` and configure your NannyAPI endpoint:
+   ```bash
+   cp .env.example .env
+   ```
+3. Install dependencies:
+   ```bash
+   go mod tidy
+   ```
+4. Build and run:
+   ```bash
+   make build
+   ./nanny-agent
+   ```
+
+## Configuration
+
+The agent can be configured using environment variables:
+
+- `NANNYAPI_ENDPOINT`: The NannyAPI endpoint (default: `http://nannyapi.local:3000/openai/v1`)
+- `NANNYAPI_MODEL`: The model identifier (default: `nannyapi::function_name::diagnose_and_heal`)
+
+## Installation on Linux VM
+
+### Direct Installation
+
+1. **Install Go** (if not already installed):
+   ```bash
+   # For Ubuntu/Debian
+   sudo apt update
+   sudo apt install golang-go
+
+   # For RHEL/CentOS/Fedora
+   sudo dnf install golang
+   # or
+   sudo yum install golang
+   ```
+
+2. **Clone and build the agent**:
+   ```bash
+   git clone <your-repo-url>
+   cd nannyagentv2
+   go mod tidy
+   make build
+   ```
+
+3. **Install system-wide** (optional):
+   ```bash
+   sudo cp nanny-agent /usr/local/bin/
+   sudo chmod +x /usr/local/bin/nanny-agent
+   ```
+
+4. **Set environment variables**:
+   ```bash
+   export NANNYAPI_ENDPOINT="http://your-nannyapi-endpoint:3000/openai/v1"
+   export NANNYAPI_MODEL="your-model-identifier"
+   ```
+
+## Usage
+
+1. Start the agent:
+   ```bash
+   ./nanny-agent
+   ```
+
+2. Enter a system issue description when prompted:
+   ```
+   > On /var filesystem I cannot create any file but df -h shows 30% free space available.
+   ```
+
+3. The agent will:
+   - Send the issue to the AI via NannyAPI using OpenAI SDK
+   - Execute diagnostic commands as suggested by the AI
+   - Provide command outputs back to the AI
+   - Display the final diagnosis and resolution plan
+
+4. Type `quit` or `exit` to stop the agent
+
+## How It Works
+
+1. **System Information Gathering**: Agent automatically collects system details (OS, kernel, CPU, memory, network, etc.)
+2. **Initial Issue**: User describes a Linux system problem
+3. **Enhanced Prompt**: AI receives both the issue description and comprehensive system information
+4. **Diagnostic Phase**: AI responds with diagnostic commands to run
+5. **Command Execution**: Agent safely executes read-only commands
+6. **Iterative Analysis**: AI analyzes command outputs and may request more commands
+7. **Resolution Phase**: AI provides root cause analysis and step-by-step resolution plan
+
+## Testing & Integration Tests
+
+The agent includes comprehensive integration tests that simulate realistic Linux problems:
+
+### Available Test Scenarios:
+1. **Disk Space Issues** - Inode exhaustion scenarios
+2. **Memory Problems** - OOM killer and memory pressure
+3. **Network Issues** - DNS resolution problems
+4. **Performance Issues** - High load averages and I/O bottlenecks
+5. **Web Server Problems** - Permission and configuration issues
+6. **Hardware/Boot Issues** - Kernel module and device problems
+7. **Database Performance** - Slow queries and I/O contention
+8. **Service Failures** - Startup and configuration problems
+
+### Run Integration Tests:
+```bash
+# Interactive test scenarios
+./test-examples.sh
+
+# Automated integration tests
+./integration-tests.sh
+
+# Function discovery (find valid NannyAPI functions)
+./discover-functions.sh
+```
+
+## Safety
+
+- Only read-only commands are executed automatically
+- Commands that modify the system (rm, mv, dd, redirection) are blocked by validation
+- The resolution plan is provided for manual execution by the operator
+- All commands have execution timeouts to prevent hanging
+
+## API Integration
+
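+The agent talks to NannyAPI's OpenAI-compatible chat completions endpoint. It uses the message and response types from the `github.com/sashabaranov/go-openai` SDK and sends requests through Go's standard `net/http` client so that it can attach the `tensorzero::episode_id` field. This provides:
+
+- A 30-second timeout on every API request
+- Structured request/response handling
+- Automatic JSON marshaling/unmarshaling
+- Errors for non-200 responses and malformed JSON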
+
+## Example Session
+
+```
+Linux Diagnostic Agent Started
+Enter a system issue description (or 'quit' to exit):
+> Cannot create files in /var but df shows space available
+
+Diagnosing issue: Cannot create files in /var but df shows space available
+Gathering system information...
+
+AI Response:
+{
+  "response_type": "diagnostic",
+  "reasoning": "The 'No space left on device' error despite available disk space suggests inode exhaustion...",
+  "commands": [
+    {"id": "check_inodes", "command": "df -i /var", "description": "Check inode usage..."}
+  ]
+}
+
+Executing command 'check_inodes': df -i /var
+Output:
+Filesystem      Inodes   IUsed   IFree IUse% Mounted on
+/dev/sda1      1000000  999999       1  100% /var
+
+=== DIAGNOSIS COMPLETE ===
+Root Cause: The /var filesystem has exhausted all available inodes
+Resolution Plan: 1. Find and remove unnecessary files...
+Confidence: High
+```
+
+Note: The AI receives comprehensive system information including:
+- Hostname, OS version, kernel version
+- CPU cores, memory, system uptime
+- Network interfaces and private IPs
+- Current load average and disk usage
+
+## Available Make Commands
+
+- `make build` - Build the application
+- `make run` - Build and run the application  
+- `make clean` - Clean build artifacts
+- `make test` - Run unit tests
+- `make install` - Install dependencies
+- `make build-prod` - Build for production
+- `make install-system` - Install system-wide (requires sudo)
+- `make fmt` - Format code
+- `make help` - Show available commands
+
+## Testing Commands
+
+- `./test-examples.sh` - Show interactive test scenarios
+- `./integration-tests.sh` - Run automated integration tests
+- `./discover-functions.sh` - Find available NannyAPI functions
+- `./install.sh` - Installation script for Linux VMs
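The README's "How It Works" steps depend on each AI reply carrying a `response_type` of either `diagnostic` or `resolution`. One way to branch on that field before decoding the full payload is sketched below. The `envelope` type and the `classify` function are hypothetical names introduced for illustration and are not part of this commit.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// envelope (hypothetical) reads only the discriminator field that both
// reply shapes in the README example share.
type envelope struct {
	ResponseType string `json:"response_type"`
}

// classify (hypothetical) returns the response_type of an AI reply,
// or an error if the reply is not valid JSON.
func classify(content string) (string, error) {
	var env envelope
	if err := json.Unmarshal([]byte(content), &env); err != nil {
		return "", fmt.Errorf("reply is not valid JSON: %w", err)
	}
	return env.ResponseType, nil
}

func main() {
	replies := []string{
		`{"response_type":"diagnostic","reasoning":"...","commands":[{"id":"check_inodes","command":"df -i /var","description":"..."}]}`,
		`{"response_type":"resolution","root_cause":"...","resolution_plan":"...","confidence":"High"}`,
		`not json`,
	}
	for _, reply := range replies {
		kind, err := classify(reply)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println("reply type:", kind)
	}
}
```

Decoding the discriminator first keeps a malformed reply distinguishable from a well-formed reply of an unknown type, which the caller may want to report differently.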
--- a/agent.go
+++ b/agent.go
@@ -0,0 +1,270 @@
+package main
+
+import (
+	"bytes"
+	"context"
+	"encoding/json"
+	"fmt"
+	"io"
+	"net/http"
+	"os"
+	"time"
+
+	"github.com/sashabaranov/go-openai"
+)
+
+// DiagnosticResponse represents the diagnostic phase response from AI
+type DiagnosticResponse struct {
+	ResponseType string    `json:"response_type"`
+	Reasoning    string    `json:"reasoning"`
+	Commands     []Command `json:"commands"`
+}
+
+// ResolutionResponse represents the resolution phase response from AI
+type ResolutionResponse struct {
+	ResponseType   string `json:"response_type"`
+	RootCause      string `json:"root_cause"`
+	ResolutionPlan string `json:"resolution_plan"`
+	Confidence     string `json:"confidence"`
+}
+
+// Command represents a command to be executed
+type Command struct {
+	ID          string `json:"id"`
+	Command     string `json:"command"`
+	Description string `json:"description"`
+}
+
+// CommandResult represents the result of executing a command
+type CommandResult struct {
+	ID       string `json:"id"`
+	Command  string `json:"command"`
+	Output   string `json:"output"`
+	ExitCode int    `json:"exit_code"`
+	Error    string `json:"error,omitempty"`
+}
+
+// LinuxDiagnosticAgent represents the main agent
+type LinuxDiagnosticAgent struct {
+	client    *openai.Client
+	model     string
+	executor  *CommandExecutor
+	episodeID string // TensorZero episode ID for conversation continuity
+}
+
+// NewLinuxDiagnosticAgent creates a new diagnostic agent
+func NewLinuxDiagnosticAgent() *LinuxDiagnosticAgent {
+	endpoint := os.Getenv("NANNYAPI_ENDPOINT")
+	if endpoint == "" {
+		// Default endpoint - OpenAI SDK will append /chat/completions automatically
+		endpoint = "http://nannyapi.local:3000/openai/v1"
+	}
+
+	model := os.Getenv("NANNYAPI_MODEL")
+	if model == "" {
+		model = "nannyapi::function_name::diagnose_and_heal"
+		fmt.Printf("Warning: Using default model '%s'. Set NANNYAPI_MODEL environment variable for your specific function.\n", model)
+	}
+
+	// Create OpenAI client with custom base URL
+	// Note: The OpenAI SDK automatically appends "/chat/completions" to the base URL
+	config := openai.DefaultConfig("")
+	config.BaseURL = endpoint
+	client := openai.NewClientWithConfig(config)
+
+	return &LinuxDiagnosticAgent{
+		client:   client,
+		model:    model,
+		executor: NewCommandExecutor(10 * time.Second), // 10 second timeout for commands
+	}
+}
+
+// DiagnoseIssue starts the diagnostic process for a given issue
+func (a *LinuxDiagnosticAgent) DiagnoseIssue(issue string) error {
+	fmt.Printf("Diagnosing issue: %s\n", issue)
+	fmt.Println("Gathering system information...")
+
+	// Gather system information
+	systemInfo := GatherSystemInfo()
+
+	// Format the initial prompt with system information
+	initialPrompt := FormatSystemInfoForPrompt(systemInfo) + "\n" + issue
+
+	// Start conversation with initial issue including system info
+	messages := []openai.ChatCompletionMessage{
+		{
+			Role:    openai.ChatMessageRoleUser,
+			Content: initialPrompt,
+		},
+	}
+
+	for {
+		// Send request to TensorZero API via OpenAI SDK
+		response, err := a.sendRequest(messages)
+		if err != nil {
+			return fmt.Errorf("failed to send request: %w", err)
+		}
+
+		if len(response.Choices) == 0 {
+			return fmt.Errorf("no choices in response")
+		}
+
+		content := response.Choices[0].Message.Content
+		fmt.Printf("\nAI Response:\n%s\n", content)
+
+		// Parse the response to determine next action
+		var diagnosticResp DiagnosticResponse
+		var resolutionResp ResolutionResponse
+
+		// Try to parse as diagnostic response first
+		if err := json.Unmarshal([]byte(content), &diagnosticResp); err == nil && diagnosticResp.ResponseType == "diagnostic" {
+			// Handle diagnostic phase
+			fmt.Printf("\nReasoning: %s\n", diagnosticResp.Reasoning)
+
+			if len(diagnosticResp.Commands) == 0 {
+				fmt.Println("No commands to execute in diagnostic phase")
+				break
+			}
+
+			// Execute commands and collect results
+			commandResults := make([]CommandResult, 0, len(diagnosticResp.Commands))
+			for _, cmd := range diagnosticResp.Commands {
+				fmt.Printf("\nExecuting command '%s': %s\n", cmd.ID, cmd.Command)
+				result := a.executor.Execute(cmd)
+				commandResults = append(commandResults, result)
+
+				fmt.Printf("Output:\n%s\n", result.Output)
+				if result.Error != "" {
+					fmt.Printf("Error: %s\n", result.Error)
+				}
+			}
+
+			// Prepare command results as user message
+			resultsJSON, err := json.MarshalIndent(commandResults, "", "  ")
+			if err != nil {
+				return fmt.Errorf("failed to marshal command results: %w", err)
+			}
+
+			// Add AI response and command results to conversation
+			messages = append(messages, openai.ChatCompletionMessage{
+				Role:    openai.ChatMessageRoleAssistant,
+				Content: content,
+			})
+			messages = append(messages, openai.ChatCompletionMessage{
+				Role:    openai.ChatMessageRoleUser,
+				Content: string(resultsJSON),
+			})
+
+			continue
+		}
+
+		// Try to parse as resolution response
+		if err := json.Unmarshal([]byte(content), &resolutionResp); err == nil && resolutionResp.ResponseType == "resolution" {
+			// Handle resolution phase
+			fmt.Printf("\n=== DIAGNOSIS COMPLETE ===\n")
+			fmt.Printf("Root Cause: %s\n", resolutionResp.RootCause)
+			fmt.Printf("Resolution Plan: %s\n", resolutionResp.ResolutionPlan)
+			fmt.Printf("Confidence: %s\n", resolutionResp.Confidence)
+			break
+		}
+
+		// If we can't parse the response, treat it as an error or unexpected format
+		fmt.Printf("Unexpected response format or error from AI:\n%s\n", content)
+		break
+	}
+
+	return nil
+}
+
+// TensorZeroRequest represents a request structure compatible with TensorZero's episode_id
+type TensorZeroRequest struct {
+	Model     string                         `json:"model"`
+	Messages  []openai.ChatCompletionMessage `json:"messages"`
+	EpisodeID string                         `json:"tensorzero::episode_id,omitempty"`
+}
+
+// TensorZeroResponse represents TensorZero's response with episode_id
+type TensorZeroResponse struct {
+	openai.ChatCompletionResponse
+	EpisodeID string `json:"episode_id"`
+}
+
+// sendRequest sends a request to the TensorZero API with tensorzero::episode_id support
+func (a *LinuxDiagnosticAgent) sendRequest(messages []openai.ChatCompletionMessage) (*openai.ChatCompletionResponse, error) {
+	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
+	defer cancel()
+
+	// Create TensorZero-compatible request
+	tzRequest := TensorZeroRequest{
+		Model:    a.model,
+		Messages: messages,
+	}
+
+	// Include tensorzero::episode_id for conversation continuity (if we have one)
+	if a.episodeID != "" {
+		tzRequest.EpisodeID = a.episodeID
+	}
+
+	fmt.Printf("Debug: Sending request to model: %s", a.model)
+	if a.episodeID != "" {
+		fmt.Printf(" (episode: %s)", a.episodeID)
+	}
+	fmt.Println()
+
+	// Marshal the request
+	requestBody, err := json.Marshal(tzRequest)
+	if err != nil {
+		return nil, fmt.Errorf("failed to marshal request: %w", err)
+	}
+
+	// Create HTTP request
+	endpoint := os.Getenv("NANNYAPI_ENDPOINT")
+	if endpoint == "" {
+		endpoint = "http://nannyapi.local:3000/openai/v1"
+	}
+
+	// Ensure the endpoint ends with /chat/completions
+	if endpoint[len(endpoint)-1] != '/' {
+		endpoint += "/"
+	}
+	endpoint += "chat/completions"
+
+	req, err := http.NewRequestWithContext(ctx, "POST", endpoint, bytes.NewBuffer(requestBody))
+	if err != nil {
+		return nil, fmt.Errorf("failed to create request: %w", err)
+	}
+
+	req.Header.Set("Content-Type", "application/json")
+
+	// Make the request
+	client := &http.Client{Timeout: 30 * time.Second}
+	resp, err := client.Do(req)
+	if err != nil {
+		return nil, fmt.Errorf("failed to send request: %w", err)
+	}
+	defer resp.Body.Close()
+
+	// Read response body
+	body, err := io.ReadAll(resp.Body)
+	if err != nil {
+		return nil, fmt.Errorf("failed to read response: %w", err)
+	}
+
+	if resp.StatusCode != http.StatusOK {
+		return nil, fmt.Errorf("API request failed with status %d: %s", resp.StatusCode, string(body))
+	}
+
+	// Parse TensorZero response
+	var tzResponse TensorZeroResponse
+	if err := json.Unmarshal(body, &tzResponse); err != nil {
+		return nil, fmt.Errorf("failed to unmarshal response: %w", err)
+	}
+
+	// Extract episode_id from first response
+	if a.episodeID == "" && tzResponse.EpisodeID != "" {
+		a.episodeID = tzResponse.EpisodeID
+		fmt.Printf("Debug: Extracted episode ID: %s\n", a.episodeID)
+	}
+
+	return &tzResponse.ChatCompletionResponse, nil
+}
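`sendRequest` above builds the chat completions URL by appending to the endpoint string by hand. A possible alternative, assuming Go 1.19 or later (the `go.mod` in this commit declares 1.23), is `url.JoinPath` from the standard library, which handles a base with or without a trailing slash. The helper name `chatCompletionsURL` is hypothetical and does not appear in the commit.

```go
package main

import (
	"fmt"
	"net/url"
)

// chatCompletionsURL (hypothetical helper) joins the configured base
// endpoint with the chat completions path. url.JoinPath cleans the
// result, so duplicate slashes are collapsed.
func chatCompletionsURL(base string) (string, error) {
	return url.JoinPath(base, "chat", "completions")
}

func main() {
	bases := []string{
		"http://nannyapi.local:3000/openai/v1",
		"http://nannyapi.local:3000/openai/v1/",
	}
	for _, base := range bases {
		u, err := chatCompletionsURL(base)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(u)
	}
}
```

Because `url.JoinPath` returns an error for an unparsable base, a misconfigured `NANNYAPI_ENDPOINT` could be reported before any request is sent.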
--- a/agent_test.go
+++ b/agent_test.go
@@ -0,0 +1,107 @@
+package main
+
+import (
+	"testing"
+	"time"
+)
+
+func TestCommandExecutor_ValidateCommand(t *testing.T) {
+	executor := NewCommandExecutor(5 * time.Second)
+
+	tests := []struct {
+		name    string
+		command string
+		wantErr bool
+	}{
+		{
+			name:    "safe command - ls",
+			command: "ls -la /var",
+			wantErr: false,
+		},
+		{
+			name:    "safe command - df",
+			command: "df -h",
+			wantErr: false,
+		},
+		{
+			name:    "safe command - ps",
+			command: "ps aux | grep nginx",
+			wantErr: false,
+		},
+		{
+			name:    "dangerous command - rm",
+			command: "rm -rf /tmp/*",
+			wantErr: true,
+		},
+		{
+			name:    "dangerous command - dd",
+			command: "dd if=/dev/zero of=/dev/sda",
+			wantErr: true,
+		},
+		{
+			name:    "dangerous command - sudo",
+			command: "sudo systemctl stop nginx",
+			wantErr: true,
+		},
+		{
+			name:    "dangerous command - redirection",
+			command: "echo 'test' > /etc/passwd",
+			wantErr: true,
+		},
+	}
+
+	for _, tt := range tests {
+		t.Run(tt.name, func(t *testing.T) {
+			err := executor.validateCommand(tt.command)
+			if (err != nil) != tt.wantErr {
+				t.Errorf("validateCommand() error = %v, wantErr %v", err, tt.wantErr)
+			}
+		})
+	}
+}
+
+func TestCommandExecutor_Execute(t *testing.T) {
+	executor := NewCommandExecutor(5 * time.Second)
+
+	// Test safe command execution
+	cmd := Command{
+		ID:          "test_echo",
+		Command:     "echo 'Hello, World!'",
+		Description: "Test echo command",
+	}
+
+	result := executor.Execute(cmd)
+
+	if result.ExitCode != 0 {
+		t.Errorf("Expected exit code 0, got %d", result.ExitCode)
+	}
+
+	if result.Output != "Hello, World!\n" {
+		t.Errorf("Expected 'Hello, World!\\n', got '%s'", result.Output)
+	}
+
+	if result.Error != "" {
+		t.Errorf("Expected no error, got '%s'", result.Error)
+	}
+}
+
+func TestCommandExecutor_ExecuteUnsafeCommand(t *testing.T) {
+	executor := NewCommandExecutor(5 * time.Second)
+
+	// Test unsafe command rejection
+	cmd := Command{
+		ID:          "test_rm",
+		Command:     "rm -rf /tmp/test",
+		Description: "Dangerous rm command",
+	}
+
+	result := executor.Execute(cmd)
+
+	if result.ExitCode != 1 {
+		t.Errorf("Expected exit code 1 for unsafe command, got %d", result.ExitCode)
+	}
+
+	if result.Error == "" {
+		t.Error("Expected error for unsafe command, got none")
+	}
+}
--- a/discover-functions.sh
+++ b/discover-functions.sh
@@ -0,0 +1,51 @@
+#!/bin/bash
+
+# NannyAPI Function Discovery Script
+# This script helps you find the correct function name for your NannyAPI setup
+
+echo "🔍 NannyAPI Function Discovery"
+echo "=============================="
+echo ""
+
+ENDPOINT="${NANNYAPI_ENDPOINT:-http://nannyapi.local:3000/openai/v1}"
+
+echo "Testing endpoint: $ENDPOINT/chat/completions"
+echo ""
+
+# Test common function name patterns
+test_functions=(
+    "nannyapi::function_name::diagnose"
+    "nannyapi::function_name::diagnose_and_heal"
+    "nannyapi::function_name::linux_diagnostic"
+    "nannyapi::function_name::system_diagnostic"
+    "nannyapi::model_name::gpt-4"
+    "nannyapi::model_name::claude"
+)
+
+for func in "${test_functions[@]}"; do
+    echo "Testing function: $func"
+    
+    response=$(curl -s -X POST "$ENDPOINT/chat/completions" \
+        -H "Content-Type: application/json" \
+        -d "{\"model\":\"$func\",\"messages\":[{\"role\":\"user\",\"content\":\"test\"}]}")
+    
+    if echo "$response" | grep -q "Unknown function"; then
+        echo "  ❌ Function not found"
+    elif echo "$response" | grep -q "error"; then
+        echo "  ⚠️  Error: $(echo "$response" | jq -r '.error' 2>/dev/null || echo "$response")"
+    else
+        echo "  ✅ Function exists and responding!"
+        echo "     Use this in your environment: export NANNYAPI_MODEL=\"$func\""
+    fi
+    echo ""
+done
+
+echo "💡 If none of the above work, check your NannyAPI configuration file"
+echo "   for the correct function names and update NANNYAPI_MODEL accordingly."
+echo ""
+echo "Example NannyAPI config snippet:"
+echo '```yaml'
+echo "functions:"
+echo "  diagnose_and_heal:  # This becomes 'nannyapi::function_name::diagnose_and_heal'"
+echo "    # function definition"
+echo '```'

--- a/executor.go
+++ b/executor.go
@@ -0,0 +1,108 @@
+package main
+
+import (
+	"context"
+	"fmt"
+	"os/exec"
+	"strings"
+	"time"
+)
+
+// CommandExecutor handles safe execution of diagnostic commands
+type CommandExecutor struct {
+	timeout time.Duration
+}
+
+// NewCommandExecutor creates a new command executor with specified timeout
+func NewCommandExecutor(timeout time.Duration) *CommandExecutor {
+	return &CommandExecutor{
+		timeout: timeout,
+	}
+}
+
+// Execute executes a command safely with timeout and validation
+func (ce *CommandExecutor) Execute(cmd Command) CommandResult {
+	result := CommandResult{
+		ID:      cmd.ID,
+		Command: cmd.Command,
+	}
+
+	// Validate command safety
+	if err := ce.validateCommand(cmd.Command); err != nil {
+		result.Error = fmt.Sprintf("unsafe command: %s", err.Error())
+		result.ExitCode = 1
+		return result
+	}
+
+	// Create context with timeout
+	ctx, cancel := context.WithTimeout(context.Background(), ce.timeout)
+	defer cancel()
+
+	// Execute command using shell for proper handling of pipes, redirects, etc.
+	execCmd := exec.CommandContext(ctx, "/bin/bash", "-c", cmd.Command)
+
+	output, err := execCmd.CombinedOutput()
+	result.Output = string(output)
+
+	if err != nil {
+		result.Error = err.Error()
+		if exitError, ok := err.(*exec.ExitError); ok {
+			result.ExitCode = exitError.ExitCode()
+		} else {
+			result.ExitCode = 1
+		}
+	} else {
+		result.ExitCode = 0
+	}
+
+	return result
+}
+
+// validateCommand checks if a command is safe to execute
+func (ce *CommandExecutor) validateCommand(command string) error {
+	// Convert to lowercase for case-insensitive checking
+	cmd := strings.ToLower(strings.TrimSpace(command))
+
+	// List of dangerous commands/patterns
+	dangerousPatterns := []string{
+		"rm ", "rm\t", "rm\n",
+		"mv ", "mv\t", "mv\n",
+		"dd ", "dd\t", "dd\n",
+		"mkfs", "fdisk", "parted",
+		"shutdown", "reboot", "halt", "poweroff",
+		"passwd", "userdel", "usermod",
+		"chmod", "chown", "chgrp",
+		"systemctl stop", "systemctl disable", "systemctl mask",
+		"service stop", "service disable",
+		"kill ", "killall", "pkill",
+		"crontab -r", "crontab -e",
+		"iptables -F", "iptables -D", "iptables -I",
+		"umount ", "unmount ", // Allow mount but not umount
+		"wget ", "curl ", // Prevent network operations
+		"| dd", "| rm", "| mv", // Prevent piping to dangerous commands
+	}
+
+	// Check for dangerous patterns
+	for _, pattern := range dangerousPatterns {
+		if strings.Contains(cmd, strings.ToLower(pattern)) { // cmd is lowercased, so patterns such as "iptables -F" must be lowercased too
+			return fmt.Errorf("command contains dangerous pattern: %s", pattern)
+		}
+	}
+
+	// Additional checks for commands that start with dangerous operations
+	if strings.HasPrefix(cmd, "rm ") || strings.HasPrefix(cmd, "rm\t") {
+		return fmt.Errorf("rm command not allowed")
+	}
+
+	// Check for sudo usage (we want to avoid automated sudo commands)
+	if strings.HasPrefix(cmd, "sudo ") {
+		return fmt.Errorf("sudo commands not allowed for automated execution")
+	}
+
+	// Check for dangerous redirections: strip the allowed /dev/null forms first so they cannot mask another redirect
+	if s := strings.ReplaceAll(strings.ReplaceAll(cmd, "2>/dev/null", ""), ">/dev/null", ""); strings.Contains(s, ">") {
+		return fmt.Errorf("file redirection not allowed except to /dev/null")
+	}
+
+	return nil
+}
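`validateCommand` above is a blocklist: it rejects commands that contain known dangerous substrings. A stricter variant, sketched here as an alternative technique and not as what this commit implements, is an allowlist that checks the first word of every pipeline segment. The `allowed` set and the `firstWords` and `isAllowed` functions are hypothetical, and the splitting is deliberately crude: it does not understand quoting, `$(...)`, or backticks, and it rejects forms such as `2>&1`.

```go
package main

import (
	"fmt"
	"strings"
)

// allowed is a hypothetical set of read-only commands.
var allowed = map[string]bool{
	"ls": true, "df": true, "ps": true, "grep": true,
	"cat": true, "uname": true, "free": true, "hostname": true,
}

// firstWords splits a command line on the shell separators | ; &
// and returns the first word of each non-empty segment.
func firstWords(command string) []string {
	segments := strings.FieldsFunc(command, func(r rune) bool {
		return r == '|' || r == ';' || r == '&'
	})
	words := make([]string, 0, len(segments))
	for _, seg := range segments {
		if fields := strings.Fields(seg); len(fields) > 0 {
			words = append(words, fields[0])
		}
	}
	return words
}

// isAllowed returns nil only if every segment starts with an allowlisted command.
func isAllowed(command string) error {
	words := firstWords(command)
	if len(words) == 0 {
		return fmt.Errorf("empty command")
	}
	for _, w := range words {
		if !allowed[w] {
			return fmt.Errorf("command %q is not on the allowlist", w)
		}
	}
	return nil
}

func main() {
	for _, c := range []string{"ps aux | grep nginx", "df -h && rm -rf /tmp/x", "ls; sudo reboot"} {
		if err := isAllowed(c); err != nil {
			fmt.Printf("rejected %q: %v\n", c, err)
		} else {
			fmt.Printf("accepted %q\n", c)
		}
	}
}
```

An allowlist fails closed: a command the author never thought of is rejected by default, whereas a blocklist accepts it.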
--- a/go.mod
+++ b/go.mod
@@ -0,0 +1,5 @@
+module nannyagentv2
+
+go 1.23
+
+require github.com/sashabaranov/go-openai v1.32.0
--- a/go.sum
+++ b/go.sum
@@ -0,0 +1,2 @@
+github.com/sashabaranov/go-openai v1.32.0 h1:Yk3iE9moX3RBXxrof3OBtUBrE7qZR0zF9ebsoO4zVzI=
+github.com/sashabaranov/go-openai v1.32.0/go.mod h1:lj5b/K+zjTSFxVLijLSTDZuP7adOgerWeFyZLUhAKRg=
--- a/install.sh
+++ b/install.sh
@@ -0,0 +1,85 @@
+#!/bin/bash
+
+# Linux Diagnostic Agent Installation Script
+# This script installs the nanny-agent on a Linux system
+
+set -e
+
+echo "🔧 Linux Diagnostic Agent Installation Script"
+echo "=============================================="
+
+# Check if Go is installed
+if ! command -v go &> /dev/null; then
+    echo "❌ Go is not installed. Please install Go first:"
+    echo ""
+    echo "For Ubuntu/Debian:"
+    echo "  sudo apt update && sudo apt install golang-go"
+    echo ""
+    echo "For RHEL/CentOS/Fedora:"
+    echo "  sudo dnf install golang"
+    echo "  # or"
+    echo "  sudo yum install golang"
+    echo ""
+    exit 1
+fi
+
+echo "✅ Go is installed: $(go version)"
+
+# Build the application
+echo "🔨 Building the application..."
+go mod tidy
+make build
+
+# Check if build was successful
+if [ ! -f "./nanny-agent" ]; then
+    echo "❌ Build failed! nanny-agent binary not found."
+    exit 1
+fi
+
+echo "✅ Build successful!"
+
+# Ask for installation preference
+echo ""
+echo "Installation options:"
+echo "1. Install system-wide (/usr/local/bin) - requires sudo"
+echo "2. Keep in current directory"
+echo ""
+read -p "Choose option (1 or 2): " choice
+
+case $choice in
+    1)
+        echo "📦 Installing system-wide..."
+        sudo cp nanny-agent /usr/local/bin/
+        sudo chmod +x /usr/local/bin/nanny-agent
+        echo "✅ Agent installed to /usr/local/bin/nanny-agent"
+        echo ""
+        echo "You can now run the agent from anywhere with:"
+        echo "  nanny-agent"
+        ;;
+    2)
+        echo "✅ Agent ready in current directory"
+        echo ""
+        echo "Run the agent with:"
+        echo "  ./nanny-agent"
+        ;;
+    *)
+        echo "❌ Invalid choice. Agent is available in current directory."
+        echo "Run with: ./nanny-agent"
+        ;;
+esac
+
+# Configuration
+echo ""
+echo "📝 Configuration:"
+echo "Set these environment variables to configure the agent:"
+echo ""
+echo "export NANNYAPI_ENDPOINT=\"http://your-nannyapi-host:3000/openai/v1\""
+echo "export NANNYAPI_MODEL=\"your-model-identifier\""
+echo ""
+echo "Or create a .env file in the working directory."
+echo ""
+echo "🎉 Installation complete!"
+echo ""
+echo "Example usage:"
+echo "  ./nanny-agent"
+echo "  > On /var filesystem I cannot create any file but df -h shows 30% free space available."
--- a/integration-tests.sh
+++ b/integration-tests.sh
@@ -0,0 +1,116 @@
+#!/bin/bash
+
+# Linux Diagnostic Agent - Integration Tests
+# This script creates realistic Linux problem scenarios for testing
+
+set -e
+
+AGENT_BINARY="./nanny-agent"
+TEST_DIR="/tmp/nanny-agent-tests"
+TEST_LOG="$TEST_DIR/integration_test.log"
+
+# Color codes for output
+RED='\033[0;31m'
+GREEN='\033[0;32m'
+YELLOW='\033[1;33m'
+BLUE='\033[0;34m'
+NC='\033[0m' # No Color
+
+# Ensure test directory exists
+mkdir -p "$TEST_DIR"
+
+echo -e "${BLUE}🧪 Linux Diagnostic Agent - Integration Tests${NC}"
+echo "================================================="
+echo ""
+
+# Check if agent binary exists
+if [[ ! -f "$AGENT_BINARY" ]]; then
+    echo -e "${RED}❌ Agent binary not found at $AGENT_BINARY${NC}"
+    echo "Please run: make build"
+    exit 1
+fi
+
+# Function to run a test scenario
+run_test() {
+    local test_name="$1"
+    local scenario="$2"
+    local expected_keywords="$3"
+    
+    echo -e "${YELLOW}📋 Test: $test_name${NC}"
+    echo "Scenario: $scenario"
+    echo ""
+    
+    # Run the agent with the scenario
+    echo "$scenario" | timeout 120s "$AGENT_BINARY" > "$TEST_LOG" 2>&1 || true
+    
+    # Check if any expected keywords are found in the output
+    local found_keywords=0
+    IFS=',' read -ra KEYWORDS <<< "$expected_keywords"
+    for keyword in "${KEYWORDS[@]}"; do
+        keyword=$(echo "$keyword" | xargs) # trim whitespace
+        if grep -qi "$keyword" "$TEST_LOG"; then
+            echo -e "${GREEN}  ✅ Found expected keyword: $keyword${NC}"
+            found_keywords=$((found_keywords + 1)) # ((var++)) returns status 1 when var is 0, which aborts under set -e
+        else
+            echo -e "${RED}  ❌ Missing keyword: $keyword${NC}"
+        fi
+    done
+    
+    # Show summary
+    if [[ $found_keywords -gt 0 ]]; then
+        echo -e "${GREEN}  ✅ Test PASSED ($found_keywords keywords found)${NC}"
+    else
+        echo -e "${RED}  ❌ Test FAILED (no expected keywords found)${NC}"
+    fi
+    
+    echo ""
+    echo "Full output saved to: $TEST_LOG"
+    echo "----------------------------------------"
+    echo ""
+}
+
+# Test Scenario 1: Disk Space Issues (Inode Exhaustion)
+run_test "Disk Space - Inode Exhaustion" \
+    "I cannot create new files in /home directory even though df -h shows plenty of space available. Getting 'No space left on device' error when trying to touch new files." \
+    "inode,df -i,filesystem,inodes,exhausted"
+
+# Test Scenario 2: Memory Issues
+run_test "Memory Issues - OOM Killer" \
+    "My applications keep getting killed randomly and I see 'killed' messages in logs. The system becomes unresponsive for a few seconds before recovering. This happens especially when running memory-intensive tasks." \
+    "memory,oom,killed,dmesg,free,swap"
+
+# Test Scenario 3: Network Connectivity Issues
+run_test "Network Connectivity - DNS Resolution" \
+    "I can ping IP addresses directly (like 8.8.8.8) but cannot resolve domain names. Web browsing fails with DNS resolution errors, but ping 8.8.8.8 works fine." \
+    "dns,resolv.conf,nslookup,nameserver,dig"
+
+# Test Scenario 4: Service/Process Issues
+run_test "Service Issues - High Load" \
+    "System load average is consistently above 10.0 even when CPU usage appears normal. Applications are responding slowly and I notice high wait times. The server feels sluggish overall." \
+    "load,average,cpu,iostat,vmstat,processes"
+
+# Test Scenario 5: File System Issues
+run_test "Filesystem Issues - Permission Problems" \
+    "Web server returns 403 Forbidden errors for all pages. Files exist and seem readable, but nginx logs show permission denied errors. SELinux is disabled and file permissions look correct." \
+    "permission,403,nginx,chmod,chown,selinux"
+
+# Test Scenario 6: Boot/System Issues
+run_test "Boot Issues - Kernel Module" \
+    "System boots but some hardware devices are not working. Network interface shows as down, USB devices are not recognized, and dmesg shows module loading failures." \
+    "module,lsmod,dmesg,hardware,interface,usb"
+
+# Test Scenario 7: Performance Issues
+run_test "Performance Issues - I/O Bottleneck" \
+    "Database queries are extremely slow, taking 30+ seconds for simple SELECT statements. Disk activity LED is constantly on and system feels unresponsive during database operations." \
+    "iostat,iotop,disk,database,slow,performance"
+
+echo -e "${BLUE}🏁 Integration Tests Complete${NC}"
+echo ""
+echo "Log of the most recent test: $TEST_LOG"
+echo ""
+echo -e "${YELLOW}💡 Tips:${NC}"
+echo "- Tests use realistic scenarios that could occur on production systems"
+echo "- Each test expects the AI to suggest relevant diagnostic commands"
+echo "- Review the full logs to see the complete diagnostic conversation"
+echo "- Tests timeout after 120 seconds to prevent hanging"
+echo "- Make sure NANNYAPI_ENDPOINT and NANNYAPI_MODEL are set correctly"
--- a/main.go
+++ b/main.go
@@ -0,0 +1,46 @@
+package main
+
+import (
+	"bufio"
+	"fmt"
+	"log"
+	"os"
+	"strings"
+)
+
+func main() {
+	// Initialize the agent
+	agent := NewLinuxDiagnosticAgent()
+
+	// Start the interactive session
+	fmt.Println("Linux Diagnostic Agent Started")
+	fmt.Println("Enter a system issue description (or 'quit' to exit):")
+
+	scanner := bufio.NewScanner(os.Stdin)
+	for {
+		fmt.Print("> ")
+		if !scanner.Scan() {
+			break
+		}
+
+		input := strings.TrimSpace(scanner.Text())
+		if input == "quit" || input == "exit" {
+			break
+		}
+
+		if input == "" {
+			continue
+		}
+
+		// Process the issue
+		if err := agent.DiagnoseIssue(input); err != nil {
+			fmt.Printf("Error: %v\n", err)
+		}
+	}
+
+	if err := scanner.Err(); err != nil {
+		log.Fatal(err)
+	}
+
+	fmt.Println("Goodbye!")
+}
--- a/system_info.go
+++ b/system_info.go
@@ -0,0 +1,154 @@
+package main
+
+import (
+	"fmt"
+	"net"
+	"runtime"
+	"strings"
+	"time"
+)
+
+// SystemInfo represents basic system information
+type SystemInfo struct {
+	Hostname     string `json:"hostname"`
+	OS           string `json:"os"`
+	Kernel       string `json:"kernel"`
+	Architecture string `json:"architecture"`
+	CPUCores     string `json:"cpu_cores"`
+	Memory       string `json:"memory"`
+	Uptime       string `json:"uptime"`
+	PrivateIPs   string `json:"private_ips"`
+	LoadAverage  string `json:"load_average"`
+	DiskUsage    string `json:"disk_usage"`
+}
+
+// GatherSystemInfo collects basic system information; a field is left empty if its command fails
+func GatherSystemInfo() *SystemInfo {
+	info := &SystemInfo{}
+	executor := NewCommandExecutor(5 * time.Second)
+
+	// Basic system info
+	if result := executor.Execute(Command{ID: "hostname", Command: "hostname"}); result.ExitCode == 0 {
+		info.Hostname = strings.TrimSpace(result.Output)
+	}
+
+	if result := executor.Execute(Command{ID: "os", Command: "(lsb_release -ds 2>/dev/null || grep '^PRETTY_NAME=' /etc/os-release | cut -d'=' -f2) | tr -d '\"'"}); result.ExitCode == 0 {
+		info.OS = strings.TrimSpace(result.Output)
+	}
+
+	if result := executor.Execute(Command{ID: "kernel", Command: "uname -r"}); result.ExitCode == 0 {
+		info.Kernel = strings.TrimSpace(result.Output)
+	}
+
+	if result := executor.Execute(Command{ID: "arch", Command: "uname -m"}); result.ExitCode == 0 {
+		info.Architecture = strings.TrimSpace(result.Output)
+	}
+
+	if result := executor.Execute(Command{ID: "cores", Command: "nproc"}); result.ExitCode == 0 {
+		info.CPUCores = strings.TrimSpace(result.Output)
+	}
+
+	if result := executor.Execute(Command{ID: "memory", Command: "free -h | grep Mem | awk '{print $2}'"}); result.ExitCode == 0 {
+		info.Memory = strings.TrimSpace(result.Output)
+	}
+
+	if result := executor.Execute(Command{ID: "uptime", Command: "uptime -p"}); result.ExitCode == 0 {
+		info.Uptime = strings.TrimSpace(result.Output)
+	}
+
+	if result := executor.Execute(Command{ID: "load", Command: "uptime | awk -F'load average:' '{print $2}' | xargs"}); result.ExitCode == 0 {
+		info.LoadAverage = strings.TrimSpace(result.Output)
+	}
+
+	if result := executor.Execute(Command{ID: "disk", Command: "df -h / | tail -1 | awk '{print \"Root: \" $3 \"/\" $2 \" (\" $5 \" used)\"}'"}); result.ExitCode == 0 {
+		info.DiskUsage = strings.TrimSpace(result.Output)
+	}
+
+	// Get private IP addresses
+	info.PrivateIPs = getPrivateIPs()
+
+	return info
+}
+
+// getPrivateIPs returns the private IPv4 addresses of all up, non-loopback interfaces as a comma-separated string
+func getPrivateIPs() string {
+	var privateIPs []string
+
+	interfaces, err := net.Interfaces()
+	if err != nil {
+		return "Unable to determine"
+	}
+
+	for _, iface := range interfaces {
+		if iface.Flags&net.FlagUp == 0 || iface.Flags&net.FlagLoopback != 0 {
+			continue // Skip down or loopback interfaces
+		}
+
+		addrs, err := iface.Addrs()
+		if err != nil {
+			continue
+		}
+
+		for _, addr := range addrs {
+			if ipnet, ok := addr.(*net.IPNet); ok && !ipnet.IP.IsLoopback() {
+				if isPrivateIP(ipnet.IP) {
+					privateIPs = append(privateIPs, fmt.Sprintf("%s (%s)", ipnet.IP.String(), iface.Name))
+				}
+			}
+		}
+	}
+
+	if len(privateIPs) == 0 {
+		return "No private IPs found"
+	}
+
+	return strings.Join(privateIPs, ", ")
+}
+
+// isPrivateIP reports whether ip falls within an RFC 1918 private IPv4 range
+func isPrivateIP(ip net.IP) bool {
+	// RFC 1918 private address ranges
+	private := []string{
+		"10.0.0.0/8",
+		"172.16.0.0/12",
+		"192.168.0.0/16",
+	}
+
+	for _, cidr := range private {
+		_, subnet, _ := net.ParseCIDR(cidr)
+		if subnet.Contains(ip) {
+			return true
+		}
+	}
+
+	return false
+}
+
+// FormatSystemInfoForPrompt formats system information for inclusion in diagnostic prompts
+func FormatSystemInfoForPrompt(info *SystemInfo) string {
+	return fmt.Sprintf(`SYSTEM INFORMATION:
+- Hostname: %s
+- Operating System: %s
+- Kernel Version: %s
+- Architecture: %s
+- CPU Cores: %s
+- Total Memory: %s
+- System Uptime: %s
+- Current Load Average: %s
+- Root Disk Usage: %s
+- Private IP Addresses: %s
+- Go Runtime: %s
+
+ISSUE DESCRIPTION:`,
+		info.Hostname,
+		info.OS,
+		info.Kernel,
+		info.Architecture,
+		info.CPUCores,
+		info.Memory,
+		info.Uptime,
+		info.LoadAverage,
+		info.DiskUsage,
+		info.PrivateIPs,
+		runtime.Version())
+}
--- a/test-examples.sh
+++ b/test-examples.sh
@@ -0,0 +1,82 @@
+#!/bin/bash
+
+# Linux Diagnostic Agent - Test Scenarios
+# Realistic Linux problems for testing the diagnostic agent
+
+echo "🔧 Linux Diagnostic Agent - Test Scenarios"
+echo "==========================================="
+echo ""
+
+echo "📚 Available test scenarios (copy-paste into the agent):"
+echo ""
+
+echo "1. 💾 DISK SPACE ISSUES (Inode Exhaustion):"
+echo "────────────────────────────────────────────"
+echo "I cannot create new files in /home directory even though df -h shows plenty of space available. Getting 'No space left on device' error when trying to touch new files."
+echo ""
+
+echo "2. 🧠 MEMORY ISSUES (OOM Killer):"
+echo "─────────────────────────────────"
+echo "My applications keep getting killed randomly and I see 'killed' messages in logs. The system becomes unresponsive for a few seconds before recovering. This happens especially when running memory-intensive tasks."
+echo ""
+
+echo "3. 🌐 NETWORK CONNECTIVITY (DNS Resolution):"
+echo "─────────────────────────────────────────────"
+echo "I can ping IP addresses directly (like 8.8.8.8) but cannot resolve domain names. Web browsing fails with DNS resolution errors, but ping 8.8.8.8 works fine."
+echo ""
+
+echo "4. ⚡ PERFORMANCE ISSUES (High Load):"
+echo "───────────────────────────────────"
+echo "System load average is consistently above 10.0 even when CPU usage appears normal. Applications are responding slowly and I notice high wait times. The server feels sluggish overall."
+echo ""
+
+echo "5. 🚫 WEB SERVER ISSUES (Permission Problems):"
+echo "──────────────────────────────────────────────"
+echo "Web server returns 403 Forbidden errors for all pages. Files exist and seem readable, but nginx logs show permission denied errors. SELinux is disabled and file permissions look correct."
+echo ""
+
+echo "6. 🖥️  HARDWARE/BOOT ISSUES (Kernel Module):"
+echo "─────────────────────────────────────────────"
+echo "System boots but some hardware devices are not working. Network interface shows as down, USB devices are not recognized, and dmesg shows module loading failures."
+echo ""
+
+echo "7. 🐌 DATABASE PERFORMANCE (I/O Bottleneck):"
+echo "─────────────────────────────────────────────"
+echo "Database queries are extremely slow, taking 30+ seconds for simple SELECT statements. Disk activity LED is constantly on and system feels unresponsive during database operations."
+echo ""
+
+echo "8. 🔥 HIGH CPU USAGE (Process Analysis):"
+echo "────────────────────────────────────────"
+echo "System is running slow and CPU usage is constantly at 100%. top shows high CPU usage but I can't identify which specific process or thread is causing the issue."
+echo ""
+
+echo "9. 📁 FILE SYSTEM CORRUPTION:"
+echo "────────────────────────────"
+echo "Getting 'Input/output error' when accessing certain files and directories. Some files appear corrupted and applications crash when trying to read specific data files."
+echo ""
+
+echo "10. 🔌 SERVICE STARTUP FAILURES:"
+echo "───────────────────────────────"
+echo "Critical services fail to start after system reboot. systemctl shows services in failed state but error messages are unclear. System appears to boot normally otherwise."
+echo ""
+
+echo "🚀 Quick Start:"
+echo "──────────────"
+echo "1. Run: ./nanny-agent"
+echo "2. Copy-paste any scenario above when prompted"
+echo "3. Watch the AI diagnose the problem step by step"
+echo ""
+
+echo "🧪 Automated Testing:"
+echo "────────────────────"
+echo "Run integration tests: ./integration-tests.sh"
+echo "This will run the scripted integration scenarios automatically"
+echo ""
+
+echo "💡 Pro Tips:"
+echo "───────────"
+echo "- Each scenario is based on real-world Linux issues"
+echo "- The AI will gather system info automatically"
+echo "- Diagnostic commands are executed safely (read-only)"
+echo "- You'll get a detailed resolution plan at the end"
+echo "- Set NANNYAPI_ENDPOINT and NANNYAPI_MODEL before running"