Initial Commit
0
Dockerfile
Normal file
53
Makefile
Normal file
@@ -0,0 +1,53 @@
.PHONY: build run clean test install build-prod install-system fmt lint help

# Build the application
build:
	go build -o nanny-agent .

# Run the application
run: build
	./nanny-agent

# Clean build artifacts
clean:
	rm -f nanny-agent

# Run tests
test:
	go test ./...

# Install dependencies
install:
	go mod tidy
	go mod download

# Build for production with optimizations
build-prod:
	CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -ldflags '-w -s' -o nanny-agent .

# Install system-wide (requires sudo)
install-system: build-prod
	sudo cp nanny-agent /usr/local/bin/
	sudo chmod +x /usr/local/bin/nanny-agent

# Format code
fmt:
	go fmt ./...

# Run linter (if golangci-lint is installed)
lint:
	golangci-lint run

# Show help
help:
	@echo "Available commands:"
	@echo "  build          - Build the application"
	@echo "  run            - Build and run the application"
	@echo "  clean          - Clean build artifacts"
	@echo "  test           - Run tests"
	@echo "  install        - Install dependencies"
	@echo "  build-prod     - Build for production"
	@echo "  install-system - Install system-wide (requires sudo)"
	@echo "  fmt            - Format code"
	@echo "  lint           - Run linter"
	@echo "  help           - Show this help"
200
README.md
@@ -1,3 +1,199 @@
-# nannyagent
+# Linux Diagnostic Agent

-nannyagent is a Linux AI diagnostic agent built on OpenAPI specifications relying on Tensorzero gateway
+A Go-based AI agent that diagnoses Linux system issues using the NannyAPI gateway with OpenAI-compatible SDK.

## Features

- Interactive command-line interface for submitting system issues
- **Automatic system information gathering** - Includes OS, kernel, CPU, memory, network info
- Integrates with NannyAPI using OpenAI-compatible Go SDK
- Executes diagnostic commands safely and collects output
- Provides step-by-step resolution plans
- **Comprehensive integration tests** with realistic Linux problem scenarios

## Setup

1. Clone this repository
2. Copy `.env.example` to `.env` and configure your NannyAPI endpoint:
   ```bash
   cp .env.example .env
   ```
3. Install dependencies:
   ```bash
   go mod tidy
   ```
4. Build and run:
   ```bash
   make build
   ./nanny-agent
   ```

## Configuration

The agent can be configured using environment variables:

- `NANNYAPI_ENDPOINT`: The NannyAPI endpoint (default: `http://nannyapi.local:3000/openai/v1`)
- `NANNYAPI_MODEL`: The model identifier (default: `nannyapi::function_name::diagnose_and_heal`)

## Installation on Linux VM

### Direct Installation

1. **Install Go** (if not already installed):
   ```bash
   # For Ubuntu/Debian
   sudo apt update
   sudo apt install golang-go

   # For RHEL/CentOS/Fedora
   sudo dnf install golang
   # or
   sudo yum install golang
   ```

2. **Clone and build the agent**:
   ```bash
   git clone <your-repo-url>
   cd nannyagentv2
   go mod tidy
   make build
   ```

3. **Install system-wide** (optional):
   ```bash
   sudo cp nanny-agent /usr/local/bin/
   sudo chmod +x /usr/local/bin/nanny-agent
   ```

4. **Set environment variables**:
   ```bash
   export NANNYAPI_ENDPOINT="http://your-nannyapi-endpoint:3000/openai/v1"
   export NANNYAPI_MODEL="your-model-identifier"
   ```

## Usage

1. Start the agent:
   ```bash
   ./nanny-agent
   ```

2. Enter a system issue description when prompted:
   ```
   > On /var filesystem I cannot create any file but df -h shows 30% free space available.
   ```

3. The agent will:
   - Send the issue to the AI via NannyAPI using OpenAI SDK
   - Execute diagnostic commands as suggested by the AI
   - Provide command outputs back to the AI
   - Display the final diagnosis and resolution plan

4. Type `quit` or `exit` to stop the agent

## How It Works

1. **System Information Gathering**: Agent automatically collects system details (OS, kernel, CPU, memory, network, etc.)
2. **Initial Issue**: User describes a Linux system problem
3. **Enhanced Prompt**: AI receives both the issue description and comprehensive system information
4. **Diagnostic Phase**: AI responds with diagnostic commands to run
5. **Command Execution**: Agent safely executes read-only commands
6. **Iterative Analysis**: AI analyzes command outputs and may request more commands
7. **Resolution Phase**: AI provides root cause analysis and step-by-step resolution plan
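
The switch between the diagnostic and resolution phases described above hinges on the `response_type` field of each AI reply. A minimal sketch of that dispatch, assuming replies are JSON objects carrying that field, might look like this. The `envelope` type and `classify` function are hypothetical names used only for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// envelope reads only the discriminator field shared by both response shapes.
type envelope struct {
	ResponseType string `json:"response_type"`
}

// classify reports which phase a raw AI reply belongs to:
// "diagnostic", "resolution", or "unknown" for anything else,
// including replies that are not valid JSON.
func classify(raw string) string {
	var env envelope
	if err := json.Unmarshal([]byte(raw), &env); err != nil {
		return "unknown"
	}
	switch env.ResponseType {
	case "diagnostic", "resolution":
		return env.ResponseType
	default:
		return "unknown"
	}
}

func main() {
	fmt.Println(classify(`{"response_type":"diagnostic","commands":[]}`))
	fmt.Println(classify(`{"response_type":"resolution","root_cause":"inode exhaustion"}`))
	fmt.Println(classify(`plain text reply`))
}
```

Decoding the discriminator first keeps the loop explicit: a "diagnostic" reply leads to another round of command execution, a "resolution" reply ends the session, and anything else is reported as an unexpected format.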

## Testing & Integration Tests

The agent includes comprehensive integration tests that simulate realistic Linux problems:

### Available Test Scenarios:
1. **Disk Space Issues** - Inode exhaustion scenarios
2. **Memory Problems** - OOM killer and memory pressure
3. **Network Issues** - DNS resolution problems
4. **Performance Issues** - High load averages and I/O bottlenecks
5. **Web Server Problems** - Permission and configuration issues
6. **Hardware/Boot Issues** - Kernel module and device problems
7. **Database Performance** - Slow queries and I/O contention
8. **Service Failures** - Startup and configuration problems

### Run Integration Tests:
```bash
# Interactive test scenarios
./test-examples.sh

# Automated integration tests
./integration-tests.sh

# Function discovery (find valid NannyAPI functions)
./discover-functions.sh
```

## Safety

- Only read-only commands are executed automatically
- Commands that modify the system (rm, mv, dd, redirection) are blocked by validation
- The resolution plan is provided for manual execution by the operator
- All commands have execution timeouts to prevent hanging
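
One way to think about the blocking rule above is as a deny-list applied to the command name of each pipeline segment. The sketch below is an illustration under stated assumptions, not the agent's implementation: the names `denied`, `firstWords`, and `isDenied` are hypothetical, the deny-list is a small subset, and the splitting ignores quoting, subshells, `$(...)`, and absolute paths such as `/bin/rm`.

```go
package main

import (
	"fmt"
	"strings"
)

// denied lists command names that should never run automatically.
// The set here is a small illustrative subset, not the agent's full list.
var denied = map[string]bool{"rm": true, "mv": true, "dd": true, "halt": true}

// firstWords splits a shell line on |, ; and & and returns the
// lowercased first word of each resulting segment.
func firstWords(line string) []string {
	segments := strings.FieldsFunc(line, func(r rune) bool {
		return r == '|' || r == ';' || r == '&'
	})
	var words []string
	for _, seg := range segments {
		fields := strings.Fields(seg)
		if len(fields) > 0 {
			words = append(words, strings.ToLower(fields[0]))
		}
	}
	return words
}

// isDenied reports whether any segment starts with a denied command name.
func isDenied(line string) bool {
	for _, w := range firstWords(line) {
		if denied[w] {
			return true
		}
	}
	return false
}

func main() {
	for _, line := range []string{"ls -la /var", "rm -rf /tmp/x", "ps aux | grep halt", "ls /var | rm -rf x"} {
		fmt.Printf("%-22q denied=%v\n", line, isDenied(line))
	}
}
```

Matching on command names rather than on substrings avoids rejecting harmless lines such as `ps aux | grep halt`, at the cost of needing more careful parsing before it could be relied on for real input.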

## API Integration

The agent uses the `github.com/sashabaranov/go-openai` SDK to communicate with NannyAPI's OpenAI-compatible API endpoint. This provides:

- Robust HTTP client with retries and timeouts
- Structured request/response handling
- Automatic JSON marshaling/unmarshaling
- Error handling and validation

## Example Session

```
Linux Diagnostic Agent Started
Enter a system issue description (or 'quit' to exit):
> Cannot create files in /var but df shows space available

Diagnosing issue: Cannot create files in /var but df shows space available
Gathering system information...

AI Response:
{
  "response_type": "diagnostic",
  "reasoning": "The 'No space left on device' error despite available disk space suggests inode exhaustion...",
  "commands": [
    {"id": "check_inodes", "command": "df -i /var", "description": "Check inode usage..."}
  ]
}

Executing command 'check_inodes': df -i /var
Output:
Filesystem      Inodes   IUsed IFree IUse% Mounted on
/dev/sda1      1000000  999999     1  100% /var

=== DIAGNOSIS COMPLETE ===
Root Cause: The /var filesystem has exhausted all available inodes
Resolution Plan: 1. Find and remove unnecessary files...
Confidence: High
```

Note: The AI receives comprehensive system information including:
- Hostname, OS version, kernel version
- CPU cores, memory, system uptime
- Network interfaces and private IPs
- Current load average and disk usage

## Available Make Commands

- `make build` - Build the application
- `make run` - Build and run the application
- `make clean` - Clean build artifacts
- `make test` - Run unit tests
- `make install` - Install dependencies
- `make build-prod` - Build for production
- `make install-system` - Install system-wide (requires sudo)
- `make fmt` - Format code
- `make lint` - Run linter (requires golangci-lint)
- `make help` - Show available commands

## Testing Commands

- `./test-examples.sh` - Show interactive test scenarios
- `./integration-tests.sh` - Run automated integration tests
- `./discover-functions.sh` - Find available NannyAPI functions
- `./install.sh` - Installation script for Linux VMs
270
agent.go
Normal file
@@ -0,0 +1,270 @@
package main

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"os"
	"time"

	"github.com/sashabaranov/go-openai"
)

// DiagnosticResponse represents the diagnostic phase response from AI
type DiagnosticResponse struct {
	ResponseType string    `json:"response_type"`
	Reasoning    string    `json:"reasoning"`
	Commands     []Command `json:"commands"`
}

// ResolutionResponse represents the resolution phase response from AI
type ResolutionResponse struct {
	ResponseType   string `json:"response_type"`
	RootCause      string `json:"root_cause"`
	ResolutionPlan string `json:"resolution_plan"`
	Confidence     string `json:"confidence"`
}

// Command represents a command to be executed
type Command struct {
	ID          string `json:"id"`
	Command     string `json:"command"`
	Description string `json:"description"`
}

// CommandResult represents the result of executing a command
type CommandResult struct {
	ID       string `json:"id"`
	Command  string `json:"command"`
	Output   string `json:"output"`
	ExitCode int    `json:"exit_code"`
	Error    string `json:"error,omitempty"`
}

// LinuxDiagnosticAgent represents the main agent
type LinuxDiagnosticAgent struct {
	client    *openai.Client
	model     string
	executor  *CommandExecutor
	episodeID string // TensorZero episode ID for conversation continuity
}

// NewLinuxDiagnosticAgent creates a new diagnostic agent
func NewLinuxDiagnosticAgent() *LinuxDiagnosticAgent {
	endpoint := os.Getenv("NANNYAPI_ENDPOINT")
	if endpoint == "" {
		// Default endpoint - OpenAI SDK will append /chat/completions automatically
		endpoint = "http://nannyapi.local:3000/openai/v1"
	}

	model := os.Getenv("NANNYAPI_MODEL")
	if model == "" {
		model = "nannyapi::function_name::diagnose_and_heal"
		fmt.Printf("Warning: Using default model '%s'. Set NANNYAPI_MODEL environment variable for your specific function.\n", model)
	}

	// Create OpenAI client with custom base URL
	// Note: The OpenAI SDK automatically appends "/chat/completions" to the base URL
	config := openai.DefaultConfig("")
	config.BaseURL = endpoint
	client := openai.NewClientWithConfig(config)

	return &LinuxDiagnosticAgent{
		client:   client,
		model:    model,
		executor: NewCommandExecutor(10 * time.Second), // 10 second timeout for commands
	}
}

// DiagnoseIssue starts the diagnostic process for a given issue
func (a *LinuxDiagnosticAgent) DiagnoseIssue(issue string) error {
	fmt.Printf("Diagnosing issue: %s\n", issue)
	fmt.Println("Gathering system information...")

	// Gather system information
	systemInfo := GatherSystemInfo()

	// Format the initial prompt with system information
	initialPrompt := FormatSystemInfoForPrompt(systemInfo) + "\n" + issue

	// Start conversation with initial issue including system info
	messages := []openai.ChatCompletionMessage{
		{
			Role:    openai.ChatMessageRoleUser,
			Content: initialPrompt,
		},
	}

	for {
		// Send request to TensorZero API via OpenAI SDK
		response, err := a.sendRequest(messages)
		if err != nil {
			return fmt.Errorf("failed to send request: %w", err)
		}

		if len(response.Choices) == 0 {
			return fmt.Errorf("no choices in response")
		}

		content := response.Choices[0].Message.Content
		fmt.Printf("\nAI Response:\n%s\n", content)

		// Parse the response to determine next action
		var diagnosticResp DiagnosticResponse
		var resolutionResp ResolutionResponse

		// Try to parse as diagnostic response first
		if err := json.Unmarshal([]byte(content), &diagnosticResp); err == nil && diagnosticResp.ResponseType == "diagnostic" {
			// Handle diagnostic phase
			fmt.Printf("\nReasoning: %s\n", diagnosticResp.Reasoning)

			if len(diagnosticResp.Commands) == 0 {
				fmt.Println("No commands to execute in diagnostic phase")
				break
			}

			// Execute commands and collect results
			commandResults := make([]CommandResult, 0, len(diagnosticResp.Commands))
			for _, cmd := range diagnosticResp.Commands {
				fmt.Printf("\nExecuting command '%s': %s\n", cmd.ID, cmd.Command)
				result := a.executor.Execute(cmd)
				commandResults = append(commandResults, result)

				fmt.Printf("Output:\n%s\n", result.Output)
				if result.Error != "" {
					fmt.Printf("Error: %s\n", result.Error)
				}
			}

			// Prepare command results as user message
			resultsJSON, err := json.MarshalIndent(commandResults, "", " ")
			if err != nil {
				return fmt.Errorf("failed to marshal command results: %w", err)
			}

			// Add AI response and command results to conversation
			messages = append(messages, openai.ChatCompletionMessage{
				Role:    openai.ChatMessageRoleAssistant,
				Content: content,
			})
			messages = append(messages, openai.ChatCompletionMessage{
				Role:    openai.ChatMessageRoleUser,
				Content: string(resultsJSON),
			})

			continue
		}

		// Try to parse as resolution response
		if err := json.Unmarshal([]byte(content), &resolutionResp); err == nil && resolutionResp.ResponseType == "resolution" {
			// Handle resolution phase
			fmt.Printf("\n=== DIAGNOSIS COMPLETE ===\n")
			fmt.Printf("Root Cause: %s\n", resolutionResp.RootCause)
			fmt.Printf("Resolution Plan: %s\n", resolutionResp.ResolutionPlan)
			fmt.Printf("Confidence: %s\n", resolutionResp.Confidence)
			break
		}

		// If we can't parse the response, treat it as an error or unexpected format
		fmt.Printf("Unexpected response format or error from AI:\n%s\n", content)
		break
	}

	return nil
}

// TensorZeroRequest represents a request structure compatible with TensorZero's episode_id
type TensorZeroRequest struct {
	Model     string                         `json:"model"`
	Messages  []openai.ChatCompletionMessage `json:"messages"`
	EpisodeID string                         `json:"tensorzero::episode_id,omitempty"`
}

// TensorZeroResponse represents TensorZero's response with episode_id
type TensorZeroResponse struct {
	openai.ChatCompletionResponse
	EpisodeID string `json:"episode_id"`
}

// sendRequest sends a request to the TensorZero API with tensorzero::episode_id support
func (a *LinuxDiagnosticAgent) sendRequest(messages []openai.ChatCompletionMessage) (*openai.ChatCompletionResponse, error) {
	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
	defer cancel()

	// Create TensorZero-compatible request
	tzRequest := TensorZeroRequest{
		Model:    a.model,
		Messages: messages,
	}

	// Include tensorzero::episode_id for conversation continuity (if we have one)
	if a.episodeID != "" {
		tzRequest.EpisodeID = a.episodeID
	}

	fmt.Printf("Debug: Sending request to model: %s", a.model)
	if a.episodeID != "" {
		fmt.Printf(" (episode: %s)", a.episodeID)
	}
	fmt.Println()

	// Marshal the request
	requestBody, err := json.Marshal(tzRequest)
	if err != nil {
		return nil, fmt.Errorf("failed to marshal request: %w", err)
	}

	// Create HTTP request
	endpoint := os.Getenv("NANNYAPI_ENDPOINT")
	if endpoint == "" {
		endpoint = "http://nannyapi.local:3000/openai/v1"
	}

	// Ensure the endpoint ends with /chat/completions
	if endpoint[len(endpoint)-1] != '/' {
		endpoint += "/"
	}
	endpoint += "chat/completions"

	req, err := http.NewRequestWithContext(ctx, "POST", endpoint, bytes.NewBuffer(requestBody))
	if err != nil {
		return nil, fmt.Errorf("failed to create request: %w", err)
	}

	req.Header.Set("Content-Type", "application/json")

	// Make the request
	client := &http.Client{Timeout: 30 * time.Second}
	resp, err := client.Do(req)
	if err != nil {
		return nil, fmt.Errorf("failed to send request: %w", err)
	}
	defer resp.Body.Close()

	// Read response body
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, fmt.Errorf("failed to read response: %w", err)
	}

	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("API request failed with status %d: %s", resp.StatusCode, string(body))
	}

	// Parse TensorZero response
	var tzResponse TensorZeroResponse
	if err := json.Unmarshal(body, &tzResponse); err != nil {
		return nil, fmt.Errorf("failed to unmarshal response: %w", err)
	}

	// Extract episode_id from first response
	if a.episodeID == "" && tzResponse.EpisodeID != "" {
		a.episodeID = tzResponse.EpisodeID
		fmt.Printf("Debug: Extracted episode ID: %s\n", a.episodeID)
	}

	return &tzResponse.ChatCompletionResponse, nil
}
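The `TensorZeroRequest` struct in agent.go relies on `omitempty` so that the first request carries no episode ID and later requests do. A small sketch of that wire behaviour follows, using stand-in types so that it needs no third-party imports. The names `message`, `request`, and `encode`, the model name `m`, and the ID `ep-123` are hypothetical and chosen only for this illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// message is a stand-in for openai.ChatCompletionMessage so the
// sketch has no third-party imports.
type message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

// request mirrors the shape of TensorZeroRequest in agent.go.
type request struct {
	Model     string    `json:"model"`
	Messages  []message `json:"messages"`
	EpisodeID string    `json:"tensorzero::episode_id,omitempty"`
}

// encode marshals a request and returns the JSON as a string.
func encode(r request) (string, error) {
	b, err := json.Marshal(r)
	return string(b), err
}

func main() {
	msgs := []message{{Role: "user", Content: "test"}}

	// First turn: no episode ID yet, so the key is omitted entirely.
	first, _ := encode(request{Model: "m", Messages: msgs})
	fmt.Println(first)

	// Later turns: the stored episode ID is sent back to the gateway.
	next, _ := encode(request{Model: "m", Messages: msgs, EpisodeID: "ep-123"})
	fmt.Println(next)
}
```

Because the key is dropped rather than sent as an empty string, the first request looks like an ordinary OpenAI-style chat completion body.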
107
agent_test.go
Normal file
@@ -0,0 +1,107 @@
package main

import (
	"testing"
	"time"
)

func TestCommandExecutor_ValidateCommand(t *testing.T) {
	executor := NewCommandExecutor(5 * time.Second)

	tests := []struct {
		name    string
		command string
		wantErr bool
	}{
		{
			name:    "safe command - ls",
			command: "ls -la /var",
			wantErr: false,
		},
		{
			name:    "safe command - df",
			command: "df -h",
			wantErr: false,
		},
		{
			name:    "safe command - ps",
			command: "ps aux | grep nginx",
			wantErr: false,
		},
		{
			name:    "dangerous command - rm",
			command: "rm -rf /tmp/*",
			wantErr: true,
		},
		{
			name:    "dangerous command - dd",
			command: "dd if=/dev/zero of=/dev/sda",
			wantErr: true,
		},
		{
			name:    "dangerous command - sudo",
			command: "sudo systemctl stop nginx",
			wantErr: true,
		},
		{
			name:    "dangerous command - redirection",
			command: "echo 'test' > /etc/passwd",
			wantErr: true,
		},
	}

	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			err := executor.validateCommand(tt.command)
			if (err != nil) != tt.wantErr {
				t.Errorf("validateCommand() error = %v, wantErr %v", err, tt.wantErr)
			}
		})
	}
}

func TestCommandExecutor_Execute(t *testing.T) {
	executor := NewCommandExecutor(5 * time.Second)

	// Test safe command execution
	cmd := Command{
		ID:          "test_echo",
		Command:     "echo 'Hello, World!'",
		Description: "Test echo command",
	}

	result := executor.Execute(cmd)

	if result.ExitCode != 0 {
		t.Errorf("Expected exit code 0, got %d", result.ExitCode)
	}

	if result.Output != "Hello, World!\n" {
		t.Errorf("Expected 'Hello, World!\\n', got '%s'", result.Output)
	}

	if result.Error != "" {
		t.Errorf("Expected no error, got '%s'", result.Error)
	}
}

func TestCommandExecutor_ExecuteUnsafeCommand(t *testing.T) {
	executor := NewCommandExecutor(5 * time.Second)

	// Test unsafe command rejection
	cmd := Command{
		ID:          "test_rm",
		Command:     "rm -rf /tmp/test",
		Description: "Dangerous rm command",
	}

	result := executor.Execute(cmd)

	if result.ExitCode != 1 {
		t.Errorf("Expected exit code 1 for unsafe command, got %d", result.ExitCode)
	}

	if result.Error == "" {
		t.Error("Expected error for unsafe command, got none")
	}
}
51
discover-functions.sh
Executable file
@@ -0,0 +1,51 @@
#!/bin/bash

# NannyAPI Function Discovery Script
# This script helps you find the correct function name for your NannyAPI setup

echo "🔍 NannyAPI Function Discovery"
echo "=============================="
echo ""

ENDPOINT="${NANNYAPI_ENDPOINT:-http://nannyapi.local:3000/openai/v1}"

echo "Testing endpoint: $ENDPOINT/chat/completions"
echo ""

# Test common function name patterns
test_functions=(
    "nannyapi::function_name::diagnose"
    "nannyapi::function_name::diagnose_and_heal"
    "nannyapi::function_name::linux_diagnostic"
    "nannyapi::function_name::system_diagnostic"
    "nannyapi::model_name::gpt-4"
    "nannyapi::model_name::claude"
)

for func in "${test_functions[@]}"; do
    echo "Testing function: $func"

    response=$(curl -s -X POST "$ENDPOINT/chat/completions" \
        -H "Content-Type: application/json" \
        -d "{\"model\":\"$func\",\"messages\":[{\"role\":\"user\",\"content\":\"test\"}]}")

    if echo "$response" | grep -q "Unknown function"; then
        echo " ❌ Function not found"
    elif echo "$response" | grep -q "error"; then
        echo " ⚠️ Error: $(echo "$response" | jq -r '.error' 2>/dev/null || echo "$response")"
    else
        echo " ✅ Function exists and responding!"
        echo " Use this in your environment: export NANNYAPI_MODEL=\"$func\""
    fi
    echo ""
done

echo "💡 If none of the above work, check your NannyAPI configuration file"
echo " for the correct function names and update NANNYAPI_MODEL accordingly."
echo ""
echo "Example NannyAPI config snippet:"
# Single quotes here: inside double quotes, backticks would start a command substitution
echo '```yaml'
echo "functions:"
echo " diagnose_and_heal: # This becomes 'nannyapi::function_name::diagnose_and_heal'"
echo " # function definition"
echo '```'
108
executor.go
Normal file
@@ -0,0 +1,108 @@
package main

import (
	"context"
	"fmt"
	"os/exec"
	"strings"
	"time"
)

// CommandExecutor handles safe execution of diagnostic commands
type CommandExecutor struct {
	timeout time.Duration
}

// NewCommandExecutor creates a new command executor with specified timeout
func NewCommandExecutor(timeout time.Duration) *CommandExecutor {
	return &CommandExecutor{
		timeout: timeout,
	}
}

// Execute executes a command safely with timeout and validation
func (ce *CommandExecutor) Execute(cmd Command) CommandResult {
	result := CommandResult{
		ID:      cmd.ID,
		Command: cmd.Command,
	}

	// Validate command safety
	if err := ce.validateCommand(cmd.Command); err != nil {
		result.Error = fmt.Sprintf("unsafe command: %s", err.Error())
		result.ExitCode = 1
		return result
	}

	// Create context with timeout
	ctx, cancel := context.WithTimeout(context.Background(), ce.timeout)
	defer cancel()

	// Execute command using shell for proper handling of pipes, redirects, etc.
	execCmd := exec.CommandContext(ctx, "/bin/bash", "-c", cmd.Command)

	output, err := execCmd.CombinedOutput()
	result.Output = string(output)

	if err != nil {
		result.Error = err.Error()
		if exitError, ok := err.(*exec.ExitError); ok {
			result.ExitCode = exitError.ExitCode()
		} else {
			result.ExitCode = 1
		}
	} else {
		result.ExitCode = 0
	}

	return result
}

// validateCommand checks if a command is safe to execute
func (ce *CommandExecutor) validateCommand(command string) error {
	// Convert to lowercase for case-insensitive checking
	cmd := strings.ToLower(strings.TrimSpace(command))

	// List of dangerous commands/patterns.
	// Patterns must be lowercase because cmd has been lowercased above.
	dangerousPatterns := []string{
		"rm ", "rm\t", "rm\n",
		"mv ", "mv\t", "mv\n",
		"dd ", "dd\t", "dd\n",
		"mkfs", "fdisk", "parted",
		"shutdown", "reboot", "halt", "poweroff",
		"passwd", "userdel", "usermod",
		"chmod", "chown", "chgrp",
		"systemctl stop", "systemctl disable", "systemctl mask",
		"service stop", "service disable",
		"kill ", "killall", "pkill",
		"crontab -r", "crontab -e",
		"iptables -f", "iptables -d", "iptables -i", // matches -F, -D, -I after lowercasing
		"umount ", "unmount ", // Allow mount but not umount
		"wget ", "curl ", // Prevent network operations
		"| dd", "| rm", "| mv", // Prevent piping to dangerous commands
	}

	// Check for dangerous patterns
	for _, pattern := range dangerousPatterns {
		if strings.Contains(cmd, pattern) {
			return fmt.Errorf("command contains dangerous pattern: %s", pattern)
		}
	}

	// Additional checks for commands that start with dangerous operations
	if strings.HasPrefix(cmd, "rm ") || strings.HasPrefix(cmd, "rm\t") {
		return fmt.Errorf("rm command not allowed")
	}

	// Check for sudo usage (we want to avoid automated sudo commands)
	if strings.HasPrefix(cmd, "sudo ") {
		return fmt.Errorf("sudo commands not allowed for automated execution")
	}

	// Check for dangerous redirections (but allow safe ones like 2>/dev/null)
	if strings.Contains(cmd, ">") && !strings.Contains(cmd, "2>/dev/null") && !strings.Contains(cmd, ">/dev/null") {
		return fmt.Errorf("file redirection not allowed except to /dev/null")
	}

	return nil
}
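The redirection check in `validateCommand` accepts any command that contains `>/dev/null` somewhere, so a line such as `echo test > /etc/passwd 2>/dev/null` would pass it. A stricter variant could inspect every `>` individually. The following is a minimal sketch under stated assumptions, not a drop-in replacement: `redirectsOnlyToDevNull` and `targetIsDevNull` are hypothetical names, quoting is not parsed (so `grep '>' file` is rejected), and only `&1` and `&2` are treated as harmless descriptor duplications.

```go
package main

import (
	"fmt"
	"strings"
)

// targetIsDevNull reports whether rest begins with /dev/null as a whole word.
func targetIsDevNull(rest string) bool {
	const target = "/dev/null"
	if !strings.HasPrefix(rest, target) {
		return false
	}
	tail := rest[len(target):]
	return tail == "" || strings.ContainsRune(" \t;|&)", rune(tail[0]))
}

// redirectsOnlyToDevNull reports whether every '>' in cmd is followed
// (after an optional second '>' and whitespace) by /dev/null, or by a
// file descriptor duplication such as &1 or &2.
func redirectsOnlyToDevNull(cmd string) bool {
	for i := 0; i < len(cmd); i++ {
		if cmd[i] != '>' {
			continue
		}
		rest := strings.TrimPrefix(cmd[i+1:], ">") // treat >> like >
		rest = strings.TrimLeft(rest, " \t")
		if strings.HasPrefix(rest, "&1") || strings.HasPrefix(rest, "&2") {
			continue
		}
		if !targetIsDevNull(rest) {
			return false
		}
	}
	return true
}

func main() {
	for _, c := range []string{
		"df -h",
		"ls /root 2>/dev/null",
		"find / -name core > /dev/null 2>&1",
		"echo test > /etc/passwd 2>/dev/null",
	} {
		fmt.Printf("%-40q ok=%v\n", c, redirectsOnlyToDevNull(c))
	}
}
```

The sketch errs on the side of rejecting: anything it cannot recognise as a redirect to `/dev/null` is treated as unsafe, which fits an agent that is meant to run read-only diagnostics.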
5
go.mod
Normal file
@@ -0,0 +1,5 @@
module nannyagentv2

go 1.23

require github.com/sashabaranov/go-openai v1.32.0
2
go.sum
Normal file
@@ -0,0 +1,2 @@
github.com/sashabaranov/go-openai v1.32.0 h1:Yk3iE9moX3RBXxrof3OBtUBrE7qZR0zF9ebsoO4zVzI=
github.com/sashabaranov/go-openai v1.32.0/go.mod h1:lj5b/K+zjTSFxVLijLSTDZuP7adOgerWeFyZLUhAKRg=
85
install.sh
Executable file
@@ -0,0 +1,85 @@
#!/bin/bash

# Linux Diagnostic Agent Installation Script
# This script installs the nanny-agent on a Linux system

set -e

echo "🔧 Linux Diagnostic Agent Installation Script"
echo "=============================================="

# Check if Go is installed
if ! command -v go &> /dev/null; then
    echo "❌ Go is not installed. Please install Go first:"
    echo ""
    echo "For Ubuntu/Debian:"
    echo " sudo apt update && sudo apt install golang-go"
    echo ""
    echo "For RHEL/CentOS/Fedora:"
    echo " sudo dnf install golang"
    echo " # or"
    echo " sudo yum install golang"
    echo ""
    exit 1
fi

echo "✅ Go is installed: $(go version)"

# Build the application
echo "🔨 Building the application..."
go mod tidy
make build

# Check if build was successful
if [ ! -f "./nanny-agent" ]; then
    echo "❌ Build failed! nanny-agent binary not found."
    exit 1
fi

echo "✅ Build successful!"

# Ask for installation preference
echo ""
echo "Installation options:"
echo "1. Install system-wide (/usr/local/bin) - requires sudo"
echo "2. Keep in current directory"
echo ""
read -p "Choose option (1 or 2): " choice

case $choice in
    1)
        echo "📦 Installing system-wide..."
        sudo cp nanny-agent /usr/local/bin/
        sudo chmod +x /usr/local/bin/nanny-agent
        echo "✅ Agent installed to /usr/local/bin/nanny-agent"
        echo ""
        echo "You can now run the agent from anywhere with:"
        echo " nanny-agent"
        ;;
    2)
        echo "✅ Agent ready in current directory"
        echo ""
        echo "Run the agent with:"
        echo " ./nanny-agent"
        ;;
    *)
        echo "❌ Invalid choice. Agent is available in current directory."
        echo "Run with: ./nanny-agent"
        ;;
esac

# Configuration
echo ""
echo "📝 Configuration:"
echo "Set these environment variables to configure the agent:"
echo ""
echo "export NANNYAPI_ENDPOINT=\"http://your-nannyapi-host:3000/openai/v1\""
echo "export NANNYAPI_MODEL=\"your-model-identifier\""
echo ""
echo "Or create a .env file in the working directory."
|
echo ""
|
||||||
|
echo "🎉 Installation complete!"
|
||||||
|
echo ""
|
||||||
|
echo "Example usage:"
|
||||||
|
echo " ./nanny-agent"
|
||||||
|
echo " > On /var filesystem I cannot create any file but df -h shows 30% free space available."
116
integration-tests.sh
Executable file
@@ -0,0 +1,116 @@
#!/bin/bash

# Linux Diagnostic Agent - Integration Tests
# This script creates realistic Linux problem scenarios for testing

set -e

AGENT_BINARY="./nanny-agent"
TEST_DIR="/tmp/nanny-agent-tests"
TEST_LOG="$TEST_DIR/integration_test.log"

# Color codes for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color

# Ensure test directory exists
mkdir -p "$TEST_DIR"

echo -e "${BLUE}🧪 Linux Diagnostic Agent - Integration Tests${NC}"
echo "================================================="
echo ""

# Check if agent binary exists
if [[ ! -f "$AGENT_BINARY" ]]; then
    echo -e "${RED}❌ Agent binary not found at $AGENT_BINARY${NC}"
    echo "Please run: make build"
    exit 1
fi

# Function to run a test scenario
run_test() {
    local test_name="$1"
    local scenario="$2"
    local expected_keywords="$3"

    echo -e "${YELLOW}📋 Test: $test_name${NC}"
    echo "Scenario: $scenario"
    echo ""

    # Run the agent with the scenario
    echo "$scenario" | timeout 120s "$AGENT_BINARY" > "$TEST_LOG" 2>&1 || true

    # Check if any expected keywords are found in the output
    local found_keywords=0
    IFS=',' read -ra KEYWORDS <<< "$expected_keywords"
    for keyword in "${KEYWORDS[@]}"; do
        keyword=$(echo "$keyword" | xargs) # trim whitespace
        # -F matches the keyword literally, so "." in "resolv.conf" is not a wildcard
        if grep -qiF -- "$keyword" "$TEST_LOG"; then
            echo -e "${GREEN} ✅ Found expected keyword: $keyword${NC}"
            # Use a plain assignment: ((found_keywords++)) returns status 1 when the
            # old value is 0, which aborts the whole script under set -e.
            found_keywords=$((found_keywords + 1))
        else
            echo -e "${RED} ❌ Missing keyword: $keyword${NC}"
        fi
    done

    # Show summary
    if [[ $found_keywords -gt 0 ]]; then
        echo -e "${GREEN} ✅ Test PASSED ($found_keywords keywords found)${NC}"
    else
        echo -e "${RED} ❌ Test FAILED (no expected keywords found)${NC}"
    fi

    echo ""
    echo "Full output saved to: $TEST_LOG"
    echo "----------------------------------------"
    echo ""
}

# Test Scenario 1: Disk Space Issues (Inode Exhaustion)
run_test "Disk Space - Inode Exhaustion" \
    "I cannot create new files in /home directory even though df -h shows plenty of space available. Getting 'No space left on device' error when trying to touch new files." \
    "inode,df -i,filesystem,inodes,exhausted"

# Test Scenario 2: Memory Issues
run_test "Memory Issues - OOM Killer" \
    "My applications keep getting killed randomly and I see 'killed' messages in logs. The system becomes unresponsive for a few seconds before recovering. This happens especially when running memory-intensive tasks." \
    "memory,oom,killed,dmesg,free,swap"

# Test Scenario 3: Network Connectivity Issues
run_test "Network Connectivity - DNS Resolution" \
    "I can ping IP addresses directly (like 8.8.8.8) but cannot resolve domain names. Web browsing fails with DNS resolution errors, but ping 8.8.8.8 works fine." \
    "dns,resolv.conf,nslookup,nameserver,dig"

# Test Scenario 4: Service/Process Issues
run_test "Service Issues - High Load" \
    "System load average is consistently above 10.0 even when CPU usage appears normal. Applications are responding slowly and I notice high wait times. The server feels sluggish overall." \
    "load,average,cpu,iostat,vmstat,processes"

# Test Scenario 5: File System Issues
run_test "Filesystem Issues - Permission Problems" \
    "Web server returns 403 Forbidden errors for all pages. Files exist and seem readable, but nginx logs show permission denied errors. SELinux is disabled and file permissions look correct." \
    "permission,403,nginx,chmod,chown,selinux"

# Test Scenario 6: Boot/System Issues
run_test "Boot Issues - Kernel Module" \
    "System boots but some hardware devices are not working. Network interface shows as down, USB devices are not recognized, and dmesg shows module loading failures." \
    "module,lsmod,dmesg,hardware,interface,usb"

# Test Scenario 7: Performance Issues
run_test "Performance Issues - I/O Bottleneck" \
    "Database queries are extremely slow, taking 30+ seconds for simple SELECT statements. Disk activity LED is constantly on and system feels unresponsive during database operations." \
    "iostat,iotop,disk,database,slow,performance"

echo -e "${BLUE}🏁 Integration Tests Complete${NC}"
echo ""
echo "Check individual test logs in: $TEST_DIR"
echo ""
echo -e "${YELLOW}💡 Tips:${NC}"
echo "- Tests use realistic scenarios that could occur on production systems"
echo "- Each test expects the AI to suggest relevant diagnostic commands"
echo "- Review the full logs to see the complete diagnostic conversation"
echo "- Tests timeout after 120 seconds to prevent hanging"
echo "- Make sure NANNYAPI_ENDPOINT and NANNYAPI_MODEL are set correctly"
46
main.go
Normal file
@@ -0,0 +1,46 @@
package main

import (
	"bufio"
	"fmt"
	"log"
	"os"
	"strings"
)

func main() {
	// Initialize the agent
	agent := NewLinuxDiagnosticAgent()

	// Start the interactive session
	fmt.Println("Linux Diagnostic Agent Started")
	fmt.Println("Enter a system issue description (or 'quit' to exit):")

	scanner := bufio.NewScanner(os.Stdin)
	for {
		fmt.Print("> ")
		if !scanner.Scan() {
			break
		}

		input := strings.TrimSpace(scanner.Text())
		if input == "quit" || input == "exit" {
			break
		}

		if input == "" {
			continue
		}

		// Process the issue
		if err := agent.DiagnoseIssue(input); err != nil {
			fmt.Printf("Error: %v\n", err)
		}
	}

	if err := scanner.Err(); err != nil {
		log.Fatal(err)
	}

	fmt.Println("Goodbye!")
}
154
system_info.go
Normal file
@@ -0,0 +1,154 @@
package main

import (
	"fmt"
	"net"
	"runtime"
	"strings"
	"time"
)

// SystemInfo represents basic system information
type SystemInfo struct {
	Hostname     string `json:"hostname"`
	OS           string `json:"os"`
	Kernel       string `json:"kernel"`
	Architecture string `json:"architecture"`
	CPUCores     string `json:"cpu_cores"`
	Memory       string `json:"memory"`
	Uptime       string `json:"uptime"`
	PrivateIPs   string `json:"private_ips"`
	LoadAverage  string `json:"load_average"`
	DiskUsage    string `json:"disk_usage"`
}

// GatherSystemInfo collects basic system information
func GatherSystemInfo() *SystemInfo {
	info := &SystemInfo{}
	executor := NewCommandExecutor(5 * time.Second)

	// Basic system info
	if result := executor.Execute(Command{ID: "hostname", Command: "hostname"}); result.ExitCode == 0 {
		info.Hostname = strings.TrimSpace(result.Output)
	}

	// lsb_release must be followed directly by "||": when it is piped into cut, the
	// pipeline's exit status is cut's (always 0), so the /etc/os-release fallback
	// would never run on systems without lsb_release.
	if result := executor.Execute(Command{ID: "os", Command: "lsb_release -ds 2>/dev/null || cat /etc/os-release | grep PRETTY_NAME | cut -d'=' -f2 | tr -d '\"'"}); result.ExitCode == 0 {
		info.OS = strings.TrimSpace(result.Output)
	}

	if result := executor.Execute(Command{ID: "kernel", Command: "uname -r"}); result.ExitCode == 0 {
		info.Kernel = strings.TrimSpace(result.Output)
	}

	if result := executor.Execute(Command{ID: "arch", Command: "uname -m"}); result.ExitCode == 0 {
		info.Architecture = strings.TrimSpace(result.Output)
	}

	if result := executor.Execute(Command{ID: "cores", Command: "nproc"}); result.ExitCode == 0 {
		info.CPUCores = strings.TrimSpace(result.Output)
	}

	if result := executor.Execute(Command{ID: "memory", Command: "free -h | grep Mem | awk '{print $2}'"}); result.ExitCode == 0 {
		info.Memory = strings.TrimSpace(result.Output)
	}

	if result := executor.Execute(Command{ID: "uptime", Command: "uptime -p"}); result.ExitCode == 0 {
		info.Uptime = strings.TrimSpace(result.Output)
	}

	if result := executor.Execute(Command{ID: "load", Command: "uptime | awk -F'load average:' '{print $2}' | xargs"}); result.ExitCode == 0 {
		info.LoadAverage = strings.TrimSpace(result.Output)
	}

	if result := executor.Execute(Command{ID: "disk", Command: "df -h / | tail -1 | awk '{print \"Root: \" $3 \"/\" $2 \" (\" $5 \" used)\"}'"}); result.ExitCode == 0 {
		info.DiskUsage = strings.TrimSpace(result.Output)
	}

	// Get private IP addresses
	info.PrivateIPs = getPrivateIPs()

	return info
}

// getPrivateIPs returns private IP addresses
func getPrivateIPs() string {
	var privateIPs []string

	interfaces, err := net.Interfaces()
	if err != nil {
		return "Unable to determine"
	}

	for _, iface := range interfaces {
		if iface.Flags&net.FlagUp == 0 || iface.Flags&net.FlagLoopback != 0 {
			continue // Skip down or loopback interfaces
		}

		addrs, err := iface.Addrs()
		if err != nil {
			continue
		}

		for _, addr := range addrs {
			if ipnet, ok := addr.(*net.IPNet); ok && !ipnet.IP.IsLoopback() {
				if isPrivateIP(ipnet.IP) {
					privateIPs = append(privateIPs, fmt.Sprintf("%s (%s)", ipnet.IP.String(), iface.Name))
				}
			}
		}
	}

	if len(privateIPs) == 0 {
		return "No private IPs found"
	}

	return strings.Join(privateIPs, ", ")
}

// isPrivateIP checks if an IP address is private
func isPrivateIP(ip net.IP) bool {
	// RFC 1918 private address ranges
	private := []string{
		"10.0.0.0/8",
		"172.16.0.0/12",
		"192.168.0.0/16",
	}

	for _, cidr := range private {
		_, subnet, _ := net.ParseCIDR(cidr)
		if subnet.Contains(ip) {
			return true
		}
	}

	return false
}

// FormatSystemInfoForPrompt formats system information for inclusion in diagnostic prompts
func FormatSystemInfoForPrompt(info *SystemInfo) string {
	return fmt.Sprintf(`SYSTEM INFORMATION:
- Hostname: %s
- Operating System: %s
- Kernel Version: %s
- Architecture: %s
- CPU Cores: %s
- Total Memory: %s
- System Uptime: %s
- Current Load Average: %s
- Root Disk Usage: %s
- Private IP Addresses: %s
- Go Runtime: %s

ISSUE DESCRIPTION:`,
		info.Hostname,
		info.OS,
		info.Kernel,
		info.Architecture,
		info.CPUCores,
		info.Memory,
		info.Uptime,
		info.LoadAverage,
		info.DiskUsage,
		info.PrivateIPs,
		runtime.Version())
}
82
test-examples.sh
Executable file
@@ -0,0 +1,82 @@
#!/bin/bash

# Linux Diagnostic Agent - Test Scenarios
# Realistic Linux problems for testing the diagnostic agent

echo "🔧 Linux Diagnostic Agent - Test Scenarios"
echo "==========================================="
echo ""

echo "📚 Available test scenarios (copy-paste into the agent):"
echo ""

echo "1. 💾 DISK SPACE ISSUES (Inode Exhaustion):"
echo "────────────────────────────────────────────"
echo "I cannot create new files in /home directory even though df -h shows plenty of space available. Getting 'No space left on device' error when trying to touch new files."
echo ""

echo "2. 🧠 MEMORY ISSUES (OOM Killer):"
echo "─────────────────────────────────"
echo "My applications keep getting killed randomly and I see 'killed' messages in logs. The system becomes unresponsive for a few seconds before recovering. This happens especially when running memory-intensive tasks."
echo ""

echo "3. 🌐 NETWORK CONNECTIVITY (DNS Resolution):"
echo "─────────────────────────────────────────────"
echo "I can ping IP addresses directly (like 8.8.8.8) but cannot resolve domain names. Web browsing fails with DNS resolution errors, but ping 8.8.8.8 works fine."
echo ""

echo "4. ⚡ PERFORMANCE ISSUES (High Load):"
echo "───────────────────────────────────"
echo "System load average is consistently above 10.0 even when CPU usage appears normal. Applications are responding slowly and I notice high wait times. The server feels sluggish overall."
echo ""

echo "5. 🚫 WEB SERVER ISSUES (Permission Problems):"
echo "──────────────────────────────────────────────"
echo "Web server returns 403 Forbidden errors for all pages. Files exist and seem readable, but nginx logs show permission denied errors. SELinux is disabled and file permissions look correct."
echo ""

echo "6. 🖥️ HARDWARE/BOOT ISSUES (Kernel Module):"
echo "─────────────────────────────────────────────"
echo "System boots but some hardware devices are not working. Network interface shows as down, USB devices are not recognized, and dmesg shows module loading failures."
echo ""

echo "7. 🐌 DATABASE PERFORMANCE (I/O Bottleneck):"
echo "─────────────────────────────────────────────"
echo "Database queries are extremely slow, taking 30+ seconds for simple SELECT statements. Disk activity LED is constantly on and system feels unresponsive during database operations."
echo ""

echo "8. 🔥 HIGH CPU USAGE (Process Analysis):"
echo "────────────────────────────────────────"
echo "System is running slow and CPU usage is constantly at 100%. Top shows high CPU usage but I can't identify which specific process or thread is causing the issue."
echo ""

echo "9. 📁 FILE SYSTEM CORRUPTION:"
echo "────────────────────────────"
echo "Getting 'Input/output error' when accessing certain files and directories. Some files appear corrupted and applications crash when trying to read specific data files."
echo ""

echo "10. 🔌 SERVICE STARTUP FAILURES:"
echo "───────────────────────────────"
echo "Critical services fail to start after system reboot. Systemctl shows services in failed state but error messages are unclear. System appears to boot normally otherwise."
echo ""

echo "🚀 Quick Start:"
echo "──────────────"
echo "1. Run: ./nanny-agent"
echo "2. Copy-paste any scenario above when prompted"
echo "3. Watch the AI diagnose the problem step by step"
echo ""

echo "🧪 Automated Testing:"
echo "────────────────────"
echo "Run integration tests: ./integration-tests.sh"
echo "This will test all scenarios automatically"
echo ""

echo "💡 Pro Tips:"
echo "───────────"
echo "- Each scenario is based on real-world Linux issues"
echo "- The AI will gather system info automatically"
echo "- Diagnostic commands are executed safely (read-only)"
echo "- You'll get a detailed resolution plan at the end"
echo "- Set NANNYAPI_ENDPOINT and NANNYAPI_MODEL before running"