Project Overview

Positioning

This project exposes Grafana Loki's log querying capabilities to AI assistants through MCP (Model Context Protocol), enabling operations personnel to query logs using natural language instead of LogQL.

Technology Stack

| Layer | Technology | Version |
|---|---|---|
| Language | Go | 1.24+ |
| MCP SDK | github.com/mark3labs/mcp-go | v0.32.0 |
| Log Storage | Grafana Loki | 2.9.0 |
| Log Collection | Promtail | 2.9.0 |
| Visualization | Grafana | latest |
| Containerization | Docker + Compose | - |

Core Capabilities (3 MCP Tools)

| Tool | Purpose | Loki API |
|---|---|---|
| loki_query | Execute LogQL queries | /loki/api/v1/query_range |
| loki_label_names | Get all label names | /loki/api/v1/labels |
| loki_label_values | Get label value list | /loki/api/v1/label/{name}/values |

Architecture Design

Overall Architecture

┌─────────────────┐     ┌──────────────────┐     ┌──────────────┐
│  AI Client      │     │  Loki MCP Server │     │  Grafana     │
│  (Claude Code/  │────→│     :8080        │────→│  Loki        │
│   Desktop/      │ MCP │                  │ HTTP│  :3100       │
│   Cursor)       │     │  3 Transport     │     │              │
└─────────────────┘     │  Protocols:      │     └──────────────┘
                        │  - stdio         │
                        │  - SSE (/sse)    │
                        │  - HTTP (/stream)│
                        │                  │
                        │  /healthz (K8s)  │
                        └──────────────────┘

Three Transport Protocols (HTTP Transports Share One Port)

| Protocol | Endpoint | Scenario |
|---|---|---|
| stdio | Standard input/output | Local binary / Docker integration; Claude Desktop launches the process directly |
| SSE | /sse + /mcp | Server-Sent Events, remote connection (legacy protocol) |
| Streamable HTTP | /stream | New-generation MCP remote protocol (recommended) |

Key design point: SSE and Streamable HTTP are registered on the same port through a single http.ServeMux, while stdio runs in parallel in a background goroutine and uses no port.

Project Structure

loki-mcp/
├── cmd/
│   ├── server/main.go          # Entry: Register Tools + Start 3 transports
│   └── client/main.go          # JSON-RPC test client
├── internal/
│   └── handlers/
│       ├── loki.go             # Core: 3 Tools complete implementation (993 lines)
│       └── loki_test.go        # Unit tests (261 lines)
├── pkg/
│   └── utils/logger.go         # Simple logging utility
├── grafana/
│   └── provisioning/datasources/loki.yaml  # Grafana datasource pre-configuration
├── promtail/
│   └── config.yml              # Log collection configuration
├── examples/
│   ├── claude-desktop/         # 4 Claude Desktop configuration examples
│   ├── claude-code-commands/   # Slash Command templates
│   ├── simple-sse-client.html  # SSE test page
│   └── sse-client.html         # Complete SSE client
├── docker-compose.yml          # Local 5-service development environment
├── Dockerfile                  # Multi-stage build
├── Makefile                    # Build/Test/Run
├── go.mod / go.sum
├── run-mcp-server.sh           # Startup script
├── test-loki-query.sh          # Query test script
├── insert-loki-logs.sh         # Insert test logs
└── README.md

Implementation Guide from Scratch

Phase 1: Basic Framework Setup

1.1 Initialize Go Project

mkdir loki-mcp && cd loki-mcp
go mod init github.com/yourname/loki-mcp
go get github.com/mark3labs/mcp-go@v0.32.0

1.2 Understand MCP Server Entry Point (cmd/server/main.go)

Core pattern: Create Server → Register Tools → Start Transport Layer

// 1. Create MCP Server instance
s := server.NewMCPServer(
    "Loki MCP Server", "0.1.0",
    server.WithResourceCapabilities(true, true),
    server.WithLogging(),
)

// 2. Register tools (Tool definition + Handler function)
lokiQueryTool := handlers.NewLokiQueryTool()
s.AddTool(lokiQueryTool, handlers.HandleLokiQuery)

// 3. Create transport layer
sseServer := server.NewSSEServer(s,
    server.WithSSEEndpoint("/sse"),
    server.WithMessageEndpoint("/mcp"),
)
streamableServer := server.NewStreamableHTTPServer(s)

// 4. Unified routing
mux := http.NewServeMux()
mux.Handle("/sse", sseServer)
mux.Handle("/mcp", sseServer)
mux.Handle("/stream", streamableServer)
mux.HandleFunc("/healthz", healthHandler)

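// 5. Parallel startup: HTTP + stdio (log errors instead of silently dropping them)
go func() {
    if err := http.ListenAndServe(":8080", mux); err != nil {
        log.Printf("HTTP server error: %v", err)
    }
}()
go func() {
    if err := server.ServeStdio(s); err != nil {
        log.Printf("stdio server error: %v", err)
    }
}()

// 6. Block until an interrupt or SIGTERM arrives, then exit
stop := make(chan os.Signal, 1)
signal.Notify(stop, os.Interrupt, syscall.SIGTERM)
<-stop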
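
The entry point above waits for a signal but does not drain in-flight HTTP requests before exiting. The following is a minimal sketch of one way to do that with the standard library's http.Server.Shutdown. The runUntil and demoShutdown helpers, the 5-second grace period, and the loopback listener are assumptions made for illustration; they are not taken from the project.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net"
	"net/http"
	"time"
)

// runUntil serves handler on ln until stop is closed, then shuts the server
// down, giving in-flight requests up to grace to finish.
// Hypothetical helper, not part of the project.
func runUntil(ln net.Listener, handler http.Handler, stop <-chan struct{}, grace time.Duration) error {
	srv := &http.Server{Handler: handler}

	serveErr := make(chan error, 1)
	go func() { serveErr <- srv.Serve(ln) }()

	select {
	case err := <-serveErr:
		return err // the server failed before a stop was requested
	case <-stop:
	}

	ctx, cancel := context.WithTimeout(context.Background(), grace)
	defer cancel()
	if err := srv.Shutdown(ctx); err != nil {
		return fmt.Errorf("shutdown: %w", err)
	}
	// After Shutdown, Serve returns http.ErrServerClosed, which is expected.
	if err := <-serveErr; !errors.Is(err, http.ErrServerClosed) {
		return err
	}
	return nil
}

// demoShutdown starts a server on a random loopback port and stops it shortly
// afterwards. Closing stop stands in for receiving os.Interrupt or SIGTERM.
func demoShutdown() error {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return err
	}
	stop := make(chan struct{})
	go func() {
		time.Sleep(50 * time.Millisecond)
		close(stop)
	}()
	return runUntil(ln, http.NewServeMux(), stop, 5*time.Second)
}

func main() {
	fmt.Println("shutdown error:", demoShutdown())
}
```

In a real entry point the stop channel would be closed when signal.Notify delivers a signal, and the mux from step 4 would be passed as the handler.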

Key Design Decisions:

  • Both HTTP transports on the same port: Simplifies deployment, since one port serves all remote clients
  • stdio background operation: Compatible with Claude Desktop's process mode
  • /healthz endpoint: Adapts to K8s readiness/liveness probes
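
The routing code registers a healthHandler that is not shown. A minimal sketch of such a handler follows, assuming it only needs to answer 200 with the body "ok"; the probe helper exists purely to exercise the handler in memory and is hypothetical.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// healthHandler answers liveness/readiness probes with 200 and the body "ok".
// It reports only that the process is serving HTTP; it does not check Loki.
func healthHandler(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "text/plain; charset=utf-8")
	w.WriteHeader(http.StatusOK)
	io.WriteString(w, "ok")
}

// probe sends an in-memory GET /healthz through a mux and returns the
// status code and body, the way a Kubernetes httpGet probe would see them.
func probe() (int, string) {
	mux := http.NewServeMux()
	mux.HandleFunc("/healthz", healthHandler)

	rec := httptest.NewRecorder()
	mux.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/healthz", nil))
	return rec.Code, rec.Body.String()
}

func main() {
	code, body := probe()
	fmt.Println(code, body) // prints "200 ok"
}
```

Keeping the handler independent of Loki means a Loki outage does not cause Kubernetes to restart the MCP server. Whether readiness should also reflect Loki reachability is a design choice this sketch leaves open.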

1.3 Understand Tool Definition Pattern

Each Tool consists of two parts:

  1. Tool Definition Function (NewXxxTool()): declares the parameter schema
  2. Handler Function (HandleXxx()): processes the request logic

// Tool Definition: Declare parameters, types, default values, descriptions
func NewLokiQueryTool() mcp.Tool {
    return mcp.NewTool("loki_query",
        mcp.WithDescription("Run a query against Grafana Loki"),
        mcp.WithString("query", mcp.Required(), mcp.Description("LogQL query string")),
        mcp.WithString("url", mcp.Description("Loki server URL"), mcp.DefaultString(lokiURL)),
        mcp.WithString("start", mcp.Description("Start time (default: 1h ago)")),
        mcp.WithNumber("limit", mcp.Description("Max entries (default: 100)")),
        mcp.WithString("format", mcp.DefaultString("raw")),
        // ... authentication parameters
    )
}

// Handler: Extract parameters → Build request → Call Loki API → Format output
func HandleLokiQuery(ctx context.Context, request mcp.CallToolRequest) (*mcp.CallToolResult, error) {
    args := request.GetArguments()
    // ... processing logic
    return mcp.NewToolResultText(formattedResult), nil
}

Phase 2: Core Logic Implementation

2.1 Request Processing Flow (internal/handlers/loki.go)

Parameter Extraction → Environment Variable Fallback → Time Parsing → URL Building → HTTP Request → Response Parsing → Formatted Output

2.2 Parameter Extraction Pattern (with Environment Variable Fallback)

Each parameter follows the priority chain: Request Parameter > Environment Variable > Default Value

// Unified pattern: Check request parameter first, then environment variable
var lokiURL string
if urlArg, ok := args["url"].(string); ok && urlArg != "" {
    lokiURL = urlArg
} else {
    lokiURL = os.Getenv("LOKI_URL")
    if lokiURL == "" {
        lokiURL = "http://localhost:3100"
    }
}

Environment Variable List:

| Variable | Purpose | Default |
|---|---|---|
| LOKI_URL | Loki address | http://localhost:3100 |
| LOKI_ORG_ID | Tenant ID | Empty |
| LOKI_USERNAME | Basic Auth username | Empty |
| LOKI_PASSWORD | Basic Auth password | Empty |
| LOKI_TOKEN | Bearer Token | Empty |
| PORT | Service port | 8080 |
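
The priority chain can be captured once instead of being repeated per parameter. The sketch below assumes a hypothetical helper named stringParam; the "org_id" argument name and the example URL are also assumptions, and the project may organize this differently.

```go
package main

import (
	"fmt"
	"os"
)

// stringParam resolves a string setting using the priority chain
// request argument > environment variable > default value.
// Hypothetical helper, not part of the project.
func stringParam(args map[string]any, key, envVar, def string) string {
	if v, ok := args[key].(string); ok && v != "" {
		return v
	}
	if v := os.Getenv(envVar); v != "" {
		return v
	}
	return def
}

func main() {
	os.Setenv("LOKI_URL", "http://loki.example:3100")
	args := map[string]any{"org_id": "tenant-a"}

	// No "url" argument, so the environment variable wins over the default.
	fmt.Println(stringParam(args, "url", "LOKI_URL", "http://localhost:3100"))
	// The request argument wins regardless of LOKI_ORG_ID.
	fmt.Println(stringParam(args, "org_id", "LOKI_ORG_ID", ""))
}
```

An empty string in the request is treated as "not provided", which matches the inline pattern shown earlier in this section.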

2.3 Time Parsing (parseTime)

Supports multiple input formats, attempting in sequence:

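// Requires the fmt, strings, and time packages.
func parseTime(timeStr string) (time.Time, error) {
    // 1. "now" keyword
    if timeStr == "now" {
        return time.Now(), nil
    }

    // 2. Relative time: "-1h", "-30m" (HasPrefix is safe on an empty string)
    if strings.HasPrefix(timeStr, "-") {
        if d, err := time.ParseDuration(timeStr); err == nil {
            return time.Now().Add(d), nil
        }
    }

    // 3. RFC3339, 4. ISO variants, 5. date only.
    // Layouts without a zone are parsed as UTC (an assumption of this sketch).
    layouts := []string{
        time.RFC3339, // "2024-01-15T10:30:45Z"
        "2006-01-02T15:04:05",
        "2006-01-02 15:04:05",
        "2006-01-02",
    }
    for _, layout := range layouts {
        if t, err := time.Parse(layout, timeStr); err == nil {
            return t, nil
        }
    }
    return time.Time{}, fmt.Errorf("unsupported time format: %q", timeStr)
}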

2.4 URL Building

Intelligent path concatenation, handling various base URL formats:

func buildLokiQueryURL(baseURL, query string, start, end int64, limit int) (string, error) {
    u, err := url.Parse(baseURL)
    if err != nil {
        return "", fmt.Errorf("invalid Loki URL %q: %w", baseURL, err)
    }
    
    // Path normalization: avoid duplicate concatenation
    if !strings.Contains(u.Path, "loki/api/v1") {
        u.Path = "/loki/api/v1/query_range"
    }
    
    // Query parameters
    q := u.Query()
    q.Set("query", query)              // LogQL
    q.Set("start", fmt.Sprintf("%d", start))  // Unix seconds
    q.Set("end", fmt.Sprintf("%d", end))
    q.Set("limit", fmt.Sprintf("%d", limit))
    u.RawQuery = q.Encode()
    return u.String(), nil
}

2.5 Authentication Mechanism

Three-level authentication with priority: Bearer Token > Basic Auth > No Authentication

if token != "" {
    req.Header.Add("Authorization", "Bearer "+token)
} else if username != "" || password != "" {
    req.SetBasicAuth(username, password)
}

// Multi-tenant isolation (always add if value exists)
if orgID != "" {
    req.Header.Add("X-Scope-OrgID", orgID)
}

2.6 Response Data Structure

type LokiResult struct {
    Status string   `json:"status"`  // "success" | "error"
    Data   LokiData `json:"data"`
    Error  string   `json:"error,omitempty"`
}

type LokiData struct {
    ResultType string      `json:"resultType"`  // "streams"
    Result     []LokiEntry `json:"result"`
}

type LokiEntry struct {
    Stream map[string]string `json:"stream"`  // Labels: {job: "xx", pod: "xx"}
    Values [][]string        `json:"values"`  // [[nanosecond timestamp, log line], ...]
}
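
To show how these structs line up with a response, the sketch below decodes a hand-written sample body. The sample JSON and the decodeLokiResponse helper are assumptions for illustration; only the struct definitions come from the text above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type LokiResult struct {
	Status string   `json:"status"`
	Data   LokiData `json:"data"`
	Error  string   `json:"error,omitempty"`
}

type LokiData struct {
	ResultType string      `json:"resultType"`
	Result     []LokiEntry `json:"result"`
}

type LokiEntry struct {
	Stream map[string]string `json:"stream"`
	Values [][]string        `json:"values"`
}

// sampleBody is a hand-written example of a "streams" response.
const sampleBody = `{
  "status": "success",
  "data": {
    "resultType": "streams",
    "result": [
      {
        "stream": {"job": "api"},
        "values": [["1705312245000000000", "Request received"]]
      }
    ]
  }
}`

// decodeLokiResponse unmarshals a response body and rejects non-success
// statuses. Hypothetical helper, not part of the project.
func decodeLokiResponse(body []byte) (LokiResult, error) {
	var res LokiResult
	if err := json.Unmarshal(body, &res); err != nil {
		return res, fmt.Errorf("decode Loki response: %w", err)
	}
	if res.Status != "success" {
		return res, fmt.Errorf("loki returned status %q: %s", res.Status, res.Error)
	}
	return res, nil
}

func main() {
	res, err := decodeLokiResponse([]byte(sampleBody))
	if err != nil {
		panic(err)
	}
	entry := res.Data.Result[0]
	fmt.Println(res.Data.ResultType, entry.Stream["job"], entry.Values[0][1])
}
```

These structs model log-stream results only. Metric-style LogQL queries are likely to return a different result shape that would need its own types.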

2.7 Three Output Formats

| Format | Purpose | Example |
|---|---|---|
| raw (default) | AI-parsing friendly, most compact | 2024-01-15T10:30:45Z {job=api} Request received |
| json | Programmatic processing | Complete JSON structure |
| text | Human readable | Numbered streams + timestamped log lines |

2.8 Known Bugs and Fixes

Year 2262 Timestamp Bug: Early implementations called time.Unix(ts, 0), which treats the value as seconds even though Loki returns nanosecond timestamps, so log lines were displayed with the year 2262.

Fix: time.Unix(0, int64(ts)), where the first parameter is 0 seconds and the second parameter carries the nanoseconds.
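
Because Loki delivers each timestamp as a string inside the values array, the fix also involves a parse step. A minimal sketch follows, assuming hypothetical helpers named parseLokiTimestamp and rfc3339; the fallback to the raw string on bad input is likewise an assumption.

```go
package main

import (
	"fmt"
	"strconv"
	"time"
)

// parseLokiTimestamp converts a nanosecond Unix epoch string, as found in
// Loki's values array, into a UTC time.Time. Hypothetical helper.
func parseLokiTimestamp(ts string) (time.Time, error) {
	ns, err := strconv.ParseInt(ts, 10, 64)
	if err != nil {
		return time.Time{}, fmt.Errorf("invalid Loki timestamp %q: %w", ts, err)
	}
	// Seconds argument is 0; the whole value goes in the nanoseconds argument.
	return time.Unix(0, ns).UTC(), nil
}

// rfc3339 formats a Loki timestamp for display and falls back to the raw
// string when it cannot be parsed.
func rfc3339(ts string) string {
	t, err := parseLokiTimestamp(ts)
	if err != nil {
		return ts
	}
	return t.Format(time.RFC3339)
}

func main() {
	fmt.Println(rfc3339("1705312245000000000")) // prints 2024-01-15T09:50:45Z
	fmt.Println(rfc3339("not-a-number"))        // prints not-a-number
}
```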

Phase 3: Dockerization and Local Environment

3.1 Dockerfile (Multi-Stage Build)

# Stage 1: Build
FROM golang:1.24-alpine AS builder
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download  # Utilize cache layer
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -o loki-mcp-server ./cmd/server

# Stage 2: Runtime (minimal image)
FROM alpine:latest
WORKDIR /app
COPY --from=builder /app/loki-mcp-server .
EXPOSE 8080
ENTRYPOINT ["./loki-mcp-server"]

Key Points:

  • CGO_ENABLED=0: Static linking, no glibc dependency
  • COPY go.mod/go.sum first → go mod download: Utilize Docker layer caching to accelerate builds
  • Final image based on alpine:latest: Minimize attack surface

3.2 Docker Compose (5-Service Complete Environment)

services:
  loki-mcp-server:  # MCP Server :8080
    depends_on:
      loki:
        condition: service_healthy  # Wait for Loki ready
  
  loki:  # Log storage :3100
    healthcheck:  # /ready endpoint check
      test: ["CMD-SHELL", "wget -q --spider http://localhost:3100/ready || exit 1"]
  
  grafana:  # Visualization :3000
    volumes:
      - ./grafana/provisioning:/etc/grafana/provisioning  # Pre-configured datasources
  
  promtail:  # Log collection
    volumes:
      - /var/log:/var/log  # Collect host logs
      - /var/run/docker.sock:/var/run/docker.sock  # Collect container logs
  
  log-generator:  # Test log generator
    command: |  # Generate INFO/ERROR logs every 5 seconds
      while true; do echo "INFO: ..."; sleep 5; done

Log data flow: log-generator → promtail → loki → loki-mcp-server. In Compose terms, loki-mcp-server starts only after Loki's healthcheck passes; Grafana operates independently.

Phase 4: Testing Strategy

4.1 Unit Testing

Focus coverage on timestamp parsing (bug-prone area):

func TestFormatLokiResults_NoYear2262Bug(t *testing.T) {
    testCases := []struct {
        name         string
        timestampNs  string
        expectedYear string
    }{
        {"Current", "1705312245000000000", "2024"},  // 2024-01-15
        {"Recent", "1700000000000000000", "2023"},   // 2023-11-14
        {"Future", "1800000000000000000", "2027"},   // 2027-01-15
    }
    
    for _, tc := range testCases {
        t.Run(tc.name, func(t *testing.T) {
            // Build LokiResult → formatLokiResults → Assert year
        })
    }
}

Test Case Coverage:

  • Normal timestamp parsing
  • Multiple log line timestamps
  • Illegal timestamp fallback
  • Empty result handling
  • Current time regression
  • Year 2262 bug regression (table-driven, 3 time points)

4.2 Running Tests

make test                          # All tests
go test -coverprofile=coverage.out ./...  # With coverage
go tool cover -func=coverage.out   # View coverage
go test -race ./...                # Race detection

4.3 Integration Test Scripts

# Insert test logs
./insert-loki-logs.sh --num 20 --job "custom-job" --app "my-app"

# Query verification
./test-loki-query.sh '{job="varlogs"}'
./test-loki-query.sh '{job="varlogs"} |= "ERROR"' '-1h' 'now' 50

Phase 5: Deployment and Client Integration

5.1 Deployment Method Selection

| Method | Scenario | Command |
|---|---|---|
| Local binary | Development debugging | make run |
| Docker single container | Existing Loki instance | docker run -p 8080:8080 -e LOKI_URL=... loki-mcp-server |
| Docker Compose | Complete local environment | docker-compose up --build |
| K8s Deployment | Production environment | See manifest below |
| Remote URL | Team sharing | https://loki-mcp.loki.com/stream |

5.2 K8s Deployment Recommendations

apiVersion: apps/v1
kind: Deployment
metadata:
  name: loki-mcp-server
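spec:
  replicas: 2  # Stateless, can scale horizontally
  selector:    # Required by apps/v1; the label value is illustrative
    matchLabels:
      app: loki-mcp-server
  template:
    metadata:
      labels:
        app: loki-mcp-server  # Must match spec.selector
    spec:
      containers: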
      - name: loki-mcp-server
        image: loki-mcp-server:v0.1.0  # Fixed version
        ports:
        - containerPort: 8080
        env:
        - name: LOKI_URL
          value: "http://loki-gateway.monitoring:3100"
        - name: LOKI_TOKEN
          valueFrom:
            secretKeyRef:
              name: loki-auth
              key: token
        readinessProbe:
          httpGet:
            path: /healthz
            port: 8080
          periodSeconds: 10
        livenessProbe:
          httpGet:
            path: /healthz
            port: 8080
          periodSeconds: 30
        resources:
          requests:
            cpu: 50m
            memory: 64Mi
          limits:
            cpu: 200m
            memory: 128Mi

5.3 Client Integration

Claude Code (Streamable HTTP recommended):

claude mcp add --transport http --scope user loki https://loki-mcp.loki.com/stream

Claude Desktop (Local Docker):

{
  "mcpServers": {
    "loki": {
      "command": "docker",
      "args": ["run", "--rm", "-i", "-e", "LOKI_URL=http://host.docker.internal:3100", "loki-mcp-server:latest"]
    }
  }
}

Cursor:

{
  "mcpServers": {
    "loki": {
      "command": "docker",
      "args": ["run", "--rm", "-i", "-e", "LOKI_URL=http://host.docker.internal:3100", "loki-mcp-server:latest"]
    }
  }
}

Development Command Quick Reference

# Build
make build                    # Compile local binary
make build-linux              # Cross-compile Linux (amd64)
docker build -t loki-mcp-server .  # Build image

# Run
make run                      # Local execution
docker-compose up --build     # Complete local environment

# Test
make test                     # Unit tests
go test -race ./...           # Race detection
./test-loki-query.sh          # Integration test
./insert-loki-logs.sh         # Insert test data

# Dependencies
make deps                     # Download dependencies
make tidy                     # Organize go.mod

# Cleanup
make clean                    # Delete binary
docker-compose down -v        # Clean containers and volumes

Code Standards

Go Standards

  • Format with gofmt / goimports
  • Error handling: fmt.Errorf("context: %w", err) wrapping
  • Environment variables defined with const to avoid hardcoded strings
  • HTTP clients must set Timeout (currently 30s)
  • Use context.Context to pass request context
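
Several of these rules can be seen together in a small fetch function. The sketch below is an illustration under assumptions: fetch and demoFetch are hypothetical names, and an in-process test server stands in for Loki.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"time"
)

// httpClient is shared and always carries a timeout (30s, as stated above).
var httpClient = &http.Client{Timeout: 30 * time.Second}

// fetch performs a GET bound to ctx and wraps every error with context.
// Hypothetical helper, not part of the project.
func fetch(ctx context.Context, url string) ([]byte, error) {
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		return nil, fmt.Errorf("build request: %w", err)
	}
	resp, err := httpClient.Do(req)
	if err != nil {
		return nil, fmt.Errorf("call Loki: %w", err)
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, fmt.Errorf("read response: %w", err)
	}
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("loki returned HTTP %d: %s", resp.StatusCode, body)
	}
	return body, nil
}

// demoFetch runs fetch against an in-process server that stands in for Loki.
func demoFetch() (string, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		io.WriteString(w, `{"status":"success","data":["app","job"]}`)
	}))
	defer srv.Close()

	body, err := fetch(context.Background(), srv.URL+"/loki/api/v1/labels")
	return string(body), err
}

func main() {
	fmt.Println(demoFetch())
}
```

Passing the handler's ctx through to the request lets a cancelled tool call abort the outgoing Loki query instead of waiting for the client timeout.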

MCP Tool Development Standards

  • Tool names use snake_case (e.g., loki_query)
  • Required parameters marked with mcp.Required()
  • Every parameter must have mcp.Description() including default value explanation
  • Environment variable fallback must be implemented in Handler (not just in Tool definition)
  • Return mcp.NewToolResultText() as standard response

Adding New Tools Template

If you need to add a new tool (such as loki_series), follow these steps:

  1. In internal/handlers/loki.go, add:

    • NewLokiSeriesTool() — Define parameters
    • HandleLokiSeries() — Implement logic
    • buildLokiSeriesURL() — URL building
    • executeLokiSeriesQuery() — HTTP request
    • formatLokiSeriesResults() — Format output
  2. Register in cmd/server/main.go:

    seriesTool := handlers.NewLokiSeriesTool()
    s.AddTool(seriesTool, handlers.HandleLokiSeries)
  3. Add tests in internal/handlers/loki_test.go
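
As a starting point for the URL-building step, here is a sketch of what buildLokiSeriesURL could look like. It assumes Loki's /loki/api/v1/series endpoint with match[], start, and end query parameters; the signature and the sample values are illustrative, not the project's.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// buildLokiSeriesURL appends the series API path to baseURL and encodes the
// stream selectors and time range as query parameters. Illustrative sketch.
func buildLokiSeriesURL(baseURL string, matchers []string, start, end int64) (string, error) {
	u, err := url.Parse(baseURL)
	if err != nil {
		return "", fmt.Errorf("invalid Loki URL %q: %w", baseURL, err)
	}
	// Keep any existing path prefix, but avoid a double slash.
	u.Path = strings.TrimRight(u.Path, "/") + "/loki/api/v1/series"

	q := url.Values{}
	for _, m := range matchers {
		q.Add("match[]", m) // one match[] entry per selector
	}
	q.Set("start", fmt.Sprintf("%d", start)) // Unix seconds
	q.Set("end", fmt.Sprintf("%d", end))
	u.RawQuery = q.Encode()
	return u.String(), nil
}

func main() {
	s, err := buildLokiSeriesURL("http://localhost:3100", []string{`{job="varlogs"}`}, 1705308645, 1705312245)
	if err != nil {
		panic(err)
	}
	fmt.Println(s)
}
```

Unlike loki_query, this endpoint takes selectors rather than a full LogQL query, so the tool definition would likely expose a list-like parameter instead of a single query string.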

Operations Focus Points

Observability

  • /healthz endpoint returns ok, adapts to K8s probes
  • Service itself is stateless, no persistent storage required
  • Startup logs output all endpoint addresses
  • Recommendation: Add Prometheus metrics endpoint (/metrics) for production

Security

  • Authentication information injected through environment variables, not written to image layers
  • Bearer Token takes priority over Basic Auth
  • Multi-tenant isolation through X-Scope-OrgID header
  • HTTP timeout 30s prevents slow query blocking

Extension Directions

| Direction | Description |
|---|---|
| Add /metrics | Prometheus metrics exposure |
| Add loki_series Tool | Query series metadata |
| Add loki_stats Tool | Query ingester statistics |
| Extract a common function for authentication parameters | Eliminate duplicate code in the 3 Handlers |
| Add request logging middleware | Log each Tool call's query and duration |
| Support TLS | HTTPS termination or certificate configuration |
| Add rate limiting | Prevent overly frequent queries from AI clients |

Learning Path Recommendations

Day 1: Get it running
├── docker-compose up --build
├── Access Grafana :3000 to understand Loki data structure
├── ./insert-loki-logs.sh to insert test data
└── ./test-loki-query.sh to verify queries

Day 2: Understand entry point
├── cmd/server/main.go (100 lines) — MCP registration + transport layer
├── Understand mcp-go SDK's AddTool pattern
└── Understand three-protocol same-port architecture

Day 3: Deep dive into core
├── internal/handlers/loki.go (993 lines)
├── Parameter extraction → environment variable fallback pattern
├── parseTime multi-format time parsing
├── buildLokiQueryURL path normalization
├── executeLokiQuery HTTP + authentication
└── formatLokiResults three output formats

Day 4: Testing and bugs
├── internal/handlers/loki_test.go
├── Nanosecond timestamp year 2262 bug causes and effects
└── Go table-driven test style

Day 5: Docker + deployment
├── Dockerfile multi-stage build
├── docker-compose.yml service orchestration
├── K8s deployment manifest design
└── Client integration configuration

Day 6: Hands-on extension
├── Try adding loki_series Tool
├── Extract authentication parameters common function (eliminate duplication)
└── Add /metrics endpoint

Common Issues and Solutions

Connection Failures

  1. Run claude mcp get loki to check configuration
  2. Confirm network connectivity
  3. Check HTTPS certificates

Query Returns No Results

  1. Confirm Loki has data for corresponding time range
  2. Check org_id in multi-tenant scenarios
  3. Use loki_label_names to first check available labels

Docker Environment Issues

  1. Loki startup takes time, wait for healthcheck to pass
  2. On Mac, use host.docker.internal for Docker to access host
  3. docker-compose down -v can clean data and restart

Timestamp Display Anomalies

  • Confirm using time.Unix(0, int64(ns)) not time.Unix(ns, 0)
  • Loki returns nanosecond timestamps, not seconds

Conclusion

The Loki MCP Server project shows how to bridge a traditional observability tool with AI assistants through the Model Context Protocol. Because the assistant writes the LogQL, operations teams can query logs in natural language while keeping the full power of Loki's query engine.

The three-transport architecture covers deployment scenarios from local development to production Kubernetes clusters. The unit tests, including the regression tests for the nanosecond timestamp bug, guard against that class of error recurring.

This implementation serves as a reference architecture for building MCP servers that integrate existing tools with AI-powered workflows, establishing patterns that can be applied to numerous other observability and infrastructure tools.