Architecture
System Context
Container Diagram
Overview
grey-seal is a single-binary Go service (cmd/api) that exposes a Connect-RPC API over HTTP/2 (h2c). It uses a layered architecture: a thin gRPC/Connect handler layer delegates to domain service interfaces, which are backed by PostgreSQL repositories. LLM inference is delegated to a local Ollama instance; semantic search is delegated to the external shrike service. Resources are ingested asynchronously via Redpanda/Kafka: the API enqueues events that the worker process consumes to fetch content and forward it to shrike for chunking, embedding, and vector indexing.
Process Inventory
| Process | Source | Port | Notes |
|---|---|---|---|
| API server | cmd/api/main.go | 9000 | Active, ships in Dockerfile |
| Worker | cmd/worker/main.go | — | Kafka consumer; fetches web/PDF content and forwards to shrike |
| UI | cmd/ui/main.go | 8000 | //go:build ignore; excluded from normal builds |
Transport
The API server uses h2c (cleartext HTTP/2) via golang.org/x/net/http2/h2c, making it compatible with both native gRPC clients and the Connect-RPC grpc-web protocol. CORS is applied per-handler using connectrpc.com/cors helper headers, allowing wildcard origins.
Domain Services
Role service (lib/greyseal/role/)
Thin CRUD service around the roles table. No business logic beyond delegation to the repository. Exposes List, Get, Create, Update, Delete.
Resource service (lib/greyseal/resource/)
Manages resource metadata (title, URL, source type, timestamps). Exposes List, Get, Ingest, Delete.
Ingest assigns a UUID and created_at, persists the record, then triggers async indexing via the Indexer interface. When KAFKA_BROKERS is set the real KafkaIndexer is wired; otherwise the indexer is nil and indexing is skipped (graceful degradation).
KafkaIndexer publishes:
- SOURCE_TEXT → shrikev1.TextExtractedEvent (topic topicv1.TextExtractedEvent) directly to shrike's consumer
- SOURCE_WEBSITE / SOURCE_PDF → greysealv1.Resource (topic topicv1.Resource) to the worker queue
Conversation service (lib/greyseal/conversation/)
The core RAG orchestration service. Handles CRUD on conversations and the Chat method, which:
- Persist the incoming user Message.
- Load the Conversation record (role_uuid, resource_uuids, summary).
- If role_uuid is set, fetch the Role and prepend its system_prompt as a system message.
- Load prior message history. If history exceeds 10 messages, summarise the overflow via a second LLM call and persist the summary to conversations.summary. Prepend the (existing or freshly generated) summary as a system message.
- Retrieve relevant context via contextSearch (cache-first): check the per-conversation ResourceCache first; on a miss, call shrike (Searcher) with an EntityUuids filter, then populate the cache. Format snippets as "N. [Title]: snippet" for source attribution.
- Append message history and the current user turn.
- Call the LLM (LLM interface); stream each token via the Connect server-stream callback.
- Persist the assistant response and update conversations.updated_at.
SubmitFeedback writes -1/0/1 to messages.feedback.
ResourceCache (lib/repo/cache/RedisResourceCache) stores per-conversation resource snippets in Redis (key greyseal:conv:{uuid}:resources, TTL 24 h). Wired when REDIS_URL is set; nil otherwise (no caching).
Worker (cmd/worker/)
Consumes the v1.Resource Kafka topic via archaea/kafka.Consumer. For each resource:
- Calls resource.FetchContent to retrieve the raw text (HTTP scrape for websites; placeholder for PDFs).
- Publishes a shrikev1.TextExtractedEvent to shrike's Kafka topic for chunking, embedding, and Qdrant indexing.
- Updates resources.indexed_at in PostgreSQL.
Requires KAFKA_BROKERS and DATABASE_URL environment variables.
Repository Layer (lib/repo/)
All repositories embed *Conn, which holds a *sql.DB. SQL is built with Masterminds/squirrel using the $N placeholder format. PostgreSQL arrays (TEXT[]) are handled with lib/pq.Array. Timestamps are stored as TIMESTAMP WITH TIME ZONE.
NewDatabase runs goose migrations automatically on startup from an embedded FS (//go:embed migrations/*.sql).
LLM Adapter (lib/repo/ollama/)
ollama.LLM implements conversation.LLM. It POSTs to Ollama’s /api/chat endpoint with "stream": true and reads newline-delimited JSON chunks, invoking the provided callback per token. Configuration is via OLLAMA_HOST and OLLAMA_CHAT_MODEL environment variables (defaults: http://localhost:11434, deepseek-r1).
Search Adapter
shrikeSearcher implements conversation.Searcher by calling shrikeconnect.SearchServiceClient.Search with mode: "hybrid" and a SearchFilter.EntityUuids field when the conversation is scoped to specific resources. Server-side filtering eliminates the need for a client-side loop.
UI (lib/ui/, cmd/ui/)
All UI files carry //go:build ignore and are excluded from normal compilation. The UI is a WebAssembly single-page application built with go-app v9 (Pico CSS for styling). It exposes routes for Messages, Conversations, Resources, and Roles with full CRUD pages.
CLI (cmd/)
The root Cobra command is grey-seal. The only active subcommand is ingest. The CRUD command files (conversation_cmd.go, resource_cmd.go, role_cmd.go) also carry //go:build ignore and are not compiled.
External Dependencies (key)
| Package | Role |
|---|---|
| connectrpc.com/connect | Connect-RPC server and client |
| connectrpc.com/cors | CORS headers for Connect |
| github.com/holmes89/archaea | Generic base types + Kafka producer/consumer |
| github.com/holmes89/shrike | External vector search + text extraction service |
| github.com/redis/go-redis/v9 | Redis client for resource snippet cache |
| github.com/Masterminds/squirrel | SQL query builder |
| github.com/pressly/goose/v3 | Database migrations |
| github.com/spf13/cobra | CLI framework |
| github.com/google/uuid | UUID generation |
| github.com/lib/pq | PostgreSQL driver + array support |