Architecture

System Context

```mermaid
C4Context
    title System Context — Lynx within joel.holmes.haus
    Person(admin, "Admin", "Uses the joel.holmes.haus UI to manage feeds and websites")
    Boundary(platform, "joel.holmes.haus Platform") {
        System(ui, "joel.holmes.haus", "Go-app WASM admin SPA")
        System(lynx, "Lynx", "Web feed and website archiving service")
        System(magpie, "Magpie", "Central resource index — receives website resources from Lynx")
        System(shrike, "Shrike", "Search — indexes enriched website text from Lynx")
    }
    SystemDb(postgres, "PostgreSQL", "Feeds, websites, and website metadata")
    SystemDb(minio, "MinIO / S3", "Archived HTML and readable text blobs")
    SystemQueue(kafka, "Kafka", "magpie.v1.Resource · shrike.v1.TextExtractedEvent")
    Rel(admin, ui, "Uses")
    Rel(ui, lynx, "ConnectRPC")
    Rel(lynx, postgres, "Reads / writes")
    Rel(lynx, minio, "Stores archived content")
    Rel(lynx, kafka, "Publishes resource events")
    Rel(kafka, magpie, "magpie.v1.Resource")
    Rel(kafka, shrike, "TextExtractedEvent")
```

Container Diagram

```mermaid
C4Container
    title Lynx — Internal Containers
    Boundary(lynx, "Lynx") {
        Container(api, "cmd/api", "Go / ConnectRPC h2c :9000", "FeedService · WebsiteService · Enricher goroutine")
        Container(worker, "cmd/worker", "Go / Kafka", "FeedConsumer · WebsiteConsumer · Enricher pipeline goroutine")
        Container(ui, "cmd/ui", "Go-app WASM :8000", "Browser SPA — calls cmd/api via ConnectRPC HTTP client")
        Container(feedSvc, "feed.Service", "Go", "CRUD for feeds")
        Container(websiteSvc, "website.Service", "Go", "Create (publishes Kafka event) · CRUD")
        Container(enricher, "enrichment.Enricher", "Go / goroutine", "7-step pipeline: fetch → metadata → archive HTML → readable → archive → persist → magpie")
        ContainerDb(feedRepo, "FeedRepo", "PostgreSQL / squirrel", "feeds table")
        ContainerDb(websiteRepo, "WebsiteRepo + MetadataRepo", "PostgreSQL / squirrel", "websites · website_metadata tables")
    }
    SystemDb(postgres, "PostgreSQL", "")
    SystemDb(minio, "MinIO / S3", "websites/{uuid}/original.html · readable.txt")
    SystemQueue(kafka, "Kafka", "")
    Rel(api, feedSvc, "delegates")
    Rel(api, websiteSvc, "delegates")
    Rel(api, enricher, "Enqueue()")
    Rel(worker, feedSvc, "FeedConsumer")
    Rel(worker, websiteSvc, "WebsiteConsumer")
    Rel(worker, enricher, "Enqueue()")
    Rel(feedSvc, feedRepo, "CRUD")
    Rel(websiteSvc, websiteRepo, "CRUD")
    Rel(websiteSvc, kafka, "Publish magpie.v1.Resource")
    Rel(enricher, websiteRepo, "UpdateNameIfBlank · UpsertMetadata")
    Rel(enricher, minio, "PutObject")
    Rel(enricher, kafka, "TextExtractedEvent")
    Rel(feedRepo, postgres, "SQL")
    Rel(websiteRepo, postgres, "SQL")
```

Overview

Lynx is structured as a multi-binary Go monorepo. Three runnable binaries share business logic through the lib/ package tree.

```mermaid
graph TD
    Browser["Browser / CLI / External services"]
    UI["cmd/ui :8000\ngo-app WASM SPA"]
    API["cmd/api :9000\nFeedService · WebsiteService\nEnricher goroutine"]
    Kafka[("Kafka")]
    Worker["cmd/worker\nFeedConsumer · WebsiteConsumer\nEnricher pipeline goroutine"]
    PG[("PostgreSQL\nfeeds · websites\nwebsite_metadata")]
    MinIO[("MinIO / S3\nwebsites/uuid/original.html\nwebsites/uuid/readable.txt")]
    Browser -->|"Connect-RPC HTTP/2"| API
    Browser -->|"HTTP + WASM"| UI
    UI -->|"Connect-RPC"| API
    API -->|"magpie.v1.Resource"| Kafka
    Kafka <-->|"consume/produce"| Worker
    Worker -->|"runPipeline 7 steps"| PG
    Worker -->|"archive blobs"| MinIO
    API --> PG
```

Layer breakdown

Transport layer — Connect-RPC handlers

lib/lynx/feed/grpc/service.go and lib/lynx/website/grpc/service.go are Connect-RPC handlers. They accept *connect.Request[T] and return *connect.Response[T]. The server is wrapped in h2c to support HTTP/2 cleartext. CORS is applied per-handler.

The website gRPC handler carries optional metadataRepo and enricher dependencies set via builder methods (WithMetadataRepo, WithEnricher). Website creation applies a UUID v5 derivation from the normalized URL before delegating to the service, ensuring idempotent inserts.
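
The UUID v5 derivation can be sketched with only the standard library (RFC 4122: SHA-1 over namespace + name, with version and variant bits set). The normalization rules shown here — lowercasing and trimming a trailing slash — are an assumption for illustration, not Lynx's actual rules.

```go
package main

import (
	"crypto/sha1"
	"fmt"
	"strings"
)

// nsURL is the standard RFC 4122 URL namespace UUID.
var nsURL = [16]byte{0x6b, 0xa7, 0xb8, 0x11, 0x9d, 0xad, 0x11, 0xd1,
	0x80, 0xb4, 0x00, 0xc0, 0x4f, 0xd4, 0x30, 0xc8}

// uuidv5 computes a deterministic UUID v5: SHA-1 of namespace||name,
// truncated to 16 bytes with version and variant bits set.
func uuidv5(namespace [16]byte, name string) string {
	h := sha1.New()
	h.Write(namespace[:])
	h.Write([]byte(name))
	sum := h.Sum(nil)

	var u [16]byte
	copy(u[:], sum[:16])
	u[6] = (u[6] & 0x0f) | 0x50 // version 5
	u[8] = (u[8] & 0x3f) | 0x80 // RFC 4122 variant
	return fmt.Sprintf("%x-%x-%x-%x-%x", u[0:4], u[4:6], u[6:8], u[8:10], u[10:16])
}

// deriveWebsiteID normalizes the URL (illustrative rules only) and derives
// the website's ID from it, so re-creating the same URL yields the same
// primary key — an idempotent insert.
func deriveWebsiteID(rawURL string) string {
	normalized := strings.TrimSuffix(strings.ToLower(rawURL), "/")
	return uuidv5(nsURL, normalized)
}

func main() {
	a := deriveWebsiteID("https://example.com/post/")
	b := deriveWebsiteID("HTTPS://example.com/post")
	fmt.Println(a == b) // true — same normalized URL, same UUID
}
```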

Domain service layer

lib/lynx/feed/service.go and lib/lynx/website/service.go implement the FeedService and WebsiteService interfaces. The service layer delegates all persistence to a base.Repository[T] interface. websiteService.Create additionally publishes a magpie.v1.Resource event to Kafka after saving the record.
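
The save-then-publish shape can be sketched as below. The `Repository[T]` method set and the `Publisher` interface are assumptions standing in for `base.Repository[T]` and the Kafka producer; the in-memory fakes exist only to exercise the flow.

```go
package main

import "fmt"

// Repository mirrors the generic persistence contract the services
// delegate to (method set is an assumption).
type Repository[T any] interface {
	Create(item T) error
}

// Publisher stands in for the Kafka producer.
type Publisher interface {
	Publish(topic string, payload any) error
}

type Website struct {
	ID  string
	URL string
}

// websiteService persists first, then publishes — nothing is published
// if the save fails.
type websiteService struct {
	repo Repository[Website]
	pub  Publisher
}

func (s *websiteService) Create(w Website) error {
	if err := s.repo.Create(w); err != nil {
		return err
	}
	return s.pub.Publish("magpie.v1.Resource", w)
}

// In-memory fakes for demonstration.
type memRepo struct{ items []Website }

func (r *memRepo) Create(w Website) error { r.items = append(r.items, w); return nil }

type memPub struct{ topics []string }

func (p *memPub) Publish(topic string, _ any) error { p.topics = append(p.topics, topic); return nil }

func main() {
	repo, pub := &memRepo{}, &memPub{}
	svc := &websiteService{repo: repo, pub: pub}
	_ = svc.Create(Website{ID: "1", URL: "https://example.com"})
	fmt.Println(len(repo.items), pub.topics[0]) // 1 magpie.v1.Resource
}
```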

Repository layer

lib/repo/ contains three repository implementations backed by PostgreSQL via database/sql and the Squirrel query builder:

  • FeedRepo — CRUD for the feeds table.
  • WebsiteRepo — CRUD for the websites table. Has an additional UpdateNameIfBlank method for enrichment back-fill.
  • WebsiteMetadataRepo — upsert/get for the website_metadata table (FK to websites).
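
The `UpdateNameIfBlank` contract — enrichment may only back-fill a name the user never set — can be sketched with an in-memory stand-in. The SQL in the comment is an assumption about what the squirrel-built statement roughly looks like, not the repo's actual query.

```go
package main

import "fmt"

type website struct {
	ID   string
	Name string
}

// memWebsiteRepo is an in-memory stand-in for WebsiteRepo.
type memWebsiteRepo struct {
	byID map[string]*website
}

// UpdateNameIfBlank overwrites the name only when it is currently empty.
// The real repo expresses the same guard in SQL, roughly:
//   UPDATE websites SET name = $1 WHERE id = $2 AND (name IS NULL OR name = '')
// (exact statement is an assumption).
func (r *memWebsiteRepo) UpdateNameIfBlank(id, name string) {
	if w, ok := r.byID[id]; ok && w.Name == "" {
		w.Name = name
	}
}

func main() {
	repo := &memWebsiteRepo{byID: map[string]*website{
		"a": {ID: "a"},
		"b": {ID: "b", Name: "My Blog"},
	}}
	repo.UpdateNameIfBlank("a", "Example Title") // blank → back-filled
	repo.UpdateNameIfBlank("b", "Example Title") // already named → untouched
	fmt.Println(repo.byID["a"].Name, "|", repo.byID["b"].Name) // Example Title | My Blog
}
```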

repo.NewDatabase runs embedded Goose migrations automatically on startup.

Enrichment pipeline

The enrichment subsystem lives in lib/lynx/enrichment/. Requests are processed asynchronously via a buffered channel. The Enricher runs a single goroutine that drains the channel and executes runPipeline for each website. Enqueue blocks when the channel is full, providing natural backpressure.
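
The channel-plus-single-goroutine shape can be sketched as follows. Names and the pipeline body are illustrative, not Lynx's actual code; the point is that `Enqueue` is a plain channel send, so a full buffer blocks producers instead of letting the queue grow without bound.

```go
package main

import (
	"fmt"
	"sync"
)

// Enricher processes enrichment requests asynchronously: Enqueue feeds a
// buffered channel and a single goroutine drains it.
type Enricher struct {
	queue chan string // website IDs awaiting enrichment
	wg    sync.WaitGroup
	mu    sync.Mutex
	done  []string
}

func NewEnricher(buffer int) *Enricher {
	return &Enricher{queue: make(chan string, buffer)}
}

// Start drains the channel until it is closed, running the pipeline for
// each queued website.
func (e *Enricher) Start() {
	e.wg.Add(1)
	go func() {
		defer e.wg.Done()
		for id := range e.queue {
			e.runPipeline(id)
		}
	}()
}

// Enqueue blocks while the buffer is full — natural backpressure on callers.
func (e *Enricher) Enqueue(websiteID string) { e.queue <- websiteID }

// Stop closes the queue and waits for in-flight work to finish.
func (e *Enricher) Stop() { close(e.queue); e.wg.Wait() }

func (e *Enricher) runPipeline(id string) {
	// Stand-in for the 7-step pipeline: fetch → metadata → archive → …
	e.mu.Lock()
	e.done = append(e.done, id)
	e.mu.Unlock()
}

func main() {
	e := NewEnricher(8)
	e.Start()
	for _, id := range []string{"a", "b", "c"} {
		e.Enqueue(id)
	}
	e.Stop()
	fmt.Println(len(e.done)) // 3
}
```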

Pipeline sequence:

| Step | Activity | Output | Fatal? |
|------|----------|--------|--------|
| 1 | FetchWebsiteActivity | FetchedWebsite (temp file path, status code, checksum) | Yes |
| 2 | ExtractMetadataActivity | ExtractedWebsiteMetadata (title, OG tags, etc.) | No |
| 3 | ArchiveOriginalActivity | StorageResult (MinIO path) | No |
| 4 | ExtractReadableActivity | ReadableContent (temp file, word count) | No |
| 5 | ArchiveReadableActivity | StorageResult | No |
| 6 | PersistMetadataActivity | — | Yes |
| 7 | SubmitToMagpieActivity | — | No (stub) |

Storage keys follow the pattern websites/<uuid>/original.html and websites/<uuid>/readable.txt.
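
A small helper illustrating that key layout (the function name is hypothetical):

```go
package main

import (
	"fmt"
	"path"
)

// archiveKeys builds the object keys for a website's archived blobs,
// following the websites/<uuid>/… layout described above.
func archiveKeys(websiteID string) (original, readable string) {
	return path.Join("websites", websiteID, "original.html"),
		path.Join("websites", websiteID, "readable.txt")
}

func main() {
	o, r := archiveKeys("3f2a") // real IDs are UUIDs
	fmt.Println(o)              // websites/3f2a/original.html
	fmt.Println(r)              // websites/3f2a/readable.txt
}
```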

Kafka consumers

Both FeedConsumer and WebsiteConsumer run in goroutines that drain a channel produced by the archaea Kafka consumer. They deserialize protobuf messages and call the relevant service’s Create method. The WebsiteConsumer additionally enqueues an enrichment request for each website it creates.
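
The consumer loop can be sketched as below, with JSON standing in for protobuf and a plain channel standing in for the archaea consumer; the types and method names are placeholders, not Lynx's code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type Website struct {
	ID  string `json:"id"`
	URL string `json:"url"`
}

type WebsiteService struct{ created []Website }

func (s *WebsiteService) Create(w Website) { s.created = append(s.created, w) }

type Enricher struct{ queued []string }

func (e *Enricher) Enqueue(id string) { e.queued = append(e.queued, id) }

// WebsiteConsumer drains msgs until the channel closes:
// decode → Create → Enqueue enrichment.
func WebsiteConsumer(msgs <-chan []byte, svc *WebsiteService, enricher *Enricher) {
	for raw := range msgs {
		var w Website
		if err := json.Unmarshal(raw, &w); err != nil {
			continue // skip undecodable messages (real code would log)
		}
		svc.Create(w)
		enricher.Enqueue(w.ID)
	}
}

func main() {
	msgs := make(chan []byte, 1)
	msgs <- []byte(`{"id":"42","url":"https://example.com"}`)
	close(msgs)

	svc, enr := &WebsiteService{}, &Enricher{}
	WebsiteConsumer(msgs, svc, enr)
	fmt.Println(len(svc.created), enr.queued[0]) // 1 42
}
```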

CLI

The CLI is a cobra command tree. A global App struct holds a gRPC client connected to localhost:9000. The form.Form[T] generic uses reflection to present a promptui prompt for each exported string/int field of the protobuf struct.

Dependency injection flow

```mermaid
graph LR
    subgraph api["cmd/api/main.go"]
        DB["repo.NewDatabase"] --> Conn["*repo.Conn"]
        KConn["kafka.NewConn"] --> KProd["*kafka.Conn"]
        Conn --> FeedRepo --> FeedSvc["feed.NewFeedService"] --> FeedGRPC["feedgrpc.NewFeedService"]
        Conn --> EnrichActs["enrichment.Activities\nMetadataRepo · WebsiteNames"]
        EnrichActs --> Enricher["enrichment.NewEnricher"] -->|"go Start()"| EnricherR["Enricher running"]
        Conn --> WebRepo["WebsiteRepo"]
        KProd --> WebSvc["website.NewWebsiteService"]
        WebRepo --> WebSvc
        WebSvc --> WebGRPC["websitegrpc.NewWebsiteService\n.WithMetadataRepo()\n.WithEnricher()"]
    end
    subgraph worker["cmd/worker/main.go"]
        WFeedSvc["feed.NewFeedService"] --> FeedCon["FeedConsumer goroutine"]
        WEnrichActs["enrichment.Activities\nMetadataRepo · Storage · WebsiteNames"]
        WEnrichActs --> WEnricher["enrichment.NewEnricher"] -->|"go Start()"| WEnricherR["Enricher running"]
        WWebSvc["website.NewWebsiteService"] --> WebCon["WebsiteConsumer\nkafka · svc · enricher"]
    end
```

Environment Variables

API Server (cmd/api)

| Variable | Description |
|----------|-------------|
| DATABASE_URL | PostgreSQL connection string |
| KAFKA_BROKERS | Kafka broker address |

Worker (cmd/worker)

| Variable | Description |
|----------|-------------|
| DATABASE_URL | PostgreSQL connection string |
| KAFKA_BROKERS | Kafka broker address |
| MINIO_ENDPOINT | MinIO/S3 endpoint; archiving skipped if unset |
| MINIO_ACCESS_KEY | MinIO access key |
| MINIO_SECRET_KEY | MinIO secret key |
| MINIO_BUCKET | Bucket name (default: lynx) |