
RAG Service (retired W2-7, 2026-04-30)

The standalone services/rag-service was folded into services/ai-service in W2-7 (ADR-002 Wave 2). The RAG pipeline — document ingestion, vector storage, semantic search, RAG completion — now runs in-process inside ai-service.

Where things live now

| Pre-W2-7 | Post-W2-7 |
|---|---|
| services/rag-service/app/services/vector_service.py | services/ai-service/app/services/vector_service.py |
| services/rag-service/app/services/docling_service.py | services/ai-service/app/services/docling_service.py |
| services/rag-service/app/services/llm_service.py (raw openai SDK) | services/ai-service/app/services/ai_tools.py (get_embeddings, chat_with_context via langchain_openai) |
| services/rag-service/app/api/endpoints/{search,chat,convert}.py | services/ai-service/app/api/endpoints/rag_ops.py (mounted at /api/v1/{search,chat,convert}) |
| services/rag-service/app/services/rabbitmq_consumer.py (rag_service_queue) | services/ai-service/app/services/rabbitmq_consumer.py (ai_service_rag_documents queue, same document.uploaded + document.deleted bindings) |
| rag.* MCP tools (httpx-proxied to rag-service) | rag.* MCP tools (in-process function calls in services/ai-service/app/services/mcp_server.py) |
| rag-service-{dev,qual,prod} containers | (removed) |
| RAG_SERVICE_URL env var (consumers) | AI_SERVICE_URL env var |
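
For consumers that used to read RAG_SERVICE_URL, the switch might look roughly like the sketch below. It is illustrative only: the helper name rag_endpoint_url and the example host are hypothetical, and the only facts taken from this page are the AI_SERVICE_URL variable and the /api/v1/{search,chat,convert} mount points.

```python
import os

# Endpoint names taken from the mapping above; ai-service mounts them under /api/v1.
RAG_ENDPOINTS = ("search", "chat", "convert")


def rag_endpoint_url(name, env=None):
    """Build the URL of a RAG endpoint that is now served by ai-service.

    `env` defaults to os.environ; passing a dict makes the helper easy to test.
    The helper itself is hypothetical and not part of either service.
    """
    env = os.environ if env is None else env
    if name not in RAG_ENDPOINTS:
        raise ValueError(f"unknown RAG endpoint: {name!r}")
    base = env.get("AI_SERVICE_URL")
    if not base:
        # RAG_SERVICE_URL is intentionally not consulted: it was retired in W2-7.
        raise RuntimeError("AI_SERVICE_URL is not set")
    return f"{base.rstrip('/')}/api/v1/{name}"


if __name__ == "__main__":
    # Hypothetical host and port, for illustration only.
    print(rag_endpoint_url("search", {"AI_SERVICE_URL": "http://ai-service:8000"}))
```

Failing loudly when AI_SERVICE_URL is missing, rather than silently falling back to the old variable, is a design choice of this sketch; it makes a stale consumer configuration visible at startup.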

What was deleted

  • app/services/llm_service.py (54 LoC) — duplicated ai-service's LangChain ChatOpenAI; ai-service's ai_tools.py covers chat completions and now embeddings via langchain_openai.OpenAIEmbeddings.
  • app/api/endpoints/monitor.py (26 LoC) — broken Qdrant artifact (referenced vector_service.client.get_collection which hasn't existed since W1-4).

What was preserved

  • rag_qual / rag_prod Postgres databases (single Postgres cluster — ai-service connects via the same DSN shape).
  • docling-serve sidecar (per env: docling-serve / docling-serve-qual / docling-serve-prod).
  • The 9-tool MCP contract (6 ai.* + 3 rag.*) — external agents see no change.
  • partner.events publishes from partner-service (document.uploaded + document.deleted) — same exchange, same routing keys.
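
The unchanged event contract can be summarised as data. The sketch below is an assumption-laden illustration, not the actual rabbitmq_consumer.py: the exchange, queue, and routing-key names come from this page, while the function names and the "ingest" / "remove" action labels are hypothetical.

```python
# Names below are taken from this page.
EXCHANGE = "partner.events"
QUEUE = "ai_service_rag_documents"
ROUTING_KEYS = ("document.uploaded", "document.deleted")

# Hypothetical action labels; the real consumer's handler names are not documented here.
ACTIONS = {
    "document.uploaded": "ingest",
    "document.deleted": "remove",
}


def bindings():
    """Return (queue, exchange, routing_key) triples the consumer would declare."""
    return [(QUEUE, EXCHANGE, key) for key in ROUTING_KEYS]


def action_for(routing_key):
    """Map a routing key to the pipeline action; unknown keys are rejected."""
    try:
        return ACTIONS[routing_key]
    except KeyError:
        raise ValueError(f"unexpected routing key: {routing_key!r}") from None


if __name__ == "__main__":
    for queue, exchange, key in bindings():
        print(f"{exchange} --{key}--> {queue}: {action_for(key)}")
```

Because only the queue name changed, publishers such as partner-service need no modification under this model; the two routing keys remain the whole interface.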

See also

  • ai-service README — current home of the RAG pipeline
  • AI & RAG Architecture — high-level diagram, post-W2-7
  • ADR-002 §W2-7 — consolidation log
  • ADR-004 — original two-Python-service decision, superseded in part by W2-7