RAG Subsystem (Retrieval-Augmented Generation)
The RAG subsystem provides a unified interface for managing knowledge sources and exposing them through high-performance retrieval APIs. It adds an abstraction layer over the retrieval backends, so you can switch between external services and local vector stores without changing your application code.
🏗 Architecture
The system is built on a Provider Pattern:
- Knowledge Sources: Define where data lives (e.g., External Service, Milvus, Postgres).
- RAG Endpoints: Define how a knowledge source is queried. Each endpoint has a unique slug and an access control policy.
- Retrieval Providers: Handle the execution logic (HTTP proxy, Vector Search, etc.).
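The relationship between the three pieces can be sketched in a few lines of Python. This is only an illustration of the pattern: the class names, the `PROVIDERS` registry, and `EchoProvider` are hypothetical and are not taken from the subsystem's code.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Protocol


@dataclass
class KnowledgeSource:
    """Where the data lives (hypothetical shape, for illustration only)."""
    source_type: str                       # e.g. "external" or "local_index"
    config: dict = field(default_factory=dict)


class RetrievalProvider(Protocol):
    """Execution logic for one kind of knowledge source."""

    def retrieve(self, source: KnowledgeSource, query: str, top_k: int) -> list[dict]: ...


class EchoProvider:
    """Stand-in provider that fabricates results so the sketch runs offline."""

    def retrieve(self, source: KnowledgeSource, query: str, top_k: int) -> list[dict]:
        return [
            {"id": f"{source.source_type}-{i}", "score": 1.0 - i * 0.1, "content": query}
            for i in range(top_k)
        ]


# Hypothetical registry: maps a source type to the provider that handles it.
PROVIDERS: dict[str, RetrievalProvider] = {
    "external": EchoProvider(),
    "local_index": EchoProvider(),
}


@dataclass
class RAGEndpoint:
    """How a source is queried: a slug plus the source it points at."""
    slug: str
    source: KnowledgeSource

    def query(self, query: str, top_k: int = 3) -> list[dict]:
        # The endpoint never talks to a backend directly; it delegates to a provider.
        provider = PROVIDERS[self.source.source_type]
        return provider.retrieve(self.source, query, top_k)
```

Because callers only see `RAGEndpoint.query`, swapping the backend means changing the source's type, not the calling code.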
🚀 Quickstart
1. External Gateway (Proxy)
Use this mode to securely proxy queries to an existing RAG microservice.
- Create Knowledge Source:
  - Type: External RAG Service
  - Config: {"base_url": "https://my-rag-service.com"}
  - Credentials: env://MY_SERVICE_API_KEY
- Create RAG Endpoint:
  - Slug: my-service-proxy
  - Source: Link to the source created above.
- Query:
  POST /api/rag/my-service-proxy/query
  { "query": "hello", "top_k": 3 }
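The query step can also be issued from Python. The sketch below only builds the request using the standard library; the host `http://localhost:8000` and the helper name `build_query_request` are assumptions, and any authentication headers your deployment requires are not shown.

```python
import json
import urllib.request


def build_query_request(base_url: str, slug: str, query: str, top_k: int = 3) -> urllib.request.Request:
    """Build a POST request for /api/rag/<slug>/query (hypothetical helper)."""
    body = json.dumps({"query": query, "top_k": top_k}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/rag/{slug}/query",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Assumed host; replace with wherever the Django app is served.
req = build_query_request("http://localhost:8000", "my-service-proxy", "hello")

# Sending is left commented out because it needs a running server:
# with urllib.request.urlopen(req, timeout=30) as resp:
#     print(json.load(resp))
```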
2. Local Indexing (Native)
Use this mode to run RAG entirely within Django using Milvus or PGVector.
Step A: Configure Embedding Model
Go to Admin > RAG > Embedding Models:
* Name: OpenAI Ada 002
* Key: openai-ada-002
* Provider: OpenAI
* Config: {"model_name": "text-embedding-ada-002"}
* Credentials: env://OPENAI_API_KEY
Step B: Create Knowledge Source
Go to Admin > RAG > Knowledge Sources:
* Type: Local Index (Milvus/PGVector)
* Config (Milvus):
  {
    "vector_store": "milvus",
    "milvus_uri": "http://localhost:19530",
    "collection_name": "my_docs",
    "embedding_model": "openai-ada-002"
  }
* Config (PGVector), as an alternative:
  {
    "vector_store": "pgvector",
    "table_name": "my_vector_table",
    "embedding_model": "openai-ada-002"
  }
Step C: Query
Create an endpoint pointing to this source and query it normally. The system automatically:
1. Embeds the query using the configured model.
2. Searches the vector store.
3. Returns normalized results.
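The embed, search, and normalize flow can be illustrated with a toy in-memory version. Everything here is a stand-in: `toy_embed` replaces the configured embedding model, a plain dict replaces Milvus or PGVector, and the result keys (`id`, `score`, `content`) are an assumed shape rather than the subsystem's actual response format.

```python
from __future__ import annotations

import math


def toy_embed(text: str, dim: int = 8) -> list[float]:
    """Stand-in for the embedding model: counts words into deterministic buckets."""
    vec = [0.0] * dim
    for word in text.lower().split():
        vec[sum(ord(c) for c in word) % dim] += 1.0
    return vec


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def query_local_index(index: dict[str, str], query: str, top_k: int = 3) -> list[dict]:
    # 1. Embed the query.
    q_vec = toy_embed(query)
    # 2. Search: brute-force similarity over every stored document.
    scored = [(cosine(q_vec, toy_embed(text)), doc_id, text) for doc_id, text in index.items()]
    scored.sort(key=lambda item: item[0], reverse=True)
    # 3. Normalize into a uniform result shape (assumed keys).
    return [
        {"id": doc_id, "score": round(score, 4), "content": text}
        for score, doc_id, text in scored[:top_k]
    ]
```

A real vector store replaces the brute-force loop in step 2 with an approximate nearest-neighbour index, but the three stages stay the same.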
🛡 Security
SSRF Protection
All external requests (Webhooks, RAG Proxy) go through a hardened HTTP client that:
* Blocks Private IPs: Prevents access to loopback, link-local, and other internal addresses (localhost, 127.0.0.1, 169.254.x.x, etc.).
* Disables Redirects: Prevents redirect-based SSRF attacks.
* Enforces Timeouts: Defaults to 30s.
* Limits Response Size: Prevents DoS via large payloads.
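As a rough illustration of the private-IP check, the sketch below validates a URL with Python's standard `ipaddress` module. The function names are hypothetical, and the sketch covers literal IPs and `localhost` only; a hardened client would also resolve hostnames and check every resolved address, and it would enforce the redirect, timeout, and size limits separately.

```python
import ipaddress
from urllib.parse import urlparse


def is_blocked_address(host: str) -> bool:
    """Return True for 'localhost' and for literal loopback, private, or link-local IPs."""
    if host.lower() == "localhost":
        return True
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Not a literal IP. Assumption: DNS resolution and re-checking happen elsewhere.
        return False
    return ip.is_private or ip.is_loopback or ip.is_link_local


def check_url(url: str) -> None:
    """Raise ValueError if the URL uses a non-HTTP scheme or targets a blocked address."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        raise ValueError(f"unsupported scheme: {parsed.scheme!r}")
    if not parsed.hostname or is_blocked_address(parsed.hostname):
        raise ValueError(f"blocked host: {parsed.hostname!r}")
```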
Secret Management
Never store raw secrets in configuration. Use SecretRef:
* env://VAR_NAME: Read from environment variable.
* db://secret_name: Read from SecretStore model.
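A resolver for these two reference schemes might look like the sketch below. The function name and the `db_lookup` callable are hypothetical; in particular, how the SecretStore model is actually queried is not described here, so the database lookup is injected as a plain function.

```python
import os
from typing import Callable, Optional


def resolve_secret_ref(ref: str, db_lookup: Optional[Callable[[str], str]] = None) -> str:
    """Resolve 'env://VAR_NAME' or 'db://secret_name' to the secret value."""
    scheme, sep, name = ref.partition("://")
    if not sep or not name:
        raise ValueError(f"not a secret reference: {ref!r}")
    if scheme == "env":
        value = os.environ.get(name)
        if value is None:
            raise KeyError(f"environment variable {name} is not set")
        return value
    if scheme == "db":
        if db_lookup is None:
            # Assumption: the caller supplies a function that reads the SecretStore model.
            raise RuntimeError("no database lookup configured")
        return db_lookup(name)
    raise ValueError(f"unsupported secret scheme: {scheme!r}")
```

Rejecting strings without a scheme means a raw secret pasted into a config field fails loudly instead of being used silently.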
Access Control (RBAC)
RAG Endpoints support granular access policies:
{
  "require_authenticated": true,
  "allowed_groups": ["rag_users", "staff"],
  "allowed_users": ["admin@example.com"],
  "denied_users": ["abuser@example.com"]
}
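One plausible way to evaluate such a policy is sketched below. The precedence is an assumption, not documented behaviour: denials win first, then the authentication requirement, then the allow-lists, and a policy with no allow-lists admits everyone who passed the earlier checks.

```python
from typing import Iterable, Optional, Tuple


def evaluate_policy(
    policy: dict,
    user_id: Optional[str] = None,
    groups: Iterable[str] = (),
) -> Tuple[bool, str]:
    """Return (allowed, reason) for a request. Precedence order is assumed."""
    if user_id is not None and user_id in policy.get("denied_users", []):
        return False, "user is in denied_users"
    if policy.get("require_authenticated", False) and user_id is None:
        return False, "authentication required"

    allowed_users = policy.get("allowed_users", [])
    allowed_groups = policy.get("allowed_groups", [])
    if not allowed_users and not allowed_groups:
        return True, "no allow-list configured"
    if user_id in allowed_users:
        return True, "user is in allowed_users"
    if set(groups) & set(allowed_groups):
        return True, "user belongs to an allowed group"
    return False, "user is not in allowed_users or allowed_groups"
```

Returning a reason alongside the decision makes it straightforward to record why a request was allowed or denied.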
📊 Observability
Query Logs
Every query is logged to RAGQueryLog with:
* Trace ID: Unique correlation ID for the request.
* Latency: End-to-end execution time.
* Results Metadata: Source IDs and scores (content is not logged by default for privacy).
* Policy Decisions: Why a request was allowed or denied.
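The fields above could be captured by a wrapper along these lines. This is a sketch with hypothetical names (`QueryLogEntry`, `run_logged_query`); it does not reflect the actual RAGQueryLog model, and it assumes results are dicts with `id` and `score` keys.

```python
import time
import uuid
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class QueryLogEntry:
    """In-memory stand-in for one log record (not the real RAGQueryLog model)."""
    trace_id: str
    latency_ms: float
    results_metadata: List[dict]
    policy_decision: str


def run_logged_query(
    search: Callable[[str], List[dict]],
    query: str,
    policy_decision: str = "allowed",
) -> Tuple[List[dict], QueryLogEntry]:
    trace_id = uuid.uuid4().hex              # unique correlation ID for the request
    start = time.perf_counter()
    results = search(query)
    latency_ms = (time.perf_counter() - start) * 1000.0
    # Keep only source IDs and scores; result content is deliberately left out of the log.
    metadata = [{"source_id": r["id"], "score": r["score"]} for r in results]
    return results, QueryLogEntry(trace_id, latency_ms, metadata, policy_decision)
```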
Test Query UI
Admin staff can test connections directly from the dashboard:
1. Go to RAG Endpoints.
2. Click an endpoint.
3. Use the "Test Query" tab to run live searches and view raw JSON responses.