RAG / LLM Application Architecture

Free template — view it below, open it in draw.io, or customize it with AI in seconds.

The prompt behind this diagram

A retrieval-augmented generation (RAG) application architecture: the user query goes to a web app; the backend embeds the query with an embedding model and searches a vector database (populated with documents previously chunked, embedded, and indexed from a document store); the retrieved context plus the query is sent to an LLM API; and the response is streamed back to the user. Include an ingestion pipeline branch showing document upload, chunking, embedding, and indexing.
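The flow described in the prompt can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `embed` is a toy bag-of-words stand-in for a real embedding model, `VectorIndex` is an in-memory substitute for a vector database, and `fake_llm_stream` is a placeholder for a streaming LLM API call. All names here are hypothetical and not part of the template.

```python
import math
import re
from collections import Counter
from typing import Iterator


def chunk(text: str, size: int = 8) -> list[str]:
    """Split a document into chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text: str) -> Counter:
    """Toy embedding (assumption): lowercase term counts instead of a real model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class VectorIndex:
    """In-memory stand-in for a vector database."""

    def __init__(self) -> None:
        self.items: list[tuple[Counter, str]] = []

    def ingest(self, document: str) -> None:
        # Ingestion branch: chunk -> embed -> index.
        for piece in chunk(document):
            self.items.append((embed(piece), piece))

    def search(self, query: str, k: int = 2) -> list[str]:
        # Query branch: embed the query, return the k most similar chunks.
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[0]), reverse=True)
        return [text for _, text in ranked[:k]]


def build_prompt(query: str, context: list[str]) -> str:
    """Combine the retrieved context and the user query into one prompt."""
    return "Context:\n" + "\n".join(context) + "\n\nQuestion: " + query


def fake_llm_stream(prompt: str) -> Iterator[str]:
    """Placeholder for a streaming LLM API: echoes the prompt word by word."""
    for word in prompt.split():
        yield word + " "


def answer(index: VectorIndex, query: str) -> Iterator[str]:
    """Retrieve context for the query and stream the (placeholder) response."""
    context = index.search(query)
    yield from fake_llm_stream(build_prompt(query, context))


if __name__ == "__main__":
    index = VectorIndex()
    index.ingest("Vector databases store embeddings and support nearest neighbour search.")
    index.ingest("Kafka is an event streaming platform.")
    for token in answer(index, "How do vector databases support search?"):
        print(token, end="")
    print()
```

In a real system the two branches would typically run separately: ingestion as a background job triggered by document upload, and retrieval plus generation inside the request path of the backend.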

Paste your own description (or Terraform / docker-compose / SQL schema) into draft1 and get a diagram like this for your exact system.


More templates

AWS 3-Tier Web Architecture

Kubernetes Cluster Architecture

Microservices E-commerce Architecture

E-commerce Database ER Diagram

CI/CD Pipeline Architecture

Serverless AWS Architecture

Event-Driven Architecture with Kafka