RAG / LLM Application Architecture

Free template — view it below, open it in draw.io, or customize it with AI in seconds.

The prompt behind this diagram

A retrieval-augmented generation (RAG) application architecture: the user query goes to a web app; the backend embeds the query with an embedding model and searches a vector database (populated with documents previously chunked, embedded and indexed from a document store); the retrieved context plus the query is sent to an LLM API; and the response is streamed back to the user. Include an ingestion pipeline branch showing document upload, chunking, embedding and indexing.
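The two flows in this prompt (ingestion and query) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production design: `embed` is a toy word-count embedding standing in for a real embedding model, `VectorIndex` is an in-memory stand-in for a vector database, and `fake_llm_stream` is a hypothetical placeholder for a streaming LLM API. None of these names come from the template itself.

```python
import math
import re
from collections import Counter
from typing import Iterator, List, Tuple


def chunk(text: str, size: int = 8) -> List[str]:
    """Split a document into chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text: str) -> Counter:
    """Toy embedding (assumption): lowercase word counts.
    A real system would call an embedding model here."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class VectorIndex:
    """In-memory stand-in for a vector database (assumption)."""

    def __init__(self) -> None:
        self.entries: List[Tuple[Counter, str]] = []

    def ingest(self, document: str) -> None:
        # Ingestion branch: chunk -> embed -> index.
        for piece in chunk(document):
            self.entries.append((embed(piece), piece))

    def search(self, query: str, k: int = 2) -> List[str]:
        # Query branch: embed the query, rank chunks by similarity.
        q = embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(q, e[0]), reverse=True)
        return [text for _, text in ranked[:k]]


def build_prompt(query: str, context: List[str]) -> str:
    """Combine the retrieved context and the user query into one prompt."""
    joined = "\n".join("- " + c for c in context)
    return "Context:\n" + joined + "\n\nQuestion: " + query


def fake_llm_stream(prompt: str) -> Iterator[str]:
    """Hypothetical LLM stand-in: echoes the prompt back token by token
    to show the streaming shape of the response."""
    for token in prompt.split():
        yield token + " "


def answer(index: VectorIndex, query: str, k: int = 2) -> Iterator[str]:
    """End-to-end query path: retrieve, build the prompt, stream the response."""
    return fake_llm_stream(build_prompt(query, index.search(query, k)))


if __name__ == "__main__":
    index = VectorIndex()
    index.ingest("Vector databases store embeddings for similarity search")
    index.ingest("Cats sleep for most of the day")
    for token in answer(index, "How do vector databases search embeddings?", k=1):
        print(token, end="")
    print()
```

In a real deployment, each stand-in maps to a box in the diagram: `embed` to the embedding model, `VectorIndex` to the vector database, and `fake_llm_stream` to the LLM API. Keeping ingestion and querying behind the same `embed` function reflects the requirement that documents and queries be embedded with the same model.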

Paste your own description (or Terraform / docker-compose / SQL schema) into draft1 and get a diagram like this for your exact system.

More templates

AWS 3-Tier Web Architecture

A production-ready AWS 3-tier architecture template: Route53, CloudFront, ALB, auto-scaling EC2, RDS multi-AZ and Elasti

Kubernetes Cluster Architecture

A complete Kubernetes cluster diagram template: control plane, worker nodes, ingress, service mesh sidecars and persiste

Microservices E-commerce Architecture

A microservices e-commerce architecture template: API gateway, seven services, Kafka event bus and database-per-service

E-commerce Database ER Diagram

An e-commerce database ER diagram template: users, products, orders, payments and reviews with keys and 1-N relationship

CI/CD Pipeline Architecture

A CI/CD pipeline diagram template: GitHub Actions stages, Docker registry, staging and production Kubernetes deploys wit

Serverless AWS Architecture

A serverless AWS architecture template: API Gateway, Lambda, DynamoDB, S3, Cognito, SQS workers and EventBridge schedule

Event-Driven Architecture with Kafka

An event-driven architecture template built on Kafka: producers, three-broker cluster, consumer groups, Kafka Streams an