TECHNIQUE

Basic RAG

Retrieval & Grounding

10 APPLICATIONS

13 OBSERVED OPERATORS

State of Practice

CROSS-VALIDATED — 6 OPERATORS

Across directly evidenced deployments, Basic RAG is a production pattern for adding source context to LLM systems, with operators differing mainly on corpus type, retrieval controls, and evaluation depth.

Observed Practices

Use retrieval-grounded context instead of relying on the LLM alone: operators retrieve source material, pass retrieved context to an LLM or query engine, and generate an answer, ranking decision, or classification from that context.

6 of 6 operators with direct Basic RAG evidence in this pool.

Agoda, Dropbox, Grab, LinkedIn, Rippling, Uber
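The retrieve, assemble, generate loop described above can be sketched as follows. This is a minimal illustration, not any operator's implementation: the bag-of-words `embed` function stands in for a real embedding model, `CORPUS` is made-up sample data, and `echo_llm` is a hypothetical stub that takes the place of an actual model call.

```python
import math
import re
from collections import Counter

def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())

def embed(text):
    # Assumption: a toy bag-of-words vector replaces a learned embedding.
    return Counter(tokenize(text))

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query, corpus, k=2):
    # Rank every document by similarity to the query and keep the top k.
    q = embed(query)
    return sorted(corpus, key=lambda doc: cosine(q, embed(doc)), reverse=True)[:k]

def build_prompt(query, passages):
    # Number the sources so the model (or a reviewer) can cite them.
    context = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return f"Answer using only the sources below.\n\nSources:\n{context}\n\nQuestion: {query}"

def answer(query, corpus, llm, k=2):
    return llm(build_prompt(query, retrieve(query, corpus, k)))

# Made-up sample corpus for illustration only.
CORPUS = [
    "Expense reports must be submitted within 30 days of purchase.",
    "The VPN client is required for access to staging databases.",
    "Annual leave requests need manager approval two weeks ahead.",
]

def echo_llm(prompt):
    # Hypothetical stub: returns the top-ranked source line instead of calling a model.
    return prompt.split("\n")[3]
```

Calling `answer("When are expense reports due?", CORPUS, echo_llm)` routes the expense-policy sentence into the prompt; swapping `echo_llm` for a real client is the only change needed to generate text.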

Ground against enterprise or product-specific corpora: internal wikis/docs, uploaded files, help-center and policy documents, historical incidents, job-posting attributes, or other domain records are connected as retrieval sources.

5 of 6 operators with direct Basic RAG evidence in this pool.

Agoda, Grab, LinkedIn, Rippling, Uber

Index retrieval material in vector infrastructure (counted only where the operator explicitly describes storage): past incidents, document chunks, or mined attributes are stored in a vector database or vector store for retrieval.

3 of 6 operators with direct Basic RAG evidence in this pool.

Agoda, LinkedIn, Uber
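As a rough sketch of the indexing practice above, the snippet below chunks a document, embeds each chunk, and stores the vectors for similarity search. Everything here is an assumption for illustration: `InMemoryVectorStore` is a hypothetical stand-in for a real vector database, the hashed-token `embed` function replaces a real embedding model, and the chunk size and overlap are arbitrary.

```python
import math
import zlib

DIM = 256  # assumed toy dimensionality

def embed(text):
    # Assumption: hashed token counts, L2-normalized, instead of a learned embedding.
    vec = [0.0] * DIM
    for token in text.lower().split():
        vec[zlib.crc32(token.encode()) % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec

def chunk(text, size=8, overlap=2):
    # Fixed-size word windows that overlap so sentences split across chunks stay findable.
    words = text.split()
    if not words:
        return []
    step = size - overlap
    return [" ".join(words[i:i + size]) for i in range(0, max(len(words) - overlap, 1), step)]

class InMemoryVectorStore:
    """Hypothetical stand-in for a vector database."""

    def __init__(self):
        self.rows = []

    def add(self, doc_id, text):
        for n, piece in enumerate(chunk(text)):
            self.rows.append({"id": f"{doc_id}#{n}", "text": piece, "vector": embed(piece)})

    def search(self, query, k=3):
        # Vectors are normalized, so the dot product equals cosine similarity.
        q = embed(query)
        scored = [(sum(a * b for a, b in zip(q, row["vector"])), row) for row in self.rows]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [row for _, row in scored[:k]]
```

Chunk IDs keep the parent document ID as a prefix, so a hit can be traced back to the incident or document it came from.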

Add query understanding or source narrowing before retrieval: operators classify intent, enrich queries with external/profile data, identify the relevant domain or document subset, or optimize ambiguous queries before sending context to the LLM.

3 of 6 operators with direct Basic RAG evidence in this pool.

LinkedIn, Rippling, Uber
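One way to picture the pre-retrieval step above is a small router that classifies the query's domain, enriches it with profile data, and narrows the candidate documents before any similarity search runs. The keyword table, the `profile` fields, and the document schema below are all hypothetical; real systems would more likely use a classifier model or an LLM call for the intent step.

```python
# Hypothetical keyword table; a production system would likely use a trained classifier.
DOMAIN_KEYWORDS = {
    "payroll": {"payroll", "salary", "payslip", "tax"},
    "security": {"vpn", "password", "phishing", "access"},
}

def classify_domain(query):
    # Pick the domain with the most keyword overlap, or None if nothing matches.
    tokens = set(query.lower().replace("?", " ").split())
    scores = {domain: len(tokens & words) for domain, words in DOMAIN_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None

def enrich(query, profile):
    # Append known user attributes so retrieval can match region- or role-specific content.
    extras = [f"{key}={value}" for key, value in sorted(profile.items())]
    return f"{query} ({', '.join(extras)})" if extras else query

def narrow(documents, domain):
    # With no confident domain, fall back to the full corpus rather than guessing.
    if domain is None:
        return documents
    return [doc for doc in documents if doc["domain"] == domain]
```

Falling back to the full corpus when no domain matches is a design choice in this sketch; an alternative is to ask the user a clarifying question.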

Use ranking, re-ranking, hybrid retrieval, or post-processing to control what context reaches the model.

4 of 6 operators with direct Basic RAG evidence in this pool.

Dropbox, LinkedIn, Rippling, Uber
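To make hybrid retrieval concrete, the sketch below merges a keyword ranking and a vector ranking with reciprocal rank fusion (RRF). RRF is used here as an assumption: the source does not say which fusion method any operator uses, and the constant `k=60` is simply a commonly cited default. The document IDs are made up.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one list.

    Each document scores 1 / (k + rank) per list it appears in; scores are summed.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by fused score, breaking ties by ID so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))

def select_context(rankings, top_n):
    # Only the top_n fused results are passed on to the model.
    return reciprocal_rank_fusion(rankings)[:top_n]
```

For example, fusing a keyword ranking `["a", "b", "c"]` with a vector ranking `["b", "d", "a"]` places `b` first, because it ranks highly in both lists, followed by `a`, `d`, and `c`.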

Expose RAG-backed capabilities through user-facing workflows rather than standalone demos: Slack bots, browser apps, chat interfaces, search experiences, or internal product surfaces.

5 of 6 operators with direct Basic RAG evidence in this pool.

Dropbox, Grab, LinkedIn, Rippling, Uber

Put evaluation, tracing, or review around RAG outputs when reliability is central to the use case.

3 of 6 operators with direct Basic RAG evidence in this pool.

Agoda, Dropbox, Rippling
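A retrieval regression check of the kind implied above might look like the following sketch. It assumes a hand-labeled golden set of (query, expected source ID) pairs and a `retriever` callable that returns ranked IDs; the metric (hit rate at k) and the tolerance value are illustrative choices, not something the operators are reported to use.

```python
def hit_rate_at_k(retriever, golden, k=3):
    # Fraction of golden queries whose expected source appears in the top k results.
    if not golden:
        raise ValueError("golden set must not be empty")
    hits = sum(1 for query, expected in golden if expected in retriever(query)[:k])
    return hits / len(golden)

def check_regression(current, baseline, tolerance=0.02):
    # Pass when the current score has not dropped more than `tolerance` below baseline.
    return current >= baseline - tolerance
```

Running this on every change to retrieval, ranking, or prompts gives an early signal before a previously good answer degrades in production.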

Where Operators Converge

Every directly evidenced operator uses retrieval to supply contextual source material to an LLM-centered system before producing the final user-facing or workflow-facing output.

Where Operators Diverge

Operators differ on what the RAG corpus represents.

APPROACH 01

Internal knowledge and work documents: wikis, Google Docs, uploaded files, help-center docs, handbooks, policy documents, and internal documentation are used as retrieval sources.

Grab, Rippling, Uber

APPROACH 02

Product/search attributes: mined job-posting attributes are stored in a vector database and passed to the query engine LLM via RAG.

APPROACH 03

Historical operational cases: past incidents and root-cause analyses are queried to support alert triage.

Agoda

Operators differ on retrieval control: some describe a relatively direct RAG/vector pattern, while others add explicit query planning, domain scoping, hybrid retrieval, ranking, or re-ranking.

APPROACH 01

Direct RAG/vector retrieval is the described mechanism.

Agoda, Grab, LinkedIn

APPROACH 02

Controlled retrieval adds query optimization, source identification, BM25 alongside vector search, post-processing, semantic-layer domain scoping, or aggressive re-ranking.

Rippling, Uber

Operators differ on reliability controls around RAG outputs.

APPROACH 01

Automated regression, staging, production sampling, tracing, evaluations, and monitoring are described as part of operating the system.

Dropbox, Rippling

APPROACH 02

Human review or escalation remains in the workflow when context is insufficient or publication is needed.

Agoda
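The human-review approach can be sketched as a gate that runs after retrieval and before generation. The thresholds, the passage schema, and the action names below are hypothetical; the point is only that weak or sparse context is routed to a person rather than to the model.

```python
def route(passages, min_score=0.5, min_passages=2):
    """Decide whether retrieved context is strong enough to answer automatically.

    `passages` is a list of dicts with a "score" key (assumed similarity in [0, 1]).
    """
    strong = [p for p in passages if p["score"] >= min_score]
    if len(strong) < min_passages:
        # Bias toward escalation: missing context goes to a human reviewer.
        return {"action": "escalate_to_human", "context": strong}
    return {"action": "answer_with_llm", "context": strong}
```

Setting `min_passages` above 1 deliberately over-escalates, trading reviewer time for fewer answers built on thin evidence.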

Watch Items

Answer quality and hallucination remain operational risks: Dropbox reports that changes to retrieval, ranking, prompts, model inference, or safety filtering can turn a prior good answer into a hallucination; Uber frames accuracy, relevance, and avoiding misinformation as a continuing challenge.

Missing, ambiguous, or over-broad context needs special handling: Uber adds query optimization and source identification for queries that are ambiguous or lack context, while Agoda says cases without full context should over-escalate to human review.

Context volume and retrieval precision become bottlenecks at scale: Rippling uses re-rankers to reduce context size by 100 to 500x, Uber narrows retrieval to an identified document set and post-processes retrieved chunks, and LinkedIn says raw query embeddings alone are not enough for job retrieval.
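To illustrate how re-ranking shrinks context, the sketch below keeps the highest-scoring chunks that fit a token budget and reports the resulting reduction factor. It is an assumption-heavy toy: scores are taken as given from some re-ranker, and token cost is approximated by whitespace word count rather than a real tokenizer.

```python
def pack_context(scored_chunks, budget_tokens):
    """Greedily keep the best-scoring chunks that fit within the budget.

    `scored_chunks` is a list of (score, text) pairs from a hypothetical re-ranker.
    """
    kept, used = [], 0
    for _, text in sorted(scored_chunks, key=lambda c: c[0], reverse=True):
        cost = len(text.split())  # assumption: one token per whitespace-separated word
        if used + cost > budget_tokens:
            continue  # skip chunks that would overflow; smaller ones may still fit
        kept.append(text)
        used += cost
    return kept, used

def reduction_factor(scored_chunks, used):
    # Ratio of candidate tokens to tokens actually sent to the model.
    total = sum(len(text.split()) for _, text in scored_chunks)
    return total / used if used else float("inf")
```

With these definitions, a reduction factor of 100 would mean that only 1% of the candidate tokens reach the model.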

Implementation Menu

CURATED DEFAULTS

Name	Kind	When	Maturity
pgvector	library	embeddings should live beside existing Postgres data with one operational store	commodity
Qdrant	service	dedicated vector search with payload filtering at scale	established
LlamaIndex	library	fastest path to ingestion plus retrieval scaffolding in one framework	established

Observed in Production

10 APPS

LLM Application Quality Assurance

LLM-Assisted Code Review, Test Migration, and Agent Evaluation

Enterprise Search Synthetic Evaluation Data Generation

Security and Privacy Policy On-Call Support Copilot

AI Security Decision Audit and Incident Report Generation

AI-Assisted Product and Developer Collaboration Workflows

Code and Query Defect Validation and Repair

LLM SQL and Knowledge Base Quality Evaluation

Permission-Aware Enterprise AI Search and PII Tagging

Sales Rep Meeting Prep and Curate Help Prompt Library