Side-by-side analyses of related patterns. Each comparison covers the key trade-offs, when to use each pattern, and how they work together.
Compare Basic RAG and Agentic RAG retrieval patterns. Learn when a fixed pipeline is enough and when agent-controlled retrieval is worth the complexity.
Start with Basic RAG for predictable single-source queries; graduate to Agentic RAG when users need multi-step, multi-source reasoning.
Compare Basic RAG and Deep Search retrieval patterns. Learn when a single retrieval step is enough and when you need iterative multi-hop research.
Use Basic RAG for straightforward factual questions where one retrieval pass is sufficient; use Deep Search for complex research questions that require synthesizing information across multiple sources.
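The multi-hop loop at the heart of Deep Search can be sketched in a few lines. This is a minimal illustration, not a real implementation: the corpus, the exact-match retriever, and the follow-up lookup table are all stand-ins for a search backend and an LLM that proposes the next question from the evidence gathered so far.

```python
# Deep Search in miniature: retrieve, derive a follow-up question,
# repeat until no new information surfaces or the hop budget runs out.

CORPUS = {
    "Who founded Acme?": "Acme was founded by Jane Doe.",
    "Who is Jane Doe?": "Jane Doe is a robotics researcher based in Zurich.",
}

def retrieve(query: str):
    """Toy retriever: exact-match lookup against the corpus."""
    return CORPUS.get(query)

def deep_search(question: str, follow_ups: dict, max_hops: int = 3) -> list:
    """Follow a chain of questions, collecting evidence at each hop."""
    evidence, query = [], question
    for _ in range(max_hops):
        passage = retrieve(query)
        if passage is None:
            break
        evidence.append(passage)
        # In a real system an LLM would propose the next question from
        # the evidence; here a lookup table plays that role.
        query = follow_ups.get(query)
        if query is None:
            break
    return evidence

hops = {"Who founded Acme?": "Who is Jane Doe?"}
print(deep_search("Who founded Acme?", hops))
```

Basic RAG is the `max_hops=1` special case of this loop: one retrieval, no follow-ups.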
Understand how Basic RAG and Semantic Indexing relate. One is the full pipeline, the other upgrades its retrieval layer with vector search.
This is not an either/or choice. Semantic Indexing is an upgrade to the retrieval step inside Basic RAG. Start with keyword-based RAG to validate your pipeline, then add Semantic Indexing when keyword matching falls short.
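The upgrade is easiest to see side by side. In this sketch the two-dimensional "embeddings" are hypothetical hand-picked vectors standing in for a real embedding model; the point is only that vector similarity can match "car repair" to an automobile document that shares no keywords with the query.

```python
import math

DOCS = ["the automobile engine overheated", "the recipe calls for basil"]

# Hypothetical toy embeddings; a real system would call an embedding model.
EMBEDDINGS = {
    "car repair": [0.9, 0.1],
    DOCS[0]: [0.85, 0.15],
    DOCS[1]: [0.1, 0.9],
}

def keyword_score(query: str, doc: str) -> int:
    """Count shared tokens between query and document."""
    return len(set(query.split()) & set(doc.split()))

def cosine(a, b) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm if norm else 0.0

query = "car repair"
# Keyword matching finds nothing: zero token overlap with either document.
print([keyword_score(query, d) for d in DOCS])  # [0, 0]
# Vector search still ranks the automobile document first.
best = max(DOCS, key=lambda d: cosine(EMBEDDINGS[query], EMBEDDINGS[d]))
print(best)
```

Swapping `keyword_score` for `cosine` over embeddings is exactly the Semantic Indexing upgrade inside an otherwise unchanged Basic RAG pipeline.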
Compare Chain-of-Thought single-path reasoning with Self-Consistency multi-path voting to decide which prompting strategy fits your accuracy and cost needs.
Start with Chain-of-Thought for most tasks; add Self-Consistency when you need higher reliability on questions with a single correct answer and can afford the extra API calls.
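Self-Consistency reduces to sampling several reasoning paths and keeping the majority answer. In this sketch `sample_answer` is a stand-in for calling the model at temperature > 0; each real path would be an independent API call, which is where the extra cost comes from.

```python
from collections import Counter

def sample_answer(seed: int) -> str:
    # Stub: two of every three hypothetical reasoning paths reach "42".
    return ["42", "42", "41"][seed % 3]

def self_consistent_answer(n_samples: int = 5) -> str:
    """Sample n reasoning paths and return the majority-vote answer."""
    votes = Counter(sample_answer(i) for i in range(n_samples))
    return votes.most_common(1)[0][0]

print(self_consistent_answer())  # "42"
```

A single Chain-of-Thought pass is the `n_samples=1` case: one path, no vote, one API call.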
Compare Code Execution and ReAct Loop patterns for AI agents. Learn when to use a sandbox runtime vs a reasoning-action orchestration loop.
Code Execution is a tool for computational precision; ReAct is an orchestration pattern for multi-step reasoning. Most capable agents use ReAct as the loop with Code Execution as one of its tools.
Compare Conversation Memory and Long-Term Memory for LLM apps. One manages context within a session, the other persists across sessions.
Use Conversation Memory to manage context within a single session, and Long-Term Memory to persist user facts and preferences across sessions. Most production chatbots need both.
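The division of labor can be sketched with two tiny classes. The sliding window and the in-memory dict are illustrative stand-ins: production systems would use token-aware truncation or summarization for the session window, and a database or vector store for the persistent layer.

```python
from collections import deque

class ConversationMemory:
    """Keeps only the last n turns of the current session in context."""
    def __init__(self, max_turns: int = 3):
        self.turns = deque(maxlen=max_turns)

    def add(self, turn: str):
        self.turns.append(turn)

    def context(self) -> list:
        return list(self.turns)

class LongTermMemory:
    """Persists user facts across sessions (dict stands in for a database)."""
    def __init__(self):
        self.facts = {}

    def remember(self, key: str, value: str):
        self.facts[key] = value

    def recall(self, key: str):
        return self.facts.get(key)

short = ConversationMemory(max_turns=2)
long_term = LongTermMemory()
long_term.remember("preferred_language", "Python")
for turn in ["hi", "what's RAG?", "thanks"]:
    short.add(turn)
print(short.context())                         # only the last 2 turns survive
print(long_term.recall("preferred_language"))  # survives across sessions
```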
Compare Deep Search and Agentic RAG patterns. One follows a structured multi-hop cycle, the other gives the agent freedom to choose retrieval strategy.
Use Deep Search for research-style tasks where thoroughness and auditability matter, and Agentic RAG when the retrieval strategy itself is uncertain and the agent needs flexibility to adapt.
Compare Few-Shot Prompting and Chain-of-Thought. Learn when to use examples for format consistency versus step-by-step reasoning for logic tasks.
Use Few-Shot for output format and style consistency; use Chain-of-Thought for reasoning-heavy tasks like math, logic, and multi-step analysis.
Compare Guardrails and Grounded Generation for building safe, reliable LLM apps. One blocks harmful content, the other prevents hallucinations.
Use Guardrails to enforce safety policies and block harmful content. Use Grounded Generation to ground outputs in evidence and prevent hallucinations. Production systems need both.
Compare Guardrails and Self-Check safety patterns for LLM applications. Learn when to use external policy filters versus internal confidence analysis.
Use Guardrails to enforce policy at system boundaries; use Self-Check to catch hallucinations in generated output; deploy both for production safety.
Hybrid Retrieval improves what you find. Retrieval Refinement improves what you keep. Learn how both fix different RAG quality problems.
Most production RAG pipelines need both: Hybrid Retrieval to find the right documents, then Retrieval Refinement to filter and rank what you found.
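One common way to wire the two together is Reciprocal Rank Fusion followed by a top-k cut. In this sketch the two input rankings are hypothetical outputs of a keyword (BM25-style) index and a vector index; a real refinement stage might also rerank with a cross-encoder before cutting.

```python
def rrf(rankings: list, k: int = 60) -> dict:
    """Reciprocal Rank Fusion: score each doc by 1/(k + rank) per ranking."""
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return scores

# Hybrid Retrieval: two retrievers surface overlapping candidate lists.
keyword_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_b", "doc_d", "doc_a"]
fused = rrf([keyword_hits, vector_hits])

# Retrieval Refinement: keep only the strongest fused results.
kept = sorted(fused, key=fused.get, reverse=True)[:2]
print(kept)  # doc_b wins: it ranked well in both lists
```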
Compare LLM-as-Judge and Reflection patterns. One scores outputs post-hoc, the other improves them iteratively. Learn which to use.
Use LLM-as-Judge when you need to evaluate or rank outputs at scale, and Reflection when you want the model to iteratively improve its own output before returning it.
Compare Model Router and Cascading patterns for LLM cost optimization. Learn when to classify-then-route versus try-cheap-first-then-escalate.
Use a Model Router when query complexity is predictable from the input; use Cascading when you cannot judge difficulty until you see the output quality.
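The structural difference fits in a few lines: the router commits before generating, the cascade judges after. Everything here is a stand-in — `cheap_model` and `strong_model` for two model tiers, and the length checks for a learned complexity classifier and a real output-quality judge.

```python
def cheap_model(query: str) -> str:
    return "short answer"

def strong_model(query: str) -> str:
    return "thorough answer"

def route(query: str) -> str:
    # Model Router: decide up front from the input alone.
    model = strong_model if len(query.split()) > 8 else cheap_model
    return model(query)

def cascade(query: str) -> str:
    # Cascading: try cheap first, judge the draft, escalate if it looks weak.
    draft = cheap_model(query)
    if len(draft.split()) < 3:   # toy quality check
        return strong_model(query)
    return draft

print(route("what is RAG"))    # short query -> cheap path, no second call
print(cascade("what is RAG"))  # draft judged weak -> escalates, pays twice
```

The trade-off is visible in the call pattern: routing is one model call per query but needs a trustworthy classifier; cascading needs no classifier but pays for two calls whenever it escalates.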
Compare Multi-Agent Collaboration and Plan-and-Execute patterns for complex LLM tasks. One uses specialized agents, the other uses a single planner and executor.
Start with Plan-and-Execute for structured tasks with clear steps. Move to Multi-Agent Collaboration when your task genuinely requires different expertise or parallel workstreams that a single executor cannot handle.
Compare Prompt Caching and Small Language Models as cost reduction strategies. One eliminates redundant computation, the other makes each computation cheaper.
Use Prompt Caching when your queries are repetitive and you need frontier model capability. Use Small Language Models when the task is well-defined and does not require the largest models. Combine both for maximum cost savings.
Prompt Chaining follows a fixed sequence of steps. ReAct Loop decides its next action dynamically. Compare both orchestration patterns.
Use Prompt Chaining for well-understood workflows with predictable steps; use ReAct Loop when the path to the answer depends on what the model discovers along the way.
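The contrast in control flow is easy to show. In this sketch the chain's steps are plain string transforms standing in for LLM calls, and the ReAct loop's "reasoning" is a digit check standing in for a model deciding its next action; the `tools` table is likewise hypothetical.

```python
def chain(text: str) -> str:
    # Prompt Chaining: every input takes the same fixed sequence of steps.
    for step in (str.strip, str.lower, lambda s: s.replace(" ", "-")):
        text = step(text)
    return text

def react(goal: str, tools: dict) -> str:
    # ReAct-style loop: each iteration inspects the latest observation
    # and chooses the next action, up to an iteration budget.
    state = goal
    for _ in range(5):
        if state.isdigit():
            return state  # observation says we're done
        action = "extract" if any(c.isdigit() for c in state) else "lookup"
        state = tools[action](state)
    return state

tools = {
    "lookup": lambda s: "order 1234 shipped",
    "extract": lambda s: "".join(c for c in s if c.isdigit()),
}
print(chain("  Hello World  "))              # hello-world
print(react("find my order number", tools))  # 1234
```

The chain's path is known before it runs; the ReAct loop's path (`lookup`, then `extract`, then stop) only emerges from what each step observes.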
Compare Prompt Optimization and Few-Shot Prompting for LLM systems. Learn when automated optimization beats hand-picked examples.
Start with Few-Shot Prompting to get a working baseline quickly; graduate to Prompt Optimization when you have evaluation data and need systematic, reproducible improvements.
Compare ReAct Loop and Plan-and-Execute agent patterns. Learn when reactive step-by-step reasoning beats strategic upfront planning.
Use ReAct for exploratory tasks with unpredictable steps; use Plan-and-Execute when the goal is clear and the steps can be mapped upfront.
Compare Self-Check and LLM-as-Judge for evaluating LLM outputs. Learn when to use internal confidence signals vs external rubric-based scoring.
Use Self-Check for real-time per-response gating where speed matters; use LLM-as-Judge for thorough evaluation in offline pipelines or when you need rubric-based scoring.
Self-Consistency samples multiple answers in parallel and votes. Reflection generates, critiques, and regenerates iteratively. Compare both patterns.
Use Self-Consistency when you need fast, reliable answers to questions with clear correct outputs; use Reflection when quality comes from iterative refinement of complex, open-ended tasks.
Compare Semantic Indexing and Hybrid Retrieval. Learn when to invest in better embeddings vs smarter query strategies for RAG.
Semantic Indexing sets your retrieval ceiling; Hybrid Retrieval helps you reach it. Invest in indexing first, then layer query optimization for hard questions.
Semantic Router classifies intent to pick a handler. Model Router classifies complexity to pick an LLM tier. Learn when to use each.
Use Semantic Router to decide what pipeline handles a request, then Model Router within that pipeline to decide which model runs it.
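The layering can be sketched as two classifiers in sequence. The intent keywords and tier names here are hypothetical stand-ins for learned classifiers or embedding-based routers.

```python
def semantic_route(query: str) -> str:
    """Semantic Router: pick the handling pipeline from user intent."""
    if "refund" in query or "charged" in query:
        return "billing"
    return "general"

def model_route(pipeline: str, query: str) -> str:
    """Model Router: pick the model tier within the chosen pipeline."""
    # Toy policy: billing is high-stakes, long queries suggest complexity.
    if pipeline == "billing" or len(query.split()) > 12:
        return "large-model"
    return "small-model"

query = "why was I charged twice"
pipeline = semantic_route(query)
print(pipeline, model_route(pipeline, query))  # billing large-model
```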
Compare Small Language Models and Inference Optimization for cutting LLM costs. One shrinks the model, the other speeds up how you run it.
Use Small Language Models to permanently reduce per-request cost, and Inference Optimization to maximize throughput of whatever model you run. They compound when combined.
Compare Tool Calling and Code Execution patterns for LLM agents. One uses predefined functions, the other runs arbitrary code in a sandbox.
Use Tool Calling for structured API interactions with predictable behavior, and Code Execution when the model needs to perform arbitrary computation like data analysis or math.
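The two call shapes can be contrasted in a sketch. The tool registry, weather function, and code snippet are hypothetical; and the "sandbox" here is only a restricted namespace for illustration — a production Code Execution setup needs real isolation (containers, gVisor, or a remote runtime).

```python
# Tool Calling: the model picks a name and arguments from a fixed schema.
TOOLS = {
    "get_weather": lambda city: f"sunny in {city}",
}

def tool_call(name: str, **kwargs) -> str:
    """Dispatch to a predefined function: predictable, auditable behavior."""
    return TOOLS[name](**kwargs)

# Code Execution: the model writes arbitrary code; we run it and read back
# the resulting variables. Stripping builtins is NOT real sandboxing.
def run_code(snippet: str) -> dict:
    namespace = {"__builtins__": {}}
    exec(snippet, namespace)
    return {k: v for k, v in namespace.items() if k != "__builtins__"}

print(tool_call("get_weather", city="Oslo"))   # sunny in Oslo
print(run_code("total = 3 * 4 + 5")["total"])  # 17
```

The asymmetry is the point: `tool_call` can only do what the registry allows, while `run_code` can compute anything expressible in the snippet, which is exactly why it needs heavier isolation.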