AI Agent Glossary
44 terms explained in plain English
Every concept you'll encounter when building with AI agents, MCP servers, and LLM frameworks — from agentic loops to zero-shot prompting.
A
- AI Agent
- An autonomous system that uses an LLM to reason about tasks, decide which tools to call, and iteratively execute steps until a goal is achieved. Unlike simple chatbots, agents operate in a loop: reason → act → observe → repeat.
- /agents → /learn/what-are-ai-agents →
- Agentic Loop
- The core execution cycle of an AI agent: the LLM generates a plan or tool call, the tool executes, the result is fed back to the LLM, and the process repeats. Also called the ReAct loop (Reason + Act).
- /learn/how-to-build-ai-agent →
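The loop can be sketched in a few lines of Python. This is a minimal illustration, not a production agent: the "LLM" is a stub that emits tool calls, and the tool registry holds one toy function.

```python
# A minimal sketch of the reason → act → observe loop. The "LLM" here is a
# stub that decides the next action; a real agent would call a model API.

def fake_llm(history):
    """Stand-in for an LLM: picks the next action from the transcript."""
    if not any(msg["role"] == "tool" for msg in history):
        # No observations yet: act by calling a tool.
        return {"type": "tool_call", "name": "add", "args": {"a": 2, "b": 3}}
    # A tool result is now in context: finish with an answer.
    result = [m for m in history if m["role"] == "tool"][-1]["content"]
    return {"type": "final", "content": f"The sum is {result}."}

TOOLS = {"add": lambda a, b: a + b}

def run_agent(task, max_steps=5):
    history = [{"role": "user", "content": task}]
    for _ in range(max_steps):          # iteration limit: a simple guardrail
        action = fake_llm(history)
        if action["type"] == "final":   # goal reached, exit the loop
            return action["content"]
        result = TOOLS[action["name"]](**action["args"])      # act
        history.append({"role": "tool", "content": result})   # observe
    return "Stopped: step limit reached."

print(run_agent("What is 2 + 3?"))  # → The sum is 5.
```

Note the `max_steps` cap: even a toy loop needs a termination guardrail, since a real model can loop indefinitely.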
C
- Chain of Thought (CoT)
- A prompting technique where the model is encouraged to reason step-by-step before answering. Improves accuracy on complex tasks. Extended thinking in Claude implements this automatically.
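In its simplest zero-shot form, the technique is just an added instruction. A sketch (the question and phrasing are illustrative; the trailing cue is the classic zero-shot-CoT trigger):

```python
# Same question, with and without a chain-of-thought cue appended.
question = "A meeting starts at 3:40pm and runs 85 minutes. When does it end?"

direct_prompt = question
cot_prompt = question + "\nLet's think step by step."  # zero-shot CoT cue

print(cot_prompt)
```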
- Claude Code
- Anthropic's terminal-first AI coding agent. Operates as a CLI that reads files, writes code, runs tests, and manages git — all autonomously. Uses MCP for tool integration.
- /agents/cline → /learn/claude-code-vs-cursor-vs-copilot →
- Context Window
- The maximum number of tokens an LLM can process in a single request. Claude's context window is 200K tokens, with a 1M-token option available on some models. Larger windows let agents maintain more history but increase cost.
- Coordinator Pattern
- A multi-agent design where a manager agent delegates subtasks to specialist agents, then synthesizes their results. Common in complex workflows like code review (reviewer + security + performance agents).
- /workflows → /learn/agent-design-patterns →
- Cursor
- An AI-native IDE (VS Code fork) with inline completions, chat, and agent mode. Supports multiple LLMs and MCP server integration for tool use.
- /learn/claude-code-vs-cursor-vs-copilot →
E
- Embedding
- A numerical vector representation of text, used for semantic search and similarity matching. Models like OpenAI's text-embedding-3 or Cohere's embed turn text into vectors that capture meaning.
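Similarity between embeddings is usually measured with cosine similarity. A sketch with tiny made-up vectors (real embeddings have hundreds to thousands of dimensions):

```python
# Cosine similarity between toy "embeddings" — the vectors are invented
# for illustration; real models produce them from text.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

cat   = [0.9, 0.1, 0.3]   # hypothetical embedding of "cat"
dog   = [0.8, 0.2, 0.3]   # "dog": close in meaning, close in vector space
stock = [0.1, 0.9, 0.2]   # "stock market": far in meaning, far in space

print(round(cosine(cat, dog), 3))    # high similarity
print(round(cosine(cat, stock), 3))  # low similarity
```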
- Extended Thinking
- A Claude feature where the model performs explicit chain-of-thought reasoning in a thinking block before responding. Improves performance on complex coding, math, and analysis tasks.
F
- Fine-tuning
- Training a pre-trained LLM on domain-specific data to improve performance on targeted tasks. Less common for agents (prompt engineering and tool use are usually sufficient) but useful for specialized classifiers.
- Function Calling
- The ability of an LLM to output structured tool/function invocations. The model doesn't execute the function — it produces a JSON spec that the host application executes. Foundation of all agent tool use.
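The division of labor can be sketched as follows: the model emits a structured call, and the host looks it up and runs it. The tool name, arguments, and registry here are all illustrative, not any particular provider's API:

```python
# Sketch: the model produces a JSON spec; the host application executes it.
import json

# What the model returns — a specification, not an executed result:
model_output = json.loads('{"name": "get_weather", "arguments": {"city": "Paris"}}')

def get_weather(city):
    # Stubbed tool; a real host would call a weather API here.
    return f"18°C and cloudy in {city}"

REGISTRY = {"get_weather": get_weather}

# The host dispatches the call with the model's arguments and would then
# feed the result back to the model as an observation:
result = REGISTRY[model_output["name"]](**model_output["arguments"])
print(result)  # → 18°C and cloudy in Paris
```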
G
- Grounding
- Connecting LLM responses to real data sources (search results, databases, APIs) to reduce hallucinations. RAG is the most common grounding technique.
- Guardrails
- Safety mechanisms that constrain agent behavior: input/output validation, confirmation for destructive actions, iteration limits, and content filtering. Essential for production agents.
H
- Hallucination
- When an LLM generates plausible-sounding but factually incorrect information. Agents mitigate this through tool use (grounding in real data) and verification loops.
- Hooks
- In Claude Code, shell commands that execute automatically in response to events like tool calls or file edits. Used for custom validation, linting, or notification workflows.
I
- In-Context Learning
- An LLM's ability to learn from examples provided in the prompt (few-shot learning) without updating model weights. Agents use this to adapt to project-specific conventions via CLAUDE.md files.
J
- JSON-LD
- A structured data format for search engine optimization. Used in agent directory pages to provide Google with machine-readable information about tools, comparisons, and articles.
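A minimal example, using the schema.org `DefinedTerm` type (the field values are illustrative; in a page this JSON would sit inside a `<script type="application/ld+json">` tag):

```python
# Sketch: JSON-LD for a single glossary-term page.
import json

json_ld = {
    "@context": "https://schema.org",
    "@type": "DefinedTerm",
    "name": "AI Agent",
    "description": "An autonomous system that uses an LLM to reason, "
                   "call tools, and iterate toward a goal.",
}
print(json.dumps(json_ld, indent=2))
```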
K
- Knowledge Graph
- A structured representation of entities and their relationships. Some agent memory systems use knowledge graphs instead of vector stores for more precise retrieval.
L
- LangChain
- A popular Python/JS framework for building LLM applications. Provides abstractions for chains, agents, memory, and tool use. LangGraph extends it with graph-based agent workflows.
- /frameworks/langchain →
- LLM (Large Language Model)
- A neural network trained on massive text corpora to predict and generate text. Models like Claude, GPT, and Gemini power AI agents by providing reasoning, planning, and code generation capabilities.
- LoRA
- Low-Rank Adaptation — an efficient fine-tuning technique that adds small trainable matrices to frozen model layers. Dramatically reduces the compute needed for fine-tuning.
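The core idea in plain Python (tiny dimensions for illustration; the usual alpha/r scaling factor is omitted): the frozen weight matrix W stays fixed, and only the small factors B (d×r) and A (r×k) are trained, so the adapted weight is W + BA.

```python
# Sketch of the LoRA update. With r much smaller than d and k, B and A
# together hold far fewer trainable parameters than W itself.

def matmul(X, Y):
    return [[sum(X[i][t] * Y[t][j] for t in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

d, k, r = 4, 4, 1                       # full dimensions vs. low rank
W = [[1.0 if i == j else 0.0 for j in range(k)] for i in range(d)]  # frozen
B = [[0.5], [0.0], [0.0], [0.0]]        # d×r, trainable
A = [[0.0, 1.0, 0.0, 0.0]]              # r×k, trainable

delta = matmul(B, A)                    # rank-r update, full d×k shape
W_adapted = [[W[i][j] + delta[i][j] for j in range(k)] for i in range(d)]

full_params = d * k                     # 16 entries in W
lora_params = d * r + r * k             # 8 entries in B and A combined
print(full_params, lora_params)         # → 16 8
```

The savings grow with the matrix size: for d = k = 4096 and r = 8, the adapter is roughly 0.4% of the full layer's parameters.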
M
- MCP (Model Context Protocol)
- An open protocol (by Anthropic) for connecting AI assistants to external tools and data sources. MCP servers expose capabilities (tools, resources, prompts) that any MCP client can discover and use.
- /mcp-servers → /learn/understanding-mcp → /learn/mcp-server-setup-guide →
- MCP Client
- An application that connects to MCP servers and uses their tools. Claude Code, Cursor, and Windsurf are MCP clients. The client discovers available tools and routes LLM tool calls to the appropriate server.
- MCP Server
- A process that exposes tools, resources, or prompts via the Model Context Protocol. Examples: Playwright MCP (browser automation), GitHub MCP (repository management), Filesystem MCP (file operations).
- /mcp-servers →
- MCP Transport
- The communication channel between an MCP client and server. The stdio transport uses standard input/output (local processes); HTTP-based transports (SSE, and Streamable HTTP in newer protocol revisions) enable remote connections.
- Multi-Agent System
- A system where multiple specialized agents collaborate on a task. Common patterns include coordinator (manager + workers), pipeline (sequential handoff), and debate (adversarial critique).
- /learn/multi-agent-systems →
O
- Observability
- The practice of tracking and debugging AI agent behavior through logging, tracing, and analytics. Tools like Langfuse, LangSmith, and Helicone provide LLM-specific observability.
- /dev-tools → /learn/agent-observability →
P
- Prompt Engineering
- The practice of designing and optimizing prompts to get better results from LLMs. Techniques include role prompting, few-shot examples, chain-of-thought, and structured output formatting.
- /prompts → /learn/prompt-engineering-for-agents →
R
- RAG (Retrieval-Augmented Generation)
- A pattern that grounds LLM responses in retrieved documents. The system searches a knowledge base, injects relevant chunks into the prompt, and the LLM generates a response informed by that context.
- /learn/rag-patterns-guide →
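The pattern in miniature (keyword-overlap scoring stands in for vector search here; documents and prompt wording are illustrative):

```python
# Minimal RAG sketch: retrieve the best-matching document, then assemble
# a grounded prompt for the LLM.
docs = [
    "MCP servers expose tools over the Model Context Protocol.",
    "Git worktrees create isolated working directories.",
    "Vector databases store embeddings for semantic search.",
]

def retrieve(query, k=1):
    # Toy retriever: rank documents by shared lowercase words.
    q = set(query.lower().split())
    scored = sorted(docs, key=lambda d: len(q & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

query = "What do MCP servers expose?"
context = "\n".join(retrieve(query))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
print(prompt)
```

A production system would swap the toy retriever for embedding-based (or hybrid BM25 + vector) search over a chunked knowledge base.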
- ReAct Pattern
- Reason + Act — the foundational agent architecture. The LLM alternates between reasoning (thinking about what to do) and acting (calling tools), using observations to inform the next step.
- /learn/how-to-build-ai-agent →
- Retrieval
- The process of finding relevant information from a knowledge base to provide context to an LLM. Retrieval can be keyword-based (BM25), semantic (vector search), or hybrid.
S
- Scaffold
- The surrounding code and infrastructure that turns an LLM into an agent: the loop, tool execution, memory management, error handling, and output parsing.
- Skill (Slash Command)
- In Claude Code, a pre-defined command invoked with /name that expands into a specialized prompt or workflow. Built-in commands include /compact, /init, and /doctor. Custom commands can be added as Markdown files under .claude/commands/.
- /skills →
- SSE (Server-Sent Events)
- A web protocol for streaming data from server to client over HTTP. Used as an MCP transport for remote servers and for streaming LLM responses token-by-token.
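The wire format is plain text: each event is one or more `data:` lines terminated by a blank line. A minimal parser for that shape (the full spec also defines `event:`, `id:`, and `retry:` fields, which this sketch ignores):

```python
# Parse a minimal SSE stream into a list of event payloads.
stream = (
    "data: Hello\n"
    "\n"
    "data: world\n"
    "data: (two lines, one event)\n"
    "\n"
)

def parse_sse(text):
    events, buf = [], []
    for line in text.splitlines():
        if line.startswith("data:"):
            buf.append(line[5:].lstrip())
        elif line == "" and buf:          # blank line dispatches the event
            events.append("\n".join(buf))
            buf = []
    return events

print(parse_sse(stream))  # → ['Hello', 'world\n(two lines, one event)']
```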
- Structured Output
- Constraining LLM output to a specific format (JSON schema, XML, etc.). Critical for agents because tool calls must follow exact schemas. Claude's tool_use feature provides this natively.
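A sketch of why this matters for tool calls: the host checks model output against the tool's schema before executing. The schema and validation here are hand-rolled and illustrative; production code would use a JSON Schema library.

```python
# Validate a model-emitted tool call against a (toy) schema.
import json

schema = {"required": {"file_path": str, "line": int}}

def validate(call):
    for field, typ in schema["required"].items():
        if not isinstance(call.get(field), typ):
            return False
    return True

good = json.loads('{"file_path": "src/main.py", "line": 42}')
bad  = json.loads('{"file_path": "src/main.py", "line": "forty-two"}')
print(validate(good), validate(bad))  # → True False
```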
- SWE Agent
- Software Engineering Agent — an AI system designed specifically for coding tasks: bug fixing, feature implementation, code review, and test writing. Examples: SWE-Agent, OpenHands, Devin.
- /agents →
- System Prompt
- Instructions provided to an LLM at the start of a conversation that define its behavior, capabilities, and constraints. In Claude Code, CLAUDE.md files serve as project-specific system prompts.
T
- Temperature
- A parameter that controls LLM output randomness. Lower values (0-0.3) produce more deterministic outputs; higher values (0.7-1.0) increase creativity. Agents typically use low temperature for tool calls.
- Token
- The basic unit of text processing for LLMs. Roughly 1 token = 3/4 of a word in English. Token count determines cost and context window usage. Claude: ~$3/M input tokens, ~$15/M output tokens (Sonnet).
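The arithmetic is straightforward. A sketch using the rates quoted above ($3 per million input tokens, $15 per million output tokens; the token counts are an invented example):

```python
# Estimate the dollar cost of one request from its token counts.
INPUT_PER_M, OUTPUT_PER_M = 3.00, 15.00  # $ per million tokens

def cost(input_tokens, output_tokens):
    return (input_tokens / 1e6 * INPUT_PER_M
            + output_tokens / 1e6 * OUTPUT_PER_M)

# An agent turn with 20K tokens of context and a 1K-token reply:
print(f"${cost(20_000, 1_000):.3f}")  # → $0.075
```

Note that agent loops resend accumulated history on each turn, so input-token cost tends to dominate.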
- Tool Use
- An LLM's ability to interact with external systems through structured function calls. The model decides which tool to call and with what arguments; the host application executes it and returns results.
- /mcp-servers →
V
- Vector Database
- A database optimized for storing and querying high-dimensional vectors (embeddings). Used in RAG systems and agent memory. Examples: Chroma, Pinecone, Weaviate, Qdrant.
W
- Workflow
- A defined sequence of agent actions for a specific task. Unlike free-form agent execution, workflows follow a predetermined structure (e.g., research → draft → review → publish).
- /workflows →
- Worktree
- A git feature that creates multiple working directories from the same repository. Used in Claude Code to run parallel agent tasks on isolated branches without conflicts.
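A sketch of the mechanics (paths and branch names are illustrative; the demo repo is created in a temp directory so it is self-contained):

```shell
# Create a repo, then add a second working directory on its own branch.
set -e
cd "$(mktemp -d)"
git init -q main-repo && cd main-repo
git -c user.email=a@b.c -c user.name=demo commit -q --allow-empty -m "init"

# Second working directory, isolated from main-repo — a parallel agent
# task can run here without touching the original checkout:
git worktree add ../task-a -b agent-task-a
git worktree list      # shows both working directories and their branches
```

In a worktree, `.git` is a file pointing back at the main repository, so both checkouts share one object store while keeping separate indexes and HEADs.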
Z
- Zero-Shot
- Performing a task with no examples in the prompt — relying entirely on the model's pre-trained knowledge. Contrasts with few-shot (providing examples) and fine-tuned (trained on task-specific data).