AI Knowledge Base 2026

AI Glossary 2026

Clear definitions for the era of Agentic AI and Spatial Intelligence.

AI Safety & Guardrails

Red Teaming (AI Security Testing)

Red teaming is a structured adversarial testing method where a team of security experts deliberately attempts to expose vulnerabilities, failure modes, or harmful behaviors in an AI system — mirroring the approach of a real attacker. The term originates from military planning, where a red team would simulate enemy forces to stress-test defenses. In the AI context, red teaming involves systematic attempts to manipulate a model through adversarial prompts, jailbreaks, and edge-case inputs — trying to coax the system into producing harmful content, leaking sensitive information, or bypassing safety guardrails. These tests typically occur before public deployment as part of a safety evaluation lifecycle. Leading AI labs like Anthropic, OpenAI, and Google DeepMind publish red teaming findings as part of their model cards and system cards. Regulatory frameworks including the EU AI Act now recommend adversarial testing for high-risk AI deployments.

Explore Concept
AI Safety & Guardrails

Responsible Scaling Policy (RSP)

A Responsible Scaling Policy (RSP) is a formal internal framework that defines the conditions under which an AI lab may continue developing and deploying increasingly powerful models. Pioneered by Anthropic, the RSP establishes AI Safety Levels (ASL) — escalating capability tiers, each with mandatory safety requirements that must be demonstrably met before development continues. ASL-3 models require strict deployment controls; ASL-4 models may be withheld from release entirely if safety conditions cannot be satisfied. Claude Mythos Preview is a real-world example: reportedly withheld under these provisions after it autonomously discovered zero-day vulnerabilities across major operating systems. The RSP links technical research (interpretability, red-teaming, automated evaluations) with operational governance. Other leading labs — Google DeepMind, OpenAI — have developed analogous frameworks, but Anthropic is widely credited as the pioneer of the publicly documented RSP approach. For enterprises procuring AI services, a vendor's RSP is a meaningful transparency signal: it reveals how the lab handles its most capable and potentially dangerous models, and under what thresholds it will refuse to ship.

Explore Concept
Agentic Infrastructure

Real-Time Inference

Real-time inference is the immediate processing of AI requests with minimal latency, typically in the range of milliseconds to a few seconds. Unlike batch inference where requests are collected and processed in groups, real-time inference responds to each input immediately — critical for interactive applications where users expect instant feedback. The most important metric is Time-to-First-Token (TTFT): elapsed time between submitting a request and receiving the first response token. For conversational chatbots, TTFT under 500ms is generally acceptable; for coding assistants, sub-200ms targets are pursued. Streaming output (token by token) dramatically improves perceived latency even when total response time remains constant. Typical real-time inference use cases: conversational chatbots like ChatGPT or Claude.ai, AI coding assistants like GitHub Copilot or Cursor, real-time translation services, voice assistants combining speech recognition and synthesis, interactive document analysis, and autonomous AI agents that must react to environmental changes within tight time windows. Technical requirements are significantly more demanding than batch inference: low latency requires geographically proximate servers (edge inference), specialized low-latency optimizations like KV-cache preloading and speculative decoding, or the use of smaller, faster models. Providers like Groq (LPU chip) and Cerebras achieve 500+ TPS purpose-built for real-time applications. The fundamental tradeoff: latency, throughput, and cost per token.

Explore Concept