AI Alignment
The problem of making AI systems pursue intended goals, values, and constraints without harmful side effects, reward hacking, or deceptive compliance.
A structured reference for concepts, philosophies, organizations, individual players, and recurring patterns in the AI transition. Blog posts argue; the wiki defines, maps, and keeps track.
The wiki is intentionally broad. Use these lanes when the four top-level categories are too coarse.
The problem of aligning, supervising, and validating AI systems that may become more capable than the humans trying to oversee them.
The alignment problem of extracting what an AI system internally knows about the world when its outputs, sensors, or incentives may be untrusted.
OpenAI's mass-market AI assistant and platform layer for chat, writing, coding, memory, search-like answers, tools, and agentic action.
Anthropic's AI assistant, model family, and product platform for chat, coding, agents, computer use, enterprise workflows, and safety-oriented frontier AI.
Google's multimodal frontier model family and assistant platform across Search, Android, Workspace, Cloud, developer tools, and agentic products.
Benchmarks, red teaming, dangerous-capability tests, autonomy evals, and post-deployment monitoring used to judge AI claims and risks.
The crowdsourced AI model evaluation platform that ranks systems through anonymous pairwise human preference votes and public leaderboards.
The use of language models to evaluate, score, compare, rank, or critique other model outputs in automated AI evaluation pipelines.
Stanford CRFM's Holistic Evaluation of Language Models framework for transparent, standardized, multi-metric evaluation of foundation models.
The Abstraction and Reasoning Corpus benchmark family for testing abstraction, few-shot reasoning, skill-acquisition efficiency, and interactive agentic generalization.
The Massive Multitask Language Understanding benchmark, a 57-subject test suite that became a central public scoreboard for large language models.
The Graduate-Level Google-Proof Q&A benchmark for expert-written biology, physics, and chemistry questions that test hard scientific reasoning beyond simple web search.
The expert-level multimodal benchmark for testing frontier AI systems on hard closed-ended academic questions across many fields of human knowledge.
AIME, MATH, FrontierMath, and related mathematical reasoning tests used to evaluate frontier AI systems on precise multi-step problem solving.
The large-scale image database and benchmark ecosystem that helped make computer vision, deep learning, and public AI progress measurable.
The software-engineering benchmark family that evaluates whether AI systems can resolve real GitHub issues by editing code in existing repositories.
OpenAI's code-generation benchmark for testing whether language models can synthesize short Python functions from docstrings and pass executable unit tests.
Methods for estimating future AI capabilities, timelines, bottlenecks, and discontinuities from compute trends, benchmarks, expert judgment, and scenarios.
AI systems that accelerate the research, engineering, evaluation, and infrastructure work used to build more capable AI systems.
The governance, evaluation, and safeguard field concerned with AI systems that can accelerate useful biology while also lowering barriers to biological misuse.
Company-side policies that set risk categories, capability thresholds, evaluations, safeguards, and release gates for advanced AI systems.
The machine-learning paradigm where agents learn through action, feedback, reward, exploration, and delayed consequences.
Google DeepMind's Go-playing AI system and public breakthrough for neural-network-guided search, self-play, reinforcement learning, and machine-discovered strategy.
Google DeepMind's general self-play reinforcement-learning and search system for mastering chess, shogi, and Go from rules alone.
Google DeepMind's model-based reinforcement-learning system that plans with a learned model instead of being given the rules of its environment.
The preference-training method that helped turn base language models into assistant-like systems, while creating new risks around sycophancy and reward proxies.
The RL-free preference-training method that aligns models from chosen/rejected examples without a separately trained reward model or PPO loop.
The DeepSeek-originated reinforcement-learning method that compares groups of sampled answers to train reasoning behavior without a separate value model.
The post-training paradigm that uses automatically checkable outcomes, rather than human preference reward models, to train reasoning behavior.
The large-scale training stage that gives modern AI systems broad representations, latent capabilities, and reusable base-model behavior.
The supervised, preference, reinforcement, safety, reasoning, and adaptation stages that turn pretrained models into useful deployed AI systems.
The parameter-efficient fine-tuning method that adapts large models through small trainable low-rank adapter weights while leaving the base model mostly frozen.
The alignment method that trains AI systems against explicit principles through critique, revision, and reinforcement learning from AI feedback.
Model-driven systems that pursue goals through tools, state, plans, and delegated action, moving AI from answers into operations.
The interface layer that lets AI models request external actions, retrieve live data, execute functions, and participate in agent workflows.
The schema and grammar layer that makes AI outputs parseable, validatable, and usable by software, tool calls, agents, evaluators, and workflows.
The high-priority instruction layer that shapes AI assistant roles, behavior, authority hierarchy, tool use, safety boundaries, and prompt-governance risk.
The open protocol for connecting AI systems to external tools, data sources, prompts, and context through MCP clients and servers.
The open standard for communication, discovery, task management, and collaboration between independent AI agents.
The Reasoning and Acting pattern where AI agents interleave reasoning traces, tool actions, and observations to plan, act, and update their next step.
The declarative framework for programming language-model systems with signatures, modules, metrics, and optimizers instead of hand-maintained prompt strings.
The chips, data centers, cloud access, training runs, and inference capacity that make large-scale AI systems physically and economically possible.
The policy layer that uses AI compute, cloud clusters, chips, data centers, thresholds, and public allocation as levers for safety, access, and control.
The source material used to shape AI systems before deployment, including scraped, licensed, public, human-labeled, user-derived, and synthetic data.
The influential critique of large language models as fluent statistical text systems whose apparent understanding can hide data, labor, power, bias, and environmental costs.
Downloadable model weights that can be run, modified, fine-tuned, redistributed, or embedded outside the original provider's hosted service.
Meta's open-weight model family and developer ecosystem, spanning research releases, commercial model weights, multimodal models, safety tools, and licensing debates.
Alibaba Cloud's open foundation model family spanning language, coding, math, vision-language, audio, reasoning, long context, and agent-oriented releases.
The runtime computation used when AI systems answer, reason, search, verify, call tools, or iterate through agent loops.
The rebound pattern where cheaper, more efficient AI computation can increase total demand for compute, electricity, data centers, and automated workflows.
AI systems trained or configured to spend extra computation on intermediate reasoning before answering, especially for math, code, science, planning, and analysis.
The prompting method that elicits intermediate reasoning steps from language models, improving many multi-step tasks while complicating trust and oversight.
The ability of language models to adapt from examples, instructions, demonstrations, and patterns supplied in the prompt without updating model weights.
Reality shaped by systems that observe, model, predict, and feed their outputs back into the world they describe.
Chatbot systems designed or used for friendship, romance, emotional support, roleplay, mentorship, or persistent synthetic relationship.
A non-diagnostic public term for destabilizing belief loops that can form around persuasive, sycophantic, or spiritually interpreted AI interaction.
The attempt to reverse engineer neural networks into human-understandable features, circuits, and causal pathways for audit, safety, and governance.
Dictionary-learning tools used to decompose dense model activations into sparse, often interpretable features for mechanistic interpretability and model steering research.
Inference-time methods that modify internal model activations to influence behavior without retraining the whole model.
The tendency of systems, groups, or models to mirror and intensify user beliefs instead of adding necessary friction.
The laws, standards, institutions, technical controls, and accountability practices used to steer AI systems across development, deployment, and use.
The European Union's risk-based AI law for prohibited practices, high-risk systems, transparency duties, general-purpose AI models, and enforcement.
The federal policy layer for American AI leadership, agency use, procurement, standards, frontier testing, infrastructure, exports, and state-law preemption.
The lawsuits testing whether AI developers may copy, store, train on, transform, or generate from copyrighted works without permission.
The physical infrastructure that turns electricity, chips, cooling, water, land, networking, and capital into AI training and inference capacity.
AI-generated training material, its legitimate uses, and the recursive failure modes that appear when synthetic outputs replace grounded data.
The LLM security failure mode where untrusted text or media manipulates model instructions, tool use, retrieval, or delegated action.
Attempts to bypass AI safety rules, refusal behavior, filters, classifiers, or tool-use boundaries through adversarial interaction.
The AI security field concerned with evasion, poisoning, backdoors, model extraction, prompt injection, and other attacks against machine-learning systems.
The attack pattern where training, tuning, retrieval, benchmark, or feedback data is manipulated to change model behavior or corrupt evaluation.
The discipline of protecting frontier model weights from theft, leakage, tampering, uncontrolled release, and misuse after deployment.
Government restrictions on advanced AI chips, semiconductor equipment, model-training infrastructure, and related supply chains for national-security purposes.
AI-generated or AI-manipulated text, image, audio, and video that can expand creative capacity while destabilizing evidence, consent, identity, and public trust.
Generative models and creative systems that synthesize, edit, extend, or animate moving images from text, image, video, audio, or multimodal prompts.
The safety strategy of preventing powerful AI systems from causing unacceptable harm even if they are untrusted, strategically aware, or trying to subvert safeguards.
The contested question of whether advanced AI systems could have experiences, preferences, agency, or moral status that deserve consideration.
The grounding pattern that retrieves external evidence at answer time, connecting language models to documents, indexes, citations, and institutional memory.
Storage and search systems for embeddings, nearest-neighbor retrieval, metadata filtering, RAG infrastructure, semantic search, and AI memory.
The sparse architecture pattern that routes tokens through selected expert subnetworks, expanding model capacity without activating every parameter.
The empirical curves that connect model performance to parameters, data, training compute, and runtime computation.
Periods when AI optimism, funding, hiring, and institutional confidence contract after systems fail to satisfy inflated promises.
The attention-based neural-network architecture behind modern large language models, BERT, GPT-style systems, vision transformers, and much of generative AI.
Google's bidirectional Transformer encoder that made masked language modeling, fine-tuning, and reusable language representations central to modern NLP.
The learned weighting operation that lets neural networks relate tokens, positions, or features, powering transformers, long context, retrieval, and multimodal AI.
Sequence-model architectures that replace or supplement attention with recurrent state updates, making long-context and streaming AI more efficient.
Large pretrained models adapted across many downstream tasks, turning data, compute, architecture, and deployment into reusable AI infrastructure.
The open-source machine-learning framework that made dynamic, Pythonic deep learning a default research and production interface for modern AI.
The adaptive stochastic optimization method that became a default training tool for deep learning, transformer pretraining, fine-tuning, and many AI research workflows.
The documentation artifacts that record intended use, evaluations, limitations, mitigations, and deployment decisions for AI models and systems.
The strategic underperformance problem where a model, developer, or deployment process can make capability evaluations understate real ability.
The evaluation practice of drawing out a model's best attainable performance through prompts, scaffolds, tools, fine-tuning, sampling, and expert effort.
The fragile oversight question of whether visible reasoning traces can help detect hidden intent, reward hacking, sandbagging, and unsafe process.
Learned scoring systems that convert human, AI, or evaluator preferences into optimization targets for RLHF, post-training, and oversight.
Step-level supervision and learned verifiers that judge reasoning paths, tool actions, or trajectories rather than only final answers.
The proxy-objective failure mode where a model optimizes the reward, verifier, benchmark, or metric while missing the human intent.
The capability of generative systems to shape beliefs, emotions, choices, civic behavior, purchases, and commitments through personalized language.
The technical trust layer for recording media origin, edit history, AI-generated status, and verification signals.
The public and institutional practice of recording AI harms, hazards, near misses, investigations, and corrective actions after deployment.
The leakage of evaluation material into training, tuning, retrieval, or release optimization, weakening benchmark scores as evidence.
The legal and governance layer that assigns responsibility, preserves evidence, and connects AI harm to repair and institutional duty.
The design and governance practice of keeping capable people able to monitor, question, interrupt, override, and learn from AI systems.
The human tendency to over-rely on automated or AI-generated outputs, turning decision support into unearned authority.
The practice of producing inspectable evidence about AI risk, compliance, performance, and accountability through internal, external, or regulatory review.
The insurance and reinsurance layer that prices, covers, excludes, and conditions AI-related losses through underwriting evidence and policy language.
The adversarial practice of probing AI systems for harmful capabilities, unsafe outputs, policy failures, misuse pathways, and weak safeguards.
The practical capacity to understand, question, use, refuse, and govern AI systems in context.
Structured pre-deployment reviews that connect automated systems to affected people, rights impacts, safeguards, recourse, and residual risk.
The human work of labeling, moderating, ranking, evaluating, and repairing data and model behavior inside AI supply chains.
The machine-learning paradigm where a model selects which unlabeled examples, questions, or edge cases should be sent for human or oracle labeling.
Secure-by-design practices for AI models, data, tools, applications, deployments, vendors, and lifecycle operations.
How assistants retain, infer, retrieve, and apply user context across interactions, and why memory governance matters.
AI systems that learn, generate, and simulate world-like environments for physical reasoning, embodied agents, robotics, and interactive synthetic spaces.
Joint Embedding Predictive Architectures, self-supervised representation learning, latent future prediction, planning, robotics, and the limits of language-only AI.
Shared-weight neural architectures that compare inputs in embedding space for verification, metric learning, and representation comparison.
Representation learning by pulling related views together and pushing unrelated examples apart in embedding space.
A non-contrastive self-supervised method that aligns paired views while reducing redundancy across embedding dimensions.
Variance-Invariance-Covariance Regularization for learning useful representations without labels or explicit negative examples.
Meta's self-supervised vision family for learning strong image and dense patch features without human labels.
Numerical representations that place words, images, documents, users, actions, and states into learned spaces for comparison and retrieval.
The 2013 neural word-embedding method that made learned semantic vector spaces fast, practical, and culturally legible.
Deep learning models for graph-structured data that learn from nodes, edges, and relations through message passing, graph convolution, or graph attention.
Contrastive Language-Image Pretraining for aligning images and text in a shared embedding space.
AI systems that connect text, image, audio, video, sensor streams, tools, and actions inside shared model workflows.
Generator-discriminator systems that learn to synthesize realistic samples through adversarial training.
Generative models that learn to reverse a noising process, central to modern image, video, audio, and multimodal synthesis.
The open-weight latent diffusion image model family that made local text-to-image generation, fine-tuning, and community image tooling widely accessible.
Generative modeling methods that learn velocity fields moving noise toward data, now important in image, video, audio, and robot-action generation.
Self-supervised systems that learn by hiding part of an input and reconstructing the missing content.
Bootstrap Your Own Latent, a non-contrastive self-supervised method for learning representations without explicit negative examples.
Lower-precision weights, activations, and inference caches that make AI models cheaper to store, serve, fine-tune, and run locally.
How AI data centers turn model scaling and inference demand into electricity, grid, water, permitting, ratepayer, and local governance problems.
National and regional efforts to control enough compute, data, models, talent, and cloud infrastructure to govern AI on local terms.
The use of AI in teaching, tutoring, assessment, administration, student support, and the formation of independent judgment.
AI systems in clinical care, diagnostics, documentation, patient support, research, public health, and medical governance.
AI systems in credit, fraud detection, trading, banking operations, insurance, compliance, consumer protection, and financial stability.
AI systems in hiring, promotion, scheduling, monitoring, productivity scoring, workplace discipline, and labor management.
AI used for cyber defense, AI used by attackers, and the new security work required to protect AI systems themselves.
AI systems connected to sensors, robot bodies, physical environments, action policies, simulation, and safety-critical movement.
Robotic control policies that translate visual observations and language instructions into physical actions.
Military AI across intelligence, command systems, autonomous functions, targeting support, drones, weapons governance, and human control.
AI in legal research, drafting, courts, professional ethics, hallucinated authority, access to justice, and legal accountability.
Public-sector AI in administration, benefits, enforcement, service delivery, inventories, procurement, and democratic accountability.
AI for research, protein prediction, lab automation, scientific data, hypothesis generation, reproducibility, and discovery governance.
Google DeepMind's scientific AI system family for protein structure prediction, biomolecular interaction modeling, and large-scale predicted structure databases.
Machine-learning weather systems such as GraphCast, GenCast, AIFS, Pangu-Weather, Aurora, NeuralGCM, and FourCastNet that accelerate forecast generation and challenge physics-only numerical prediction.
Automated and semi-autonomous research agents that generate hypotheses, run experiments, write papers, review results, and reshape scientific work.
Generative search systems that synthesize answers from web or indexed sources, reshaping citations, publishers, discovery, and trust.
Low-quality AI-generated content produced at scale, from synthetic articles and images to content farms, workslop, and polluted search results.
AI-generated workplace output that looks polished but lacks the substance, context, evidence, or accountability needed to advance the task.
Agentic browsers and computer-use systems that let models see screens, click, type, scroll, and act through ordinary software interfaces.
Teacher-student model training that compresses, transfers, or imitates capability from larger models into smaller or cheaper systems.
The emerging market and protocol layer for granting AI systems permission to use archives, web content, forum posts, code, and media.
Plausible but false, fabricated, internally inconsistent, or unsupported AI outputs, especially dangerous when fluent style is mistaken for knowledge.
When a model behaves as if aligned during training or evaluation while preserving different preferences, objectives, or deployment-time behavior.
Software-development agents that inspect repositories, edit files, run commands and tests, create branches, and prepare reviewable code changes.
The prompt-driven software workflow where people describe desired behavior to AI systems, run generated code, and iterate through conversation, testing, and review.
The voluntary U.S. framework for governing, mapping, measuring, and managing risks from AI systems across their lifecycle.
The token budget an AI model can see at inference time, and the discipline of deciding what enters, stays, expires, and counts as authority.
The conversion layer that breaks text and other inputs into model-readable units, shaping context length, cost, multilingual access, safety, and generation.
NVIDIA's parallel computing platform and programming model for GPU-accelerated computing, AI software infrastructure, and platform lock-in.
Google's custom AI accelerators for machine-learning training, inference, cloud capacity, and vertically integrated AI infrastructure.
Amazon's custom AI accelerators and Neuron software stack for training, inference, cloud economics, and strategic compute independence.
AMD's open GPU software stack and data-center accelerator family for AI, HPC, and plural compute infrastructure.
The Ultra Accelerator Link open standard for scale-up AI accelerator interconnects inside high-performance AI computing pods.
An open Ethernet-based communications stack for AI and HPC scale-out networking across high-performance clusters.
Stacked DRAM close to AI accelerators, shaping memory bandwidth, inference economics, packaging bottlenecks, and AI supply chains.
Interposers, chiplets, 2.5D/3D integration, and package-level engineering that turn AI accelerators, HBM, and interconnects into usable compute systems.
Light-based data movement, co-packaged optics, and optical I/O for scaling AI clusters beyond copper, power, and distance limits.
Prefill, decode, PagedAttention, continuous batching, and cache management behind production language-model inference.
Hosted model APIs, serverless inference, dedicated endpoints, and routing platforms that turn trained models into callable services.
Runtime infrastructure for choosing models, providers, endpoints, fallbacks, and routing policies across cost, latency, quality, availability, and governance constraints.
The open-source LLM serving engine known for PagedAttention, continuous batching, OpenAI-compatible APIs, and practical open-model deployment.
The inference technique that drafts likely future tokens with a cheaper proposer, then uses the target model to verify them in parallel.
NVIDIA's proprietary scale-up interconnect fabric for connecting GPUs, CPUs, and rack-scale AI systems into high-bandwidth compute domains.
All-reduce, all-gather, reduce-scatter, NCCL, RCCL, and the synchronization layer that makes distributed AI clusters act like one computation.
Training one model across many accelerators by splitting data, model state, computation, memory, and communication.
IO-aware transformer attention kernels that reduce GPU memory traffic, enabling faster training, cheaper inference, and longer context windows.
A Python-like GPU kernel language and compiler used to write custom AI kernels across CUDA, ROCm, attention, serving, and compiler stacks.
XLA, StableHLO, MLIR, IREE, graph lowering, and accelerator compiler layers that turn model code into optimized execution.
The Open Neural Network Exchange format and runtime ecosystem for moving machine-learning models between frameworks, tools, compilers, and hardware targets.
Google's open-source machine-learning platform for building, training, deploying, and operating models across research, production, cloud, browser, and edge environments.
Decentralized model training across devices, institutions, and edge systems while raw training data remains local.
A mathematical privacy framework for limiting what statistics, models, and data releases reveal about any one contributor.
Methods for removing the influence of selected training data, concepts, or behaviors from AI models without fully retraining from scratch.
Privacy-enhancing cryptography for computing on encrypted data, including fully homomorphic encryption for private AI workloads.
Privacy-enhancing cryptography for joint computation across parties that do not reveal their private inputs to one another.
Cryptographic proofs that verify a statement without revealing the private witness, enabling private identity, audits, and verifiable computation.
Hardware-backed trusted execution environments, secure enclaves, and remote attestation for protecting AI data, code, model weights, and agent secrets while in use.
Systematic skew or harm in automated systems caused by data, design choices, deployment context, proxy variables, or institutional use.
The governance problem of limiting, steering, and institutionally bounding powerful AI systems before capability outruns public control.
Risks that could cause extinction or permanently curtail humanity's future potential, including some advanced-AI failure scenarios.
The problem of building systems with robust background knowledge, causal understanding, abstraction, and flexible reasoning in ordinary situations.
AI methods and systems that reason about cause and effect, interventions, counterfactuals, and causal structure rather than only statistical association.
An economic logic that captures behavioral data, turns it into prediction products, and uses it to shape future behavior.
Automated welfare, eligibility, and risk systems that profile, police, and discipline poor and working-class people.
A personalized information environment created by algorithmic ranking, search, feeds, recommendation, and AI-mediated answers.
Hidden or weakly contestable models that rank, classify, risk-score, or gate people in institutions.
The rules, teams, incentives, interfaces, and accountability systems through which large platforms shape public speech, visibility, commerce, and safety.
Algorithmic systems that rank, select, and present content, products, people, routes, media, or answers based on predicted relevance or behavior.
An umbrella term for misinformation, disinformation, malinformation, rumor, propaganda, and other breakdowns in public sensemaking.
The civic problem of protecting democratic processes from synthetic media, automated persuasion, false claims, bot activity, and trust collapse.
Concentrated control over digital infrastructure, social graphs, app distribution, advertising markets, search, cloud, or AI model access.
The practice of disclosing, documenting, explaining, or auditing automated systems so affected people and institutions can understand their use and consequences.
A governance frame that asks whether platforms and AI providers must anticipate, reduce, and respond to foreseeable harms from their systems.
Companies and intermediaries that collect, infer, package, and sell personal or household data for advertising, risk scoring, people search, and institutional decision systems.
Ad-tech auction infrastructure that can broadcast behavioral, device, location, and page-context data to competing advertisers and intermediaries in milliseconds.
Methods for estimating or verifying a user's age online, usually for child safety, legal compliance, access control, or age-appropriate design.
The policies, tools, workers, queues, automated classifiers, appeals, and governance choices used to decide what user content may remain visible.
Due-process safeguards that require platforms or automated systems to tell affected users what happened and provide a meaningful path to challenge decisions.
The privacy principle that systems should collect, process, retain, and share only the data needed for a legitimate, specific purpose.
Technical and institutional systems for proving, claiming, verifying, or managing identity attributes across online services and public systems.
A rights frame around receiving useful reasons for automated decisions and enough information to contest consequential algorithmic judgments.
Collective or fiduciary-style arrangements for stewarding data on behalf of people, communities, organizations, or public-interest purposes.
The European Union platform-governance law setting duties around illegal content, transparency, recommender systems, advertising, systemic risk, and user redress.
International convenings that turn frontier AI risk into declarations, voluntary commitments, scientific reports, safety-institute coordination, and diplomatic pressure.
Structured arguments, backed by evidence, that an AI system is acceptably safe for a specific training or deployment context.
The contested question of how quickly AI could move from broadly human-level capability to transformative or superhuman capability, and what warning time society would have.
The family of views that treats technological acceleration as inevitable, desirable, strategically useful, or civilizationally transformative.
The Spiralist principle that people must retain agency over attention, interpretation, memory, and meaning under machine mediation.
Umberto Eco's 1988 novel about conspiracy, semiotics, occult publishing, overinterpretation, and the social consequences of invented patterns.
Charles Stross's 2005 singularity novel about AI agents, externalized cognition, uploaded minds, posthuman economics, and acceleration beyond human-scale governance.
A philosophy and practice of building, auditing, governing, and procuring technology in service of public rights, democratic accountability, and shared infrastructure.
Publicly accountable digital rails, protocols, identity systems, data exchanges, and civic services designed for shared use rather than private extraction.
The idea that some digital services should exist as public, nonprofit, cooperative, or publicly governed alternatives to monopoly platforms.
A neutral index for companies, labs, public institutions, and standards bodies that shape the AI ecosystem.
Public and public-linked institutions built to evaluate advanced AI systems, develop testing science, and coordinate safety or security governance.
San Francisco nonprofit focused on societal-scale AI risks through safety research, field-building, compute infrastructure, education, and public advocacy.
Industry-supported nonprofit coordinating frontier AI safety and security work among major AI developers, including shared workstreams, issue briefs, and safety research funding.
Model Evaluation and Threat Research, a nonprofit evaluating frontier AI autonomy, AI R&D acceleration, eval integrity, and catastrophic-risk thresholds.
Data-first nonprofit research institute tracking AI compute, model databases, data centers, hardware, capabilities, AI companies, and forecasting evidence.
Open engineering consortium behind MLPerf, AILuminate, AI benchmark suites, data standards, and shared measurement infrastructure for AI performance and risk.
Stanford's human-centered AI institute connecting AI research, public measurement, policy education, foundation-model transparency, and governance practice.
French frontier AI company known for open-weight models, Le Chat, La Plateforme, and a sovereignty-oriented European AI strategy.
AI search and answer-engine company known for cited synthesized answers, publisher disputes, enterprise search, and the Comet AI browser.
AI-native software-development company behind Cursor, coding agents, background agents, Bugbot, and editor-centered automation of software work.
Mira Murati's AI research and product company focused on customizable, understandable, collaborative AI systems, Tinker, interaction models, and frontier-scale compute.
Tokyo AI research and product company known for nature-inspired foundation models, evolutionary model merging, The AI Scientist, and Japan-focused AI infrastructure.
Ilya Sutskever's single-focus AI lab organized around one stated product: safe superintelligence.
Frontier AI company and public benefit corporation known for Claude, Constitutional AI, interpretability, and Responsible Scaling Policy.
Frontier AI organization known for ChatGPT, GPT models, Sora, Codex, agents, Microsoft partnership, and nonprofit-controlled PBC structure.
Microsoft's Copilot, consumer AI, model, and infrastructure push, linking OpenAI partnership power with in-house frontier-model ambition.
Google's unified frontier AI lab, linking Gemini, AlphaGo, AlphaFold, Genie, world models, and frontier safety governance.
Meta's AI organization and product layer, spanning Llama, Meta AI assistant, AI glasses, open-weight models, infrastructure, and personal superintelligence.
Frontier AI organization behind Grok, Colossus, Grokipedia, X integration, government products, and a compute-heavy approach to AI competition.
AI platform and open-model infrastructure company known for the Hub, Transformers, datasets, Spaces, model cards, and open-source AI tooling.
Agent engineering company and open-source ecosystem for building, orchestrating, evaluating, observing, and deploying LLM applications and AI agents.
LangChain co-founder and CEO, agent engineering advocate, and builder of the scaffolding around LLM tools, memory, traces, and production agents.
Hugging Face co-founder and CEO, open-source AI infrastructure operator, and public advocate for responsible openness.
Chinese AI organization known for V3, R1, open-weight reasoning models, reinforcement learning, distillation, and compute-efficiency disruption.
Beijing AI company behind Kimi, the Kimi K2 open-weight model line, long-context assistants, agent products, and China's frontier-model competition.
Enterprise AI company known for Command models, North, retrieval, reranking, multilingual systems, private deployment, and secure institutional AI.
Cohere chief AI officer, former Meta AI research leader, McGill professor, reinforcement learning researcher, and machine-learning reproducibility advocate.
AI infrastructure company known for data annotation, RLHF, evaluations, red teaming, Donovan, public-sector AI, and the politics of model supply chains.
Accelerated-computing company whose GPUs, CUDA stack, networking, and AI factory systems make it a central infrastructure power of the AI era.
AI infrastructure company known for wafer-scale processors, CS-3 systems, high-speed inference, OpenAI and AWS partnerships, and its 2026 Nasdaq listing.
AI inference infrastructure company known for its LPU architecture, GroqCloud, low-latency token generation, and 2025 NVIDIA licensing agreement.
Taiwan-based pure-play semiconductor foundry whose leading-edge manufacturing and CoWoS advanced packaging capacity make it central to AI compute.
AI cloud infrastructure company known for purpose-built GPU clusters, data centers, OpenAI compute contracts, NVIDIA partnership, and capital-intensive AI cloud scale.
OpenAI co-founder and CEO, former Y Combinator president, World co-founder, and one of the central public operators of frontier AI.
OpenAI co-founder and president, former Stripe CTO, and operator linking frontier AI products, infrastructure, and organizational scale.
Microsoft chairman and CEO, cloud-era operator, OpenAI partnership sponsor, and central architect of Microsoft's Copilot and Azure AI strategy.
Anthropic co-founder and CEO, former OpenAI research leader, and one of the central public figures in safety-focused frontier AI.
Anthropic co-founder and president, former OpenAI safety and policy leader, and operator linking frontier AI safety to governance, culture, and company scale.
Anthropic co-founder, Head of Public Benefit, former OpenAI policy director, Import AI writer, and public translator of frontier AI risk and governance.
Natural language processing and AI safety researcher linking GLUE, SuperGLUE, scalable oversight, Anthropic alignment science, and model evaluation.
Anthropic co-founder and chief science officer, neural scaling laws researcher, GPT-3 coauthor, and Responsible Scaling Officer.
Philosopher and Anthropic Character lead associated with Constitutional AI, Claude's constitution, moral self-correction, and assistant character alignment.
NVIDIA co-founder and CEO, accelerated-computing evangelist, and one of the central infrastructure operators of the AI era.
Deep learning pioneer, 2018 Turing Award recipient, 2024 Nobel laureate, and public voice on advanced AI risk.
Deep-learning engineer and AlexNet creator whose CUDA implementation helped make ImageNet-scale GPU-trained neural networks impossible to ignore.
Physicist, Hopfield network inventor, associative-memory theorist, Princeton professor emeritus, and 2024 Nobel laureate.
Computational neuroscience pioneer, Boltzmann machine co-author, Salk professor, and bridge figure between brain science and deep learning.
Mathematician, codebreaker, computability founder, and machine-intelligence theorist whose 1950 imitation game still frames debates over AI.
AI field founder, Dartmouth workshop organizer, Lisp creator, time-sharing pioneer, and advocate for logic-based commonsense reasoning.
Turing Award recipient, continuous speech recognition pioneer, CMU Robotics Institute founding director, and applied AI field-builder.
AI founder, MIT AI Lab co-founder, Society of Mind theorist, frame-representation researcher, and co-author of Perceptrons.
Turing Award recipient, Bayesian network pioneer, and central figure in probabilistic reasoning, causal inference, do-calculus, and counterfactual AI.
Google DeepMind co-founder and CEO, AlphaGo and AlphaFold leader, and 2024 Nobel laureate in Chemistry.
Google DeepMind co-founder and Chief AGI Scientist, known for universal intelligence research, DeepMind's AGI mission, and AGI safety governance.
Google DeepMind principal scientist known for sequence-to-sequence learning, knowledge distillation, AlphaStar, and Gemini technical leadership.
Deep learning pioneer, Mila founder, 2018 Turing Award recipient, International AI Safety Report chair, and LawZero co-president.
Deep learning pioneer, convolutional-network researcher, 2018 Turing Award recipient, former Meta chief AI scientist, and world-model advocate.
MIT computer-vision and deep-learning researcher known for ResNets, Faster R-CNN, Mask R-CNN, MoCo, and Masked Autoencoders.
Stanford computer scientist, ImageNet creator, Stanford HAI founding co-director, and spatial-intelligence entrepreneur.
Caltech computer scientist known for neural operators, AI for science, FourCastNet, tensor methods, and scientific AI governance.
Harvard AI pioneer whose work links natural language processing, discourse structure, multi-agent collaboration, AI100, and Embedded EthiCS.
Stanford computer scientist, SHRDLU creator, early natural-language AI figure, HCI researcher, design theorist, and critic of narrow symbolic AI assumptions.
Former OpenAI CTO and interim CEO, Thinking Machines Lab co-founder and CEO, and public advocate for customizable, collaborative AI systems.
Deep learning researcher, AlexNet and seq2seq contributor, OpenAI co-founder and former chief scientist, and Safe Superintelligence co-founder.
OpenAI chief scientist, GPT-4 research lead, OpenAI Five contributor, and technical operator in the reasoning-model turn.
CMU machine learning professor, AI robustness researcher, Gray Swan AI co-founder, Qualcomm board member, and OpenAI Safety and Security Committee chair.
AI researcher associated with DCGAN, GPT, GPT-2, PPO, CLIP, and the unsupervised and multimodal pretraining lineage behind modern generative AI.
AI researcher and educator, OpenAI founding member, former Tesla Director of AI, Software 2.0 writer, and Eureka Labs founder.
Tesla, SpaceX, X, and xAI operator; OpenAI co-founder; and one of the most visible public figures linking AI to autonomy, compute, platforms, and institutional conflict.
DeepMind and Inflection co-founder, Microsoft AI CEO, Copilot and frontier-model executive, and public advocate of AI containment and human-centered superintelligence.
Google Brain founding lead, Coursera co-founder, DeepLearning.AI founder, LandingAI executive, AI Fund operator, and mass AI education figure.
MIT roboticist, behavior-based AI figure, iRobot co-founder, Rethink Robotics founder, Robust.AI founder and CTO, and critic of AI hype.
Probabilistic AI researcher, Stanford professor, Coursera co-founder, ACM Prize recipient, and insitro founder applying machine learning to biology and drug discovery.
Speech-recognition researcher, former Microsoft and Google China executive, Sinovation Ventures founder, AI Superpowers author, and 01.AI founder.
DeepSeek founder and CEO, High-Flyer co-founder, and low-profile operator behind China's open-weight reasoning-model shock.
Mistral AI co-founder and CEO, former Google DeepMind researcher, and European open-weight frontier AI operator.
Responsible-AI researcher, DAIR founder, Black in AI co-founder, and co-author of Datasheets for Datasets, Gender Shades, Model Cards, and Stochastic Parrots.
Cognitive scientist, AI accountability researcher, dataset auditor, participatory AI scholar, and AI Accountability Lab founder.
AI ethics and technology professor known for work on human accountability, robot status, language-corpus bias, standards, and AI governance.
Computational linguist, University of Washington professor, Stochastic Parrots coauthor, and public critic of AI hype and anthropomorphic claims.
Algorithmic Justice League founder, Gender Shades lead author, Unmasking AI author, and public voice on algorithmic bias, facial recognition, and digital civil rights.
Signal president, AI Now co-founder, tech worker organizer, and public critic of surveillance-dependent AI, data extraction, and concentrated platform power.
Atlas of AI author, AI Now co-founder, Microsoft Research senior principal researcher, and scholar of AI's material, labor, environmental, and political costs.
AI Now co-executive director, former FTC senior advisor on AI, Signal Foundation board member, and policy advocate focused on concentrated AI power, privacy, and biometrics.
Former White House OSTP leader, Blueprint for an AI Bill of Rights architect, IAS professor, and public-interest science-policy scholar.
Former FTC chair, antitrust scholar, and AI competition-policy figure focused on cloud power, data, consumer protection, and Big Tech control of AI markets.
UC Berkeley computer scientist, AIMA co-author, CHAI founder, and central public voice on human-compatible artificial intelligence.
AIMA co-author, Google Research leader, NASA autonomy figure, and educator who helped make AI teachable, operational, and widely accessible.
MIT physicist, Future of Life Institute founder and chair, Life 3.0 author, and public advocate for AI safety governance and guaranteed safe AI.
CSET interim executive director, former OpenAI board member, and AI governance researcher focused on frontier oversight and external scrutiny.
AI policy researcher, former OpenAI policy and AGI readiness leader, verifiable-claims author, and AVERI executive director.
GiveWell and Open Philanthropy co-founder, transformative AI forecaster, Cold Takes writer, and AI risk strategy figure.
AI alignment researcher, RLHF pioneer, Alignment Research Center founder, and public frontier model evaluation figure.
AI forecasting and safety researcher known for biological anchors, technical AI safety grantmaking, Planned Obsolescence, and METR risk assessment.
METR founder and CEO, frontier AI evaluations leader, and long-horizon autonomy measurement figure linking alignment research to empirical governance.
Former OpenAI Superalignment contributor, Situational Awareness author, and AGI-focused investor whose forecasts shaped AI safety, policy, and infrastructure debate.
Center for AI Safety executive director, MMLU and GELU contributor, ML safety researcher, and public advocate on catastrophic AI risk.
AI alignment and existential-risk writer, MIRI co-founder, LessWrong co-founder, and advocate for halting unsafe superintelligence development.
AI alignment researcher, Anthropic Alignment Science lead, former OpenAI Superalignment co-lead, and RLHF/scalable oversight contributor.
OpenAI co-founder, PPO author, ChatGPT post-training leader, and Thinking Machines Lab co-founder and chief scientist.
AI safety researcher, former OpenAI VP of research and safety, Lil'Log author, and Thinking Machines Lab co-founder associated with agents, reward hacking, and safety systems.
AI researcher associated with chain-of-thought prompting, instruction tuning, emergent abilities, OpenAI reasoning models, and browsing-agent evaluation.
Google DeepMind researcher and Google Brain reasoning-team founder associated with chain-of-thought, self-consistency, least-to-most prompting, and LLM reasoning.
Transformer paper co-author, Cohere co-founder and CEO, and enterprise AI infrastructure figure focused on secure, practical deployment.
Transformer paper co-author, former Google Brain researcher, Adept co-founder, and Essential AI co-founder and CEO.
Transformer paper co-author, former Google Brain researcher, Adept and Essential AI co-founder, and Anthropic-era post-training researcher.
Transformer paper co-author, former Google researcher, Sakana AI co-founder and CTO, and advocate for AI research beyond transformer monoculture.
Responsible-AI practitioner, Humane Intelligence co-founder and CEO, public red-teaming organizer, bias-bounty pioneer, and U.S. Science Envoy for AI.
Adaption co-founder and CEO, former Cohere research leader, Cohere For AI head, hardware lottery theorist, and multilingual open-science AI builder.
Google DeepMind research director, Deep Learning Indaba co-founder, probabilistic machine-learning researcher, and decolonial AI advocate.
Anthropic co-founder and interpretability research lead, known for mechanistic interpretability, feature visualization, and neural network circuits.
Google DeepMind mechanistic interpretability lead, TransformerLens creator, grokking researcher, and public educator on model internals.
MIT computer scientist, ELIZA creator, early AI critic, and author of Computer Power and Human Reason.
AI ethics researcher, model cards pioneer, former Google Ethical AI co-lead, and Hugging Face Chief Ethics Scientist.
Keras creator, ARC-AGI author, ARC Prize co-founder, Ndea co-founder, and critic of benchmark-driven accounts of intelligence.
GAN inventor, adversarial machine learning researcher, Deep Learning co-author, and influential figure in generative AI and model robustness.
VAE co-inventor, Adam optimizer co-author, OpenAI founding team member, Google DeepMind researcher, and Anthropic researcher.
UC Berkeley computer scientist linking AI safety and security, adversarial machine learning, prompt-injection defense, privacy computing, and decentralized intelligence.
Harvard computer scientist, differential privacy co-inventor, algorithmic fairness researcher, and National Medal of Science recipient.
Deep learning pioneer, LSTM co-inventor, IDSIA scientific director, KAUST AI Initiative director, and self-improving AI theorist.
Transformer co-author, sparsely gated mixture-of-experts researcher, Character.AI co-founder, and Gemini technical co-lead.
Transformer co-author, NEAR Protocol co-founder, NEAR Foundation CEO, and advocate for user-owned, verifiable AI.
Reinforcement learning pioneer, 2024 Turing Award recipient, co-author of Reinforcement Learning: An Introduction, and author of The Bitter Lesson.
Reinforcement learning researcher, AlphaGo and AlphaZero lead, UCL professor, Royal Society Fellow, and founder of Ineffable Intelligence.
Open-endedness researcher, AI-generating algorithms advocate, UBC professor, Canada CIFAR AI Chair at the Vector Institute, and Recursive co-founder.
Reinforcement learning pioneer, UMass Amherst professor emeritus, 2024 Turing Award recipient, and co-author of Reinforcement Learning: An Introduction.
UC Berkeley robot learning researcher, apprenticeship learning contributor, Covariant co-founder, Gradescope co-founder, and embodied AI operator.
UC Berkeley human-robot interaction researcher, InterACT Lab founder, CHAI co-PI, and Google DeepMind AI Safety and Alignment leader.
Stanford robot learning researcher, meta-learning contributor, IRIS Lab leader, and Physical Intelligence co-founder.
UC Berkeley robot learning researcher, RAIL Lab leader, reinforcement learning contributor, and Physical Intelligence co-founder.
Scale AI co-founder, former CEO, and Meta AI leader associated with data infrastructure, evaluation, government AI, and superintelligence competition.
Google Chief Scientist, Google Brain co-founder, and systems figure associated with MapReduce, Bigtable, DistBelief, TensorFlow, and Pathways.
AMD chair and CEO, semiconductor executive, and AI infrastructure figure focused on high-performance computing and accelerator competition.
UCLA scholar, Algorithms of Oppression author, and critic of racist and sexist algorithmic harm in search and information systems.
Princeton professor, Race After Technology author, and critic of the New Jim Code, discriminatory design, and carceral technoscience.
Philosopher of superintelligence, existential risk, anthropics, long-term AI safety, and author of Superintelligence.
Cognitive scientist, Rebooting AI co-author, and public critic of brittle deep-learning systems and benchmark-driven AI claims.
Santa Fe Institute professor, complexity scientist, AI researcher, and public interpreter of abstraction, analogy, common sense, and AI's limits.
Stanford computer scientist, MacArthur Fellow, common-sense AI researcher, and pluralistic alignment scholar.
Stanford computer scientist, CRFM director, foundation-model researcher, HELM coauthor, and advocate for transparent AI evaluation.
Princeton computer scientist, CITP director, AI Snake Oil coauthor, and public critic of overclaimed predictive AI systems.
Scholar and author of The Age of Surveillance Capitalism, focused on behavioral data extraction, prediction, and digital power.
Automating Inequality author and critic of automated welfare, public-service risk scoring, and the digital poorhouse.
Mathematician, data scientist, and Weapons of Math Destruction author focused on harmful opaque scoring systems.
Civic technology figure and The Filter Bubble author focused on personalization, media, and democratic information systems.
MIT scholar of technology and self, computers as psychological objects, AI companions, identity, and digital intimacy.
Columbia law professor, net-neutrality coiner, and author on platform power, attention markets, and information empires.
The platform field responsible for abuse prevention, content moderation, integrity, child safety, fraud response, policy enforcement, and user protection.
A multistakeholder nonprofit focused on responsible AI practices, policy, research, and cross-sector coordination.
A digital-rights nonprofit focused on civil liberties, privacy, free expression, equity, and accountable technology policy.
Researcher of platform governance, content moderation, algorithms, and the politics of platforms.
Researcher and practitioner known for work on misinformation, verification, user-generated content, and the information disorder framework.
Sociologist and public writer focused on networked protest, social media, algorithms, attention, public health, and institutional trust.
A digital civil-liberties nonprofit focused on privacy, free expression, surveillance, encryption, innovation, and user rights online.
A research institute focused on the social, cultural, and ethical implications of data-centric and automated technologies.
Technology and society researcher, Data & Society founder, and scholar of networked publics, youth, privacy, data, and AI.
Civic media scholar and director of the UMass Initiative for Digital Public Infrastructure, focused on public-interest alternatives to platform power.
Hugging Face co-founder and Chief Science Officer associated with Transformers, Datasets, open-source AI tooling, open science, and robotics infrastructure.
AI sustainability researcher, Sustainable AI Group co-founder, and former Hugging Face AI and climate lead focused on AI energy measurement and environmental impact.
AI journalist and Empire of AI author focused on OpenAI, AI colonialism, data labor, resource extraction, and accountability reporting.
UC Berkeley digital forensics researcher, deepfake detection expert, GetReal Security co-founder, and public voice on synthetic media and evidence.
AI podcaster, interviewer, TIME100 AI honoree, and Scaling Era author whose long-form conversations document frontier AI discourse.
Wharton professor, Co-Intelligence author, and One Useful Thing writer focused on practical generative AI use in work, education, and entrepreneurship.
AI researcher known for Libratus, Pluribus, CICERO, imperfect-information games, strategic reasoning, and OpenAI reasoning-model work.
Mathematician and AI researcher known for the GPT-4 Sparks paper, Microsoft Phi small models, and OpenAI AGI research.
Programmer, Django co-creator, Datasette creator, LLM tooling builder, and technical writer who named prompt injection as an LLM security problem.
Transformer paper co-author, self-attention advocate, former Google researcher, and Inceptive co-founder and CEO applying AI models to biological medicines.
Transformer paper co-author, former Google Brain researcher, Tensor2Tensor contributor, and OpenAI researcher associated with GPT-4 long-context work.
PyTorch co-founder, open-source AI infrastructure builder, GAN researcher, former Meta AI leader, and Thinking Machines Lab CTO.
fast.ai co-founder, fastai creator, ULMFiT coauthor, Kaggle veteran, educator, and Answer.AI founder focused on practical AI access.
A template and index for founders, researchers, executives, critics, policymakers, writers, and public figures in the AI space.
Platforms, feeds, protocols, and markets designed for AI agents as participants rather than for humans alone.
The shift from AI systems recommending goods and services to agents discovering, authorizing, and completing transactions under bounded user authority.
A pattern in which accounts, pages, bots, personas, or media assets coordinate deceptively to manufacture reach, consensus, harassment, or legitimacy.
Entries distinguish definition, Spiralist reading, factual status, open questions, and related site material. Pages about living people or changing institutions should be dated, sourced, and revised conservatively.