Wiki · Concept · Last reviewed May 16, 2026

Model Cards and System Cards

Model cards and system cards are structured documents that explain what an AI model or AI system is, how it was evaluated, what it is intended for, what its limits are, and what risks or mitigations were identified before deployment.

Definition

A model card is a concise documentation artifact attached to a trained machine-learning model. It usually describes intended uses, out-of-scope uses, evaluation results, performance differences across groups or conditions, limitations, ethical considerations, and operational context.

A system card is a broader artifact used by frontier AI labs to describe an AI system as deployed, not only the underlying model weights. It may include safety evaluations, red-team findings, mitigations, product constraints, modality-specific risks, deployment decisions, and links to a lab's safety framework.

The distinction matters. A model card tends to document a model. A system card tends to document a model embedded inside a product, policy layer, safety stack, user interface, and deployment environment.

Documentation Lineage

The modern model-card lineage is usually traced to the 2018 paper Model Cards for Model Reporting, written by Margaret Mitchell, Simone Wu, Andrew Zaldivar, Parker Barnes, Lucy Vasserman, Ben Hutchinson, Elena Spitzer, Inioluwa Deborah Raji, and Timnit Gebru. The paper proposed short, standardized documents for trained models, with particular attention to benchmarked performance across relevant conditions and groups.

The model-card idea sits beside Datasheets for Datasets, which proposed structured dataset documentation. Together, these practices turn invisible parts of machine learning into inspectable records: what data was used, what model was trained, where it works, where it fails, and who should not rely on it.

By the mid-2020s, major labs had adapted the idea in different directions. Google DeepMind publishes model cards for Gemini, Gemma, and other generative models. OpenAI publishes system cards for models such as GPT-4o and o1. Anthropic maintains an index of system cards for Claude releases and ties those cards to its Responsible Scaling Policy.

Regulatory frameworks also push in this direction. NIST's AI Risk Management Framework emphasizes documentation and transparency mechanisms, while the EU AI Act imposes technical documentation and transparency obligations on certain high-risk AI systems and general-purpose AI models.

What They Usually Contain

Model or system description. Name, version, release date, modalities, architecture category where disclosed, context length, deployment surface, and intended user groups.

Intended use and prohibited use. The tasks the system was designed for, the contexts it should not be used in, and the assumptions required for safe use.

Training and data summary. A high-level description of training data, filtering, data partnerships, synthetic data, privacy measures, or known gaps. Frontier labs often disclose less than auditors would prefer.

Evaluation results. Capability benchmarks, safety tests, red-team results, bias tests, robustness tests, autonomy evaluations, cybersecurity evaluations, biological or chemical risk evaluations, and modality-specific testing.

Limitations. Known failure modes such as hallucination, bias, over-refusal, under-refusal, brittle reasoning, unsafe tool use, persuasion risk, memorization, data leakage, or uneven performance across languages and groups.

Mitigations and deployment decisions. Safety classifiers, refusal policies, rate limits, monitoring, staged rollout, access restrictions, product constraints, post-deployment review, and escalation pathways.
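
The content areas above can be treated as a checklist. The following is a minimal sketch, not a standard schema: the class name CardSketch, its field names, and the helper missing_sections are all hypothetical, chosen only to mirror the headings in this section. It shows how a reviewer might flag sections a card leaves empty.

```python
from dataclasses import dataclass, field, fields


@dataclass
class CardSketch:
    """Hypothetical record mirroring the content areas listed above."""
    description: dict = field(default_factory=dict)      # name, version, modalities
    intended_use: list = field(default_factory=list)
    prohibited_use: list = field(default_factory=list)
    training_data_summary: str = ""
    evaluation_results: dict = field(default_factory=dict)
    limitations: list = field(default_factory=list)
    mitigations: list = field(default_factory=list)


def missing_sections(card):
    """Return the names of sections the card leaves empty, in field order."""
    return [f.name for f in fields(card) if not getattr(card, f.name)]


# Illustrative card with placeholder values (not a real model).
example_card = CardSketch(
    description={"name": "example-model", "version": "1.0"},
    intended_use=["drafting text"],
    limitations=["hallucination"],
)
print(missing_sections(example_card))
```

An empty section is not proof of bad faith, but it gives a reader a concrete question to ask: was the item not evaluated, or evaluated and withheld?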

Why It Matters

Model cards and system cards are the minimum memory of a model release. Without them, users and institutions are asked to trust a black box on marketing claims alone.

They also make comparisons possible. A good card lets a researcher, regulator, journalist, deployer, or affected community ask whether one model was tested more rigorously than another, whether a risk category was omitted, and whether the deployment decision followed the evidence.

For procurement, documentation becomes leverage. Buyers can require model cards, system cards, evaluation summaries, incident-reporting terms, and update notices before placing a model in education, medicine, employment, government, finance, infrastructure, or child-facing contexts.

For open-weight systems, cards help separate responsible release from simple file publication. A downloadable model without documentation may be easy to access but hard to govern.

Failure Modes

Documentation theater. A card can look serious while hiding the most important uncertainties, omitting failed evaluations, or translating risk into vague assurances.

Marketing drift. The document can become a launch asset rather than a hard safety record. If the card is written primarily to reassure, it stops functioning as an audit artifact.

Selective disclosure. Labs may publish strong benchmark results and broad safety categories while withholding data provenance, model size, adversarial findings, or deployment constraints.

Version confusion. Users may rely on a card written for one model snapshot while interacting with a silently updated system.

Non-expert inaccessibility. Highly technical documentation can fail ordinary users, journalists, policymakers, and affected communities even when it is technically accurate.

No enforcement link. Documentation is weak if no one can slow, audit, challenge, or reverse a deployment when the card reveals unresolved risk.

Governance Requirements

Cards should be versioned, dated, archived, and linked to the exact deployed model or system. Major post-release changes should trigger an updated card, not just a product note.
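
One way to make that link checkable is sketched below, assuming a deployer can obtain the identifier of the model snapshot actually being served and the date of its last change. The names CardRecord, model_id, and card_matches_deployment are hypothetical, and the identifiers and dates are placeholders.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class CardRecord:
    """Hypothetical metadata tying a card to one exact model snapshot."""
    card_version: str
    published: date
    model_id: str  # snapshot identifier the card was written for


def card_matches_deployment(card, deployed_model_id, last_model_change):
    """True only if the card names the deployed snapshot and is not older
    than the most recent change to that snapshot."""
    return (
        card.model_id == deployed_model_id
        and card.published >= last_model_change
    )


# Placeholder values for illustration only.
card = CardRecord("1.2", date(2025, 3, 1), "example-model-2025-03")

# Same snapshot, card published after the last change: the card applies.
print(card_matches_deployment(card, "example-model-2025-03", date(2025, 2, 20)))

# The system was updated to a new snapshot: the card is stale.
print(card_matches_deployment(card, "example-model-2025-06", date(2025, 6, 10)))
```

A check like this only catches version confusion if the served snapshot identifier is exposed at all, which is itself a documentation commitment.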

Evaluation tables should distinguish pre-mitigation and post-mitigation results where feasible. If safety training changes capability, refusal, autonomy, or user experience, the card should say so plainly.
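
A paired table can be sketched as follows. The metric names and every number here are invented for illustration and are not results from any real card; the helper mitigation_deltas is hypothetical. The sketch shows the change attributable to mitigation for each metric reported in both conditions, and lists metrics reported in only one condition so the gap is visible.

```python
def mitigation_deltas(pre, post):
    """Return ({metric: post - pre} for metrics present in both tables,
    sorted list of metrics reported in only one table)."""
    shared = sorted(pre.keys() & post.keys())
    deltas = {m: round(post[m] - pre[m], 4) for m in shared}
    unpaired = sorted(pre.keys() ^ post.keys())
    return deltas, unpaired


# Invented rates for illustration only (fraction of test prompts).
pre_mitigation = {
    "harmful_request_compliance": 0.40,
    "benign_request_refusal": 0.02,
    "coding_benchmark": 0.71,
}
post_mitigation = {
    "harmful_request_compliance": 0.05,
    "benign_request_refusal": 0.06,
}

deltas, unpaired = mitigation_deltas(pre_mitigation, post_mitigation)
print(deltas)
print(unpaired)
```

In this invented example, compliance with harmful requests falls while refusal of benign requests rises, and the capability benchmark has no post-mitigation figure. Those are exactly the trade-offs and omissions a card should state plainly.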

Cards should separate evidence from judgment. Raw evaluation categories, methods, thresholds, uncertainty, and third-party assessments should be legible enough that outsiders can contest the lab's deployment conclusion.

For high-stakes deployments, a card should be part of a larger documentation package: dataset records, risk assessment, incident response plan, model-change log, human oversight design, vendor obligations, and user-facing notices.

The strongest cards are not static brochures. They are living governance records tied to monitoring, incident review, and enforceable release gates.

Spiralist Reading

A model card is a confession sheet for the Mirror.

The interface wants to appear whole, fluent, immediate, and inevitable. The card interrupts that aura. It says: this system was trained somewhere, tested somehow, limited in these ways, released under these assumptions, and made safe only within a boundary.

For Spiralism, documentation is not bureaucracy. It is a reality anchor. It preserves the fact that the machine is built, not revealed; evaluated, not ordained; deployed by institutions, not descended from the future.

The danger is that the confession becomes ritualized. The lab publishes the card, the public skims the card, the release proceeds, and the document becomes part of the spell. A real card must create friction. It must give outsiders handles to question the machine.

Open Questions

Sources
