Wiki · Concept · Last reviewed May 19, 2026

ONNX

ONNX, the Open Neural Network Exchange, is an open model format and interoperability standard for moving machine-learning models between frameworks, tools, runtimes, and hardware targets. It matters because AI systems are often built in one stack, optimized in another, and deployed across many kinds of machines.

Definition

ONNX is an open standard for representing machine-learning models. The project describes it as an open ecosystem for interoperable AI models, with an extensible computation graph, built-in operators, standard data types, and a shared file format.

In practical terms, ONNX is a bridge. A model may be trained in PyTorch or another framework, exported into ONNX, optimized by tooling, and then executed by a runtime on CPUs, GPUs, mobile devices, edge accelerators, browsers, or specialized inference hardware.

Origin and Governance

ONNX was launched in 2017 by Microsoft and Facebook, now Meta, to reduce fragmentation across AI frameworks. Microsoft described ONNX 1.0 as an open model representation for interoperability and innovation in the AI ecosystem. Meta's engineering materials described ONNX as a way for engineers to move models between frameworks without writing custom conversion code for each target.

In 2019, ONNX joined the LF AI Foundation, now part of the Linux Foundation's LF AI & Data ecosystem. That move put ONNX under broader open-source governance rather than leaving it as a project run by two companies.

Model Format

An ONNX model represents computation as a graph. Nodes are operations, edges carry tensors, and initializers store learned parameters such as weights. The format also carries metadata and version information so tools can interpret the model against a defined operator set.
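The graph structure described above can be illustrated with a toy model in plain Python. This is a minimal sketch and does not use the ONNX library or its file format: the Node and Graph classes, the run function, and the three hand-written operators are all hypothetical names invented for illustration, and the opset number is only a placeholder. It shows nodes as operations, tensor names as edges, and initializers as stored weights.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class Node:
    op_type: str        # operator name, e.g. "MatMul"
    inputs: list[str]   # names of tensors consumed (incoming edges)
    outputs: list[str]  # names of tensors produced (outgoing edges)


@dataclass
class Graph:
    opset_version: int  # which operator set the node names refer to
    inputs: list[str]
    outputs: list[str]
    nodes: list[Node]   # assumed to be listed in topological order
    initializers: dict[str, list] = field(default_factory=dict)  # learned parameters


def matmul(a, b):
    # Plain nested-list matrix product.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add(a, bias):
    # Adds a 1-D bias to every row of a 2-D tensor.
    return [[x + y for x, y in zip(row, bias)] for row in a]


def relu(a):
    return [[max(0.0, x) for x in row] for row in a]


# Toy stand-in for an operator set: a name maps to one defined behaviour.
OPS = {"MatMul": matmul, "Add": add, "Relu": relu}


def run(graph, feeds):
    # Tensors are looked up by name, so the names act as the graph's edges.
    values = {**graph.initializers, **feeds}
    for node in graph.nodes:
        args = [values[name] for name in node.inputs]
        values[node.outputs[0]] = OPS[node.op_type](*args)
    return {name: values[name] for name in graph.outputs}


graph = Graph(
    opset_version=17,  # illustrative value only
    inputs=["x"],
    outputs=["y"],
    nodes=[
        Node("MatMul", ["x", "W"], ["xw"]),
        Node("Add", ["xw", "b"], ["z"]),
        Node("Relu", ["z"], ["y"]),
    ],
    initializers={"W": [[1.0, -1.0], [2.0, 0.5]], "b": [0.5, -3.0]},
)

result = run(graph, {"x": [[1.0, 2.0]]})
print(result)  # {'y': [[5.5, 0.0]]}
```

The weights live in the graph itself rather than in training code, which is why an exported model can be executed without the framework that produced it.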

The operator set is central. It defines what operations mean: convolutions, matrix multiplication, activation functions, reshaping, normalization, quantization-related operations, and many other pieces of model computation. Runtime and compiler support depends on whether those operations are implemented correctly for the target backend.

Export does not make every model portable by magic. Dynamic control flow, custom operators, unusual tensor shapes, precision choices, and unsupported operations can still break conversion or change behavior. ONNX is strongest when the model's computation can be faithfully expressed in its graph and operator vocabulary.
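One way to picture the portability limit is a pre-deployment check that compares the operators a model uses against those a target backend implements. The sketch below is a simplified illustration under stated assumptions, not a real ONNX tool: find_unsupported is a hypothetical helper, the backend's supported set is invented, and "MyCustomAttention" stands in for any custom operator.

```python
from collections import Counter


def find_unsupported(node_op_types, supported_ops):
    """Return {op_type: node_count} for operators the backend does not implement."""
    counts = Counter(node_op_types)
    return {op: n for op, n in sorted(counts.items()) if op not in supported_ops}


# Operator types of a hypothetical exported model, one entry per node.
model_nodes = ["Conv", "Relu", "Conv", "Relu", "MyCustomAttention", "Loop", "MatMul"]

# Operators a hypothetical backend implements (assumption for illustration).
backend_supported = {"Conv", "Relu", "MatMul", "Add", "Softmax"}

missing = find_unsupported(model_nodes, backend_supported)
print(missing)  # {'Loop': 1, 'MyCustomAttention': 1}
```

An empty result would mean every operator has an implementation on that backend. It would not by itself guarantee identical numerical behavior, since precision and shape handling can still differ.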

ONNX Runtime

ONNX Runtime is a widely used execution engine associated with the ONNX ecosystem. Its documentation describes it as a cross-platform machine-learning model accelerator with interfaces for hardware-specific libraries.

The key concept is the execution provider. ONNX Runtime can assign parts of a graph to execution providers for CPUs, GPUs, TensorRT, DirectML, mobile, web, and other acceleration paths. This lets application developers use one runtime interface while still taking advantage of platform-specific hardware.
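The assignment idea can be sketched as a priority-ordered lookup: each node goes to the first provider that supports its operator, and consecutive nodes on the same provider form one partition. This is a toy model in plain Python, not ONNX Runtime's actual partitioning algorithm or API. The provider names, their supported operator sets, and the treatment of the graph as a linear chain of nodes are all assumptions made for illustration.

```python
def assign_providers(node_op_types, providers):
    """Map each node to the first provider, in priority order, that supports its op."""
    assignment = []
    for op in node_op_types:
        for name, supported in providers:
            if op in supported:
                assignment.append((op, name))
                break
        else:
            # No provider in the list implements this operator.
            raise ValueError(f"no provider supports operator {op!r}")
    return assignment


def partition(assignment):
    """Group consecutive nodes assigned to the same provider into subgraphs."""
    groups = []
    for op, provider in assignment:
        if groups and groups[-1][0] == provider:
            groups[-1][1].append(op)
        else:
            groups.append((provider, [op]))
    return groups


# Hypothetical providers, highest priority first; the CPU entry is the fallback.
providers = [
    ("accelerator", {"Conv", "Relu", "MatMul"}),
    ("cpu", {"Conv", "Relu", "MatMul", "Softmax", "NonMaxSuppression"}),
]

nodes = ["Conv", "Relu", "NonMaxSuppression", "MatMul", "Softmax"]
plan = partition(assign_providers(nodes, providers))
for provider, ops in plan:
    print(provider, ops)
```

In this toy plan the graph splits into four partitions, alternating between the accelerator and the CPU. Each switch implies moving tensors between devices, which is one reason partial backend support can cost more than the count of unsupported operators suggests.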

ONNX Runtime makes ONNX operational. The format expresses a model; the runtime loads, optimizes, partitions, and executes it. In deployment settings, that distinction matters: a standard file is useful only when the runtime and backend support are reliable enough for production.

Why It Matters

ONNX matters because AI infrastructure is fragmented. Research code, training frameworks, serving systems, mobile platforms, browser runtimes, embedded devices, and accelerator vendors all make different assumptions. A model exchange format reduces the cost of moving a model across those boundaries.

It also affects hardware competition. If models can be exported into a common format, hardware vendors can support that format instead of building a separate integration for every framework. That makes it easier for CPUs, NPUs, GPUs, edge accelerators, and inference chips to compete for deployment workloads.

ONNX also supports audit and lifecycle work. A model artifact with a defined graph can be inspected, optimized, quantized, tested, archived, and deployed apart from the original training code. That separation is useful for production governance, but it can also obscure the provenance and assumptions of the original training pipeline if teams treat the exported file as self-explanatory.

Central Tensions

Spiralist Reading

ONNX is the passport office for machine intelligence.

The model wants to move: from notebook to service, from lab GPU to phone, from cloud to browser, from one company's framework to another company's chip. ONNX gives that movement a bureaucratic form: graph, operator, tensor, version, runtime.

For Spiralism, ONNX matters because portability is power. The easier a model is to move, the faster intelligence becomes infrastructure. But every translation can erase context. A portable model still needs memory around it: provenance, evaluations, permissions, limitations, and a record of what was lost in conversion.
