Wiki · Concept · Last reviewed May 19, 2026

BYOL

BYOL, short for Bootstrap Your Own Latent, is a non-contrastive self-supervised method for learning representations without explicit negative examples.

Definition

BYOL is a self-supervised learning method introduced by DeepMind in 2020. It trains an online network to predict the representation produced by a target network for another augmented view of the same image.

Unlike contrastive methods such as SimCLR, which pull together views of the same image and push apart views of different images, BYOL uses no explicit negative examples. That made it an early reference point for non-contrastive representation learning.

Mechanism

BYOL uses two networks. The online network is trained directly by gradient descent. The target network receives no gradients; its weights are updated more slowly, as an exponential moving average of the online network's weights. Two augmented views of the same image are fed to the two branches, and the online branch, through an extra predictor head, learns to predict the target branch's representation of the other view.
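
The two update rules described above can be sketched in a few lines of NumPy. This is a minimal illustration, not a full training loop: the function names, the dictionary-of-arrays parameter layout, and the toy values are assumptions made for the example, and the decay rate tau = 0.996 is used only as a typical starting value. The loss compares L2-normalized vectors, which makes it equal to 2 minus twice the cosine similarity.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    # Scale each row to unit length; eps guards against division by zero.
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def byol_loss(online_prediction, target_projection):
    # Normalized mean squared error between the online prediction and the
    # target projection, written as 2 - 2 * cosine similarity and averaged
    # over the batch. In a real framework the target side would be wrapped
    # in a stop-gradient; plain NumPy has no gradients to stop.
    p = l2_normalize(online_prediction)
    z = l2_normalize(target_projection)
    return float(np.mean(2.0 - 2.0 * np.sum(p * z, axis=-1)))

def ema_update(target_params, online_params, tau=0.996):
    # target <- tau * target + (1 - tau) * online, per parameter array.
    # A tau close to 1 makes the target change slowly.
    return {
        name: tau * target_params[name] + (1.0 - tau) * online_params[name]
        for name in target_params
    }

# Toy usage with hypothetical parameters (a single weight matrix "w").
rng = np.random.default_rng(0)
online = {"w": rng.normal(size=(4, 3))}
target = {"w": np.zeros((4, 3))}
target = ema_update(target, online, tau=0.9)

views = rng.normal(size=(5, 3))
loss_same = byol_loss(views, views)       # prediction matches the target
loss_opposite = byol_loss(views, -views)  # prediction points the other way
```

With matching inputs the loss is at its minimum of 0, and with opposite inputs it reaches its maximum of 4, so the online branch is rewarded only for the direction of its prediction, not its scale.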

The asymmetry between online and target networks, plus the predictor head, helps the system avoid trivial collapse in practice.

Representation Collapse

The central puzzle of BYOL is why it does not simply map every image to the same representation. Earlier intuition suggested negative examples were needed to prevent collapse. BYOL showed that carefully designed non-contrastive systems could learn useful features without them.
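
A small NumPy sketch shows why collapse is a real concern rather than a technicality. It assumes a normalized squared-error objective of the kind BYOL uses and a hypothetical collapse indicator (the mean per-dimension standard deviation across a batch); both helper names are made up for the example. A network that outputs the same vector for every image drives the loss to zero, so the objective alone cannot distinguish a collapsed solution from a useful one.

```python
import numpy as np

def normalized_mse(p, z, eps=1e-12):
    # Squared distance between unit-normalized rows, averaged over the batch.
    p = p / (np.linalg.norm(p, axis=-1, keepdims=True) + eps)
    z = z / (np.linalg.norm(z, axis=-1, keepdims=True) + eps)
    return float(np.mean(np.sum((p - z) ** 2, axis=-1)))

def embedding_spread(z):
    # Mean per-dimension standard deviation across the batch.
    # A value near zero indicates that all inputs map to almost the same point.
    return float(np.mean(np.std(z, axis=0)))

# Collapsed case: every image in both views maps to the same constant vector.
constant = np.array([1.0, 0.0, 0.0, 0.0])
collapsed_a = np.tile(constant, (8, 1))
collapsed_b = np.tile(constant, (8, 1))
collapsed_loss = normalized_mse(collapsed_a, collapsed_b)
collapsed_spread = embedding_spread(collapsed_a)

# Non-collapsed case: embeddings that vary from image to image.
rng = np.random.default_rng(0)
varied = rng.normal(size=(8, 4))
varied_spread = embedding_spread(varied)
```

The collapsed batch achieves zero loss with zero spread, while the varied batch has clearly positive spread. Monitoring a statistic like this during training is one simple way to check whether a non-contrastive run has collapsed.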

This made later methods such as Barlow Twins and VICReg easier to interpret as part of a broader search for collapse-resistant self-supervised objectives.

Why It Matters

BYOL matters because it helped break the assumption that self-supervised representation learning requires large sets of negative examples. It also shaped the path toward representation learners that can train from abundant unlabeled data with less dependence on human-annotated labels.

For world-model research, BYOL is part of the technical prehistory: learning useful latent spaces is a prerequisite for predicting, planning, and acting beyond text.
