You Only Train Once: Loss-conditional training of deep networks

Alexey Dosovitskiy, Josip Djolonga, ICLR 2020


The paper proposes a simple and broadly applicable approach that efficiently handles multi-term loss functions and, more generally, arbitrarily parameterized families of loss functions. Instead of training a separate model per loss setting, it trains a single model that simultaneously minimizes a whole family of losses.

Concept behind the paper

The main concept is to train a single model that covers all choices of coefficients of the loss terms, instead of training a separate model for each set of coefficients. This is achieved by:

- sampling a vector of loss coefficients at each training step, rather than fixing it in advance;
- feeding this coefficient vector to the network as an additional conditioning input;
- optimizing the loss weighted by those sampled coefficients.

This way, at inference time the conditioning vector can be varied, allowing us to traverse the space of models corresponding to loss functions with different coefficients.
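The training loop above can be sketched on a toy problem. This is a minimal illustration, not the paper's implementation: the paper conditions a deep network on the coefficient vector (e.g. via feature-wise modulation), whereas here a one-parameter linear model stands in for the network, the loss family is an assumed mix of MSE and MAE, and the uniform sampling distribution is also an assumption:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: y = 2x + Gaussian noise.
x = rng.uniform(-1.0, 1.0, size=(256, 1))
y = 2.0 * x + 0.1 * rng.standard_normal((256, 1))

# Loss-conditioned linear model: the slope depends on the sampled
# coefficient lam, w(lam) = w0 + w1 * lam -- a crude stand-in for the
# conditioning used with deep networks.
w = np.zeros(2)
lr = 0.05

for step in range(2000):
    lam = rng.uniform()                    # sample the loss coefficient
    pred = (w[0] + w[1] * lam) * x
    err = pred - y
    # Combined loss: lam * MSE + (1 - lam) * MAE.
    # g is the gradient of that loss w.r.t. the prediction.
    g = (lam * 2.0 * err + (1.0 - lam) * np.sign(err)) / len(x)
    w[0] -= lr * np.sum(g * x)             # d pred / d w0 = x
    w[1] -= lr * np.sum(g * x) * lam       # d pred / d w1 = lam * x

# At inference time, vary lam to traverse the family of models.
slope_mae = w[0]            # lam = 0: pure-MAE end of the family
slope_mse = w[0] + w[1]     # lam = 1: pure-MSE end of the family
```

Since the noise here is symmetric, both ends of the family should recover a slope close to 2; with asymmetric or heavy-tailed noise the two ends would differ, and sweeping `lam` would interpolate between them.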


This training procedure is illustrated in the diagram below for the style transfer task:
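In the style-transfer setting, each training step samples the style-loss coefficient and uses it twice: as the conditioning input to the network and as the weight in the loss. A minimal sketch of that step; the log-uniform range and the function names are illustrative assumptions, not taken from the paper:

```python
import math
import random

def sample_style_weight(lo=1e-1, hi=1e3):
    # Log-uniform sampling of the style coefficient -- a common choice
    # (assumed here) when coefficients span several orders of magnitude.
    return math.exp(random.uniform(math.log(lo), math.log(hi)))

def combined_loss(content_loss, style_loss, lam):
    # Total loss for one training example conditioned on lam:
    # the content term plus the sampled weight times the style term.
    return content_loss + lam * style_loss
```

At each step, `lam = sample_style_weight()` would be fed to the stylization network as its conditioning vector, and `combined_loss(...)` backpropagated; at inference time the user picks `lam` to control the content/style trade-off.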

Main contributions

Our two cents