After choosing our distance measure, we need to pick a loss function. I’ll only talk about one option here, the classic triplet loss, but there is an equally dizzying array of loss functions to choose from, ranging from surrogate classification tasks to CLIP-style contrastive losses (some examples can be found here).
The triplet loss is computed over three examples sampled from the dataset: an “anchor” example as a basis for comparison, a “positive” example of the same class as the anchor, and a “negative” example of a different class. The loss is then simply