Softmax

Explore how logits become a probability distribution.

$$\sigma_i(z) = \frac{e^{z_i}}{\sum_{j=1}^{K} e^{z_j}}$$

where z = (z_1, ..., z_K) is the vector of input scores (logits), K is the number of classes, and i is the class index

Logits to Probability
Adjust raw class scores and watch the distribution move.
[Chart: raw score (blue) and probability (green) for each class. Class 1: score 4.0, probability 84.4%; Class 2: score 2.0, probability 11.4%; Class 3: score 1.0, probability 4.2%.]
Controls
Class 1: exp(4.00) = 54.60 · Score 4.00 · P 84.4%
Class 2: exp(2.00) = 7.39 · Score 2.00 · P 11.4%
Class 3: exp(1.00) = 2.72 · Score 1.00 · P 4.2%
Calculation
denominator = sum_j exp(z_j) = 64.705
P(class 1) = exp(4.00) / 64.705 = 54.598 / 64.705 = 0.844
P(class 2) = exp(2.00) / 64.705 = 7.389 / 64.705 = 0.114
P(class 3) = exp(1.00) / 64.705 = 2.718 / 64.705 = 0.042
sum P = 100.0%
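
The arithmetic above can be reproduced in a few lines of Python. This is a minimal sketch using only the standard library; the variable names (`scores`, `exps`, `denominator`, `probs`) are illustrative choices, not names taken from the visual.

```python
import math

# Scores from the worked example above (three classes).
scores = [4.0, 2.0, 1.0]

# Step 1: exponentiate each score.
exps = [math.exp(z) for z in scores]

# Step 2: sum the exponentials to get the shared denominator.
denominator = sum(exps)

# Step 3: divide each exponential by the denominator.
probs = [e / denominator for e in exps]

for k, (z, e, p) in enumerate(zip(scores, exps, probs), start=1):
    print(f"class {k}: exp({z:.2f}) = {e:.3f}, P = {p:.3f}")
print(f"denominator = {denominator:.3f}, sum P = {sum(probs):.3f}")
```

Rounded to three decimals, the printed values should match the figures shown in the calculation.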
Technical Notes
Notes carried over from the original visual, tuned for the new site.

Key Properties of Softmax

The softmax function converts raw scores into a probability distribution, ensuring all outputs are between 0 and 1 and sum to 1.

  • All output probabilities are strictly positive, because exp(z) > 0 for every real score z, including negative ones.
  • The sum of all probabilities always equals 1, creating a valid distribution over the classes.
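
Both properties can be spot-checked numerically. The sketch below is one possible check, not a proof: the `softmax` helper is written here for illustration, and it subtracts the maximum score before exponentiating, a common numerical-stability trick that is an addition to the formula on this page. The shift cancels in the ratio, so the result is unchanged.

```python
import math
import random


def softmax(scores):
    """Softmax over a list of floats (illustrative helper).

    Subtracting the maximum score is an added stability measure, not part
    of the formula on the page; it does not change the output.
    """
    m = max(scores)
    exps = [math.exp(z - m) for z in scores]
    total = sum(exps)
    return [e / total for e in exps]


random.seed(0)
for _ in range(1000):
    # Scores are kept in a moderate range so that no probability
    # underflows to exactly 0.0 in floating point.
    scores = [random.uniform(-10.0, 10.0) for _ in range(5)]
    probs = softmax(scores)
    assert all(0.0 < p < 1.0 for p in probs)             # positive, below 1
    assert math.isclose(sum(probs), 1.0, rel_tol=1e-12)  # sums to 1
```

In exact arithmetic the properties hold for any real scores; in floating point, a very large gap between scores can round a small probability down to 0.0, which is why the check restricts the score range.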

Score Sensitivity

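Larger differences between input scores produce more sharply peaked probability distributions, because each probability depends exponentially on its score's gap from the others. In the example above, a 2-point lead for Class 1 over Class 2 becomes a probability ratio of exp(2) ≈ 7.4 (84.4% vs. 11.4%). This lets a model express strong preferences when there are clear distinctions between classes.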
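
One way to see this sensitivity is to stretch or shrink the gaps between the example scores and watch the top probability move. The sketch below assumes a plain `softmax` helper and a hypothetical list of scaling factors, `scales`, chosen only for illustration. It also checks that adding the same constant to every score leaves the output unchanged, since only the differences matter.

```python
import math


def softmax(scores):
    """Direct implementation of the softmax formula (illustrative helper)."""
    exps = [math.exp(z) for z in scores]
    total = sum(exps)
    return [e / total for e in exps]


base = [4.0, 2.0, 1.0]    # scores from the example on this page
scales = [0.5, 1.0, 2.0]  # hypothetical factors that shrink or stretch the gaps

top_probs = []
for s in scales:
    probs = softmax([s * z for z in base])
    top_probs.append(probs[0])
    print(f"scale {s}: " + ", ".join(f"{p:.3f}" for p in probs))

# Shifting every score by the same constant changes nothing,
# because only the differences between scores enter the ratio.
shifted = softmax([z + 10.0 for z in base])
original = softmax(base)
print(all(math.isclose(a, b, rel_tol=1e-9) for a, b in zip(shifted, original)))
```

As the gaps widen, the probability of the leading class moves toward 1 and the others shrink toward 0.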

Historical Terminology

The more precise name is softargmax (sometimes written soft(arg)max), because the function is a smooth, differentiable approximation of argmax rather than of max. Softmax nonetheless became the standard shorthand in machine learning frameworks and literature.