Constrained decoding

Also called structured decoding, constrained decoding is a technique for restricting what a language model is allowed to produce. Instead of letting the model write any text, the system requires the output to fit a predefined structure (a fixed set of fields, a list of allowed values, or a strict format) and the model can only fill in the blanks within those limits.
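One common way to enforce such limits is to mask out every option that is not allowed before the model picks its output. The sketch below illustrates that idea for the "list of allowed values" case. It is a minimal illustration, not the method of any particular system: the vocabulary, the scores, and the names toy_logits and constrained_choice are all hypothetical stand-ins for a real model.

```python
import math

# Hypothetical toy vocabulary; a real model has tens of thousands of tokens.
VOCAB = ["yes", "no", "maybe", "banana"]


def toy_logits(prompt: str) -> list[float]:
    """Stand-in for a language model: returns fixed scores and ignores the prompt."""
    return [1.0, 0.5, 0.2, 3.0]  # left unconstrained, "banana" would win


def constrained_choice(prompt: str, allowed: set[str]) -> str:
    """Return the highest-scoring vocabulary item that is in the allowed set."""
    logits = toy_logits(prompt)
    # Disallowed items get a score of -inf so they can never be selected.
    masked = [
        score if token in allowed else -math.inf
        for token, score in zip(VOCAB, logits)
    ]
    if all(score == -math.inf for score in masked):
        raise ValueError("no allowed value is in the vocabulary")
    best = max(range(len(VOCAB)), key=lambda i: masked[i])
    return VOCAB[best]


# Unconstrained pick: simply the top-scoring item overall.
scores = toy_logits("Is this sentence grammatical?")
unconstrained = VOCAB[max(range(len(VOCAB)), key=lambda i: scores[i])]

# Constrained pick: only "yes" or "no" are permitted.
constrained = constrained_choice("Is this sentence grammatical?", {"yes", "no"})
```

In this toy setup the unconstrained pick is the off-topic top scorer, while the constrained pick is forced to be one of the permitted values. Real systems typically apply the same kind of mask at every generation step, with the allowed set derived from a schema or grammar.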

This is useful when free-form generation is risky. A general-purpose model asked to translate will happily invent a fluent-sounding answer whether or not it is correct (see Ojibwe Chat). Constraining the model instead means it can only choose among options that someone has vetted in advance (see Yaduha).

Supported by National Science Foundation Award 2542375.