Kubnal Bridge

Techniques & Methods

Upstream Sampling

Upstream sampling (also called best-of-N sampling) generates N independent model completions for the same prompt, then selects the highest-scoring one according to a reward model or evaluation function. This trades compute for quality: cost grows roughly linearly with N, since every candidate must be both generated and scored.

It is used in RLHF pipelines and as an inference-time scaling strategy. Rather than updating the model's weights, upstream sampling improves output quality at inference time by drawing several samples from the model's output distribution and keeping the best one.
