Techniques & Methods

Overuse Penalty

An overuse penalty (also called a repetition penalty) lowers the probability of tokens that have already appeared, especially frequently, in the generated output, so the model is less likely to loop or produce monotonous text. It is applied at inference time, during decoding, by adjusting the model's token scores before the next token is selected; the model's weights are not changed.
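As a rough illustration of the idea, the sketch below subtracts a count-based penalty from each token's logit before the softmax. It is only one common variant (a frequency-style penalty); the function names, the toy logits, and the penalty value are assumptions made for this example, not the API of any particular library.

```python
import math
from collections import Counter


def apply_frequency_penalty(logits, generated_ids, penalty):
    """Return a copy of `logits` with `penalty * count` subtracted
    for every token id that already appears in `generated_ids`.

    Hypothetical helper for illustration only.
    """
    counts = Counter(generated_ids)
    return [logit - penalty * counts[i] for i, logit in enumerate(logits)]


def softmax(logits):
    """Convert logits to probabilities (numerically stable form)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy vocabulary of three tokens; token 0 has been generated twice, token 1 once.
logits = [2.0, 1.0, 0.5]
history = [0, 0, 1]

penalized = apply_frequency_penalty(logits, history, penalty=0.5)

print(softmax(logits))     # probabilities before the penalty
print(softmax(penalized))  # token 0 is now less likely than before
```

Because the penalty grows with each repeat, a token that keeps being chosen becomes steadily less likely, while tokens that have not yet appeared are left untouched.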

Most LLM inference APIs expose a repetition penalty as a standard parameter. Choosing its value is a trade-off: set too high, it suppresses even necessary repeated words and can make the output incoherent; set too low, it allows degenerate, repetitive loops.
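The trade-off can be seen in a deliberately simplified sketch. Here a hypothetical "model" returns the same logits at every step, and greedy decoding picks the highest-scoring token after a count-based penalty is subtracted. The logits, the penalty values, and the function name are all assumptions chosen for illustration; real models and APIs behave differently in detail.

```python
def greedy_decode(base_logits, steps, penalty):
    """Greedy decoding over fixed toy logits with a count-based penalty.

    Hypothetical helper: a real model would recompute logits from context.
    """
    counts = [0] * len(base_logits)
    output = []
    for _ in range(steps):
        # Penalize each token in proportion to how often it was already chosen.
        adjusted = [l - penalty * c for l, c in zip(base_logits, counts)]
        token = max(range(len(adjusted)), key=adjusted.__getitem__)
        output.append(token)
        counts[token] += 1
    return output


# Token 0 is the model's favorite; token 3 is one it considers very unlikely.
base_logits = [3.0, 2.6, 1.1, -2.0]

no_penalty = greedy_decode(base_logits, steps=6, penalty=0.0)
moderate = greedy_decode(base_logits, steps=6, penalty=1.0)
very_high = greedy_decode(base_logits, steps=6, penalty=10.0)

print(no_penalty)  # the same token on every step: a degenerate loop
print(moderate)    # varies among the plausible tokens
print(very_high)   # soon forced onto the very unlikely token 3
```

In this toy setting, no penalty produces a loop, a moderate penalty spreads choices across the likely tokens, and a very high penalty pushes the decoder onto a token the model scored as implausible, which is the analogue of incoherent output.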

Authority Links

Text Repetition in NLG — arXiv

Research on preventing repetition in neural text generation.

Hugging Face — Generation Params

Documentation on repetition penalty and generation parameters.

Related Terms

Techniques & Methods

Sequence Generation

Process in which a model produces a sequence of outputs, such as words or tokens, based on learned patterns.

Techniques & Methods

Decoding Rules

Guidelines and algorithms that control how language models translate internal representations into output tokens.

Techniques & Methods

Beam Search

Search algorithm that maintains multiple candidate sequences to find high-quality generated outputs.

Techniques & Methods

Generation

Producing new text, code, or content based on learned patterns and a given input prompt.
