Kubnal Bridge

Applications

InstructGPT

InstructGPT, developed by OpenAI and described in a 2022 paper, demonstrated that fine-tuning GPT-3 with reinforcement learning from human feedback (RLHF) improved its ability to follow diverse instructions, reduced harmful outputs, and made its responses more truthful. In human preference evaluations, outputs from the 1.3B-parameter InstructGPT model were preferred over those of the 175B-parameter GPT-3 base model, despite the smaller model having roughly 100 times fewer parameters.

InstructGPT established the training recipe of supervised fine-tuning (SFT) followed by RLHF, which has become standard for aligning LLMs. It directly preceded ChatGPT and influenced alignment approaches at Anthropic (Constitutional AI) and Google DeepMind.
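
To make the RLHF step more concrete: the reward model in this kind of pipeline is trained on human comparisons between pairs of model outputs, using a loss of the general form -log(sigmoid(r_chosen - r_rejected)). The snippet below is a minimal sketch of that loss for a single comparison, assuming the rewards are plain scalar floats. The function name and arguments are hypothetical and chosen for illustration; this is not OpenAI's implementation.

```python
import math


def pairwise_preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Loss for one human comparison: -log(sigmoid(r_chosen - r_rejected)).

    Illustrative only: a real reward model produces these scores from a
    neural network and averages the loss over many comparisons.
    """
    margin = reward_chosen - reward_rejected
    # -log(sigmoid(m)) equals log(1 + exp(-m)); branch on the sign of the
    # margin so that exp() never receives a large positive argument.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


if __name__ == "__main__":
    # Reward model agrees with the human label: small loss.
    print(pairwise_preference_loss(2.0, 0.0))
    # Reward model disagrees with the human label: larger loss.
    print(pairwise_preference_loss(0.0, 2.0))
```

The loss shrinks as the reward model scores the human-preferred output further above the rejected one, which is what pushes the learned reward toward human preferences before it is used as the optimization target in the reinforcement learning stage.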
