AI / ML / LLM / Transformer Models Timeline Details

Viktor Garske @vemgar, Last update: Tue Dec 26 15:23:35 2023

RLHF (Reinforcement Learning from Human Feedback)

Type

Training, Method

Paper name

Training language models to follow instructions with human feedback

Paper authors

Ouyang et al.

Paper link

https://arxiv.org/abs/2203.02155

Publish date

2022-03-04

Affiliation

OpenAI