AI / ML / LLM / Transformer Models Timeline Details

Viktor Garske @vemgar, Last update: Sun Apr 30 23:03:44 2023

RLHF (Reinforcement Learning from Human Feedback)

Type

Training, Method

Paper name

Training language models to follow instructions with human feedback

Paper authors

Ouyang et al.

Paper link

https://arxiv.org/abs/2203.02155

Publish date

2022-03-04

Affiliation

OpenAI