AI / ML / LLM / Transformer Models Timeline Details

Viktor Garske @vemgar, Last update: Sun Apr 30 23:03:44 2023

RLHF (Reinforcement Learning from Human Feedback)


Graph: timeline 01/2022 → 04/2022 → 04/2023; RLHF → InstructGPT, RLHF → StableVicuna
Type: Training, Method
Paper name: Training language models to follow instructions with human feedback
Paper authors: Ouyang et al.
Paper link: https://arxiv.org/abs/2203.02155
Publish date: 2022-04-04
Affiliation: OpenAI