AI / ML / LLM / Transformer Models Timeline Details
Viktor Garske (@vemgar), last update: Tue Dec 26 15:23:35 2023
DPO (Direct Preference Optimization)
[Timeline graph: RLHF (04/2022) -> DPO (05/2023)]
Type: Method
Paper name: Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Paper authors: Rafailov et al.
Paper link: https://arxiv.org/abs/2305.18290
Publish date: 2023-05-29
Affiliation: Stanford