AI / ML / LLM / Transformer Models Timeline Details

Viktor Garske @vemgar, Last update: Tue Dec 26 15:23:35 2023

Megatron-LM

[Graph: timeline axis 06/2017 → 09/2019 → 06/2021 → 08/2021 → 05/2023]
Attention / Transformers → Megatron-LM
Megatron-LM → GPT-NeoX
Megatron-LM → Mesh Transformer JAX
Megatron-LM → StarCoderBase
Type: Model, Architecture
Paper name: Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
Paper authors: Shoeybi et al.
Paper link: https://arxiv.org/abs/1909.08053
Publish date: 2019-09-17
Repository link: https://github.com/NVIDIA/Megatron-LM
Affiliation: NVIDIA