The History of Open-Source LLMs: Early Days (Part One)

Language modeling research traces back to models like GPT and GPT-2, and to pre-transformer methods such as ULMFiT.
The GPT-3 paper drove the first surge in popularity by showing that a model trained with self-supervised pre-training can achieve impressive few-shot performance through in-context learning.
GPT-3’s success led to the creation of various large language models (LLMs), including InstructGPT and ChatGPT, sparking widespread interest in generative AI.
Many early LLMs remained closed source, which limited researchers’ ability to understand and improve them.
Open-source variants of popular language models began to emerge gradually, although they initially lagged behind proprietary models in performance.
These early open-source models laid the groundwork for increased transparency in LLM research and inspired the development of more capable successors such as Falcon and LLaMA-2.
This overview is the first part of a three-part series on the history of open-source language models, which covers their beginnings, recent developments, and the use of imitation and alignment techniques to improve their performance.

Deeplearning.fr