The History of Open-Source LLMs: Early Days (Part One)

  • Language modeling research traces back to models like GPT and GPT-2, and to pre-transformer methods such as ULMFiT.
  • The release of GPT-3 marked the first surge in popularity for language models, demonstrating impressive few-shot performance through a combination of self-supervised pre-training and in-context learning.
  • GPT-3’s success led to the creation of various large language models (LLMs), including InstructGPT and ChatGPT, sparking widespread interest in generative AI.
  • Early LLMs often remained closed source, limiting researchers’ ability to understand and improve them.
  • Open-source variants of popular language models began to emerge gradually, although they initially lagged behind proprietary models in performance.
  • These early open-source models laid the groundwork for increased transparency in LLM research and inspired the development of more powerful successors such as Falcon and LLaMA-2.
  • This overview is the first part of a three-part series on the history of open-source language models, covering their beginnings, recent developments, and the use of imitation and alignment techniques to improve their performance.