The History of Open-Source LLMs: Better Base Models (Part Two)

https://cameronrwolfe.substack.com/p/the-history-of-open-source-llms-better

  • Value of Open-source LLM Research: Aims to democratize influential technology; despite initial struggles and criticism, open-source LLMs gained popularity and significance.
  • Early Challenges: Initial open-source LLMs performed poorly and faced criticism, posing difficulties for advancement.
  • Transformative Research Line: Focuses on enhancing open-source LLMs, leading to high-performing pre-trained models accessible to all.
  • Significance of High-Performing Models: Creation of powerful, cost-effective pre-trained LLMs revolutionized research accessibility.
  • Series Overview: Part two of a three-part series on open-source LLM history. The first part explored initial open-source LLM attempts.
  • Study Focus: This overview delves into the most popular open-source base models, emphasizing pre-trained models not yet fine-tuned or aligned.
  • Future Exploration: Subsequent installment will discuss fine-tuning and alignment of models for diverse practical applications.

Practical Prompt Engineering

https://cameronrwolfe.substack.com/p/practical-prompt-engineering-part

  • Prompt engineering: An empirical science focused on optimizing LLM (Large Language Model) performance through various prompting strategies.
  • Aims to understand prompting mechanics and employs techniques to enhance LLM capabilities.
  • Zero/few-shot learning: A fundamental technique where LLMs perform tasks with minimal or no training examples, showcasing their remarkable adaptability.
  • Instruction prompting: Another vital technique that places explicit instructions in the prompt to guide LLM behavior (a minimal sketch of both techniques follows this list).
  • Overview intends to impart practical insights and strategies for effective prompt engineering and LLM utilization.
  • Provides actionable tricks and takeaways for prompt engineers and LLM practitioners to enhance their effectiveness.
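
As a concrete illustration of the two techniques summarized above, here is a minimal sketch of a hand-built few-shot prompt and an instruction prompt. The `call_llm` function and the sentiment-classification examples are hypothetical placeholders (not from the article); substitute your own LLM client and task.

```python
# Minimal sketch of two prompting strategies: few-shot and instruction prompting.
# `call_llm` is a hypothetical stand-in for whatever LLM client you use.

def call_llm(prompt: str) -> str:
    raise NotImplementedError("Replace with your LLM client of choice.")

# --- Few-shot prompting: show the model a handful of solved examples ---
few_shot_prompt = """Classify the sentiment of each review as Positive or Negative.

Review: "The battery lasts all day." Sentiment: Positive
Review: "It broke after a week." Sentiment: Negative
Review: "Setup took five minutes and everything just worked." Sentiment:"""

# --- Instruction prompting: state the task and constraints explicitly ---
# (With no examples included, this is also a zero-shot prompt.)
instruction_prompt = (
    "You are a sentiment classifier. Read the review below and answer "
    "with exactly one word, either 'Positive' or 'Negative'.\n\n"
    'Review: "Setup took five minutes and everything just worked."'
)

# print(call_llm(few_shot_prompt))
# print(call_llm(instruction_prompt))
```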

The History of Open-Source LLMs: Early Days (Part One)

https://cameronrwolfe.substack.com/p/the-history-of-open-source-llms-early

  • Language modeling research traces back to models like GPT, GPT-2, and pre-transformer methods such as ULMFit.
  • The proposal of GPT-3 marked the initial rise in LLM popularity by showcasing impressive few-shot learning enabled by self-supervised pre-training and in-context learning.
  • The recognition of GPT-3 led to the creation of various large language models (LLMs), including InstructGPT and ChatGPT, sparking widespread interest in generative AI.
  • Early LLMs often remained closed source, limiting researchers’ understanding and improvement of their workings.
  • Open-source variants of popular language models began to emerge gradually, although they initially lagged behind proprietary models in performance.
  • These early open-source models laid the groundwork for increased transparency in LLM research and inspired the development of more potent subsequent models like Falcon and LLaMA-2.
  • The overview is part of a three-part series that delves into the history of open-source language models, exploring their beginnings, recent developments, and the application of imitation and alignment techniques to enhance their performance.

Yellowbrick: Machine Learning Visualization

https://www.scikit-yb.org/en/latest/

Feature Visualization

Classification Visualization

Regression Visualization

Clustering Visualization

Model Selection Visualization

Target Visualization

  • Balanced Binning Reference: generate a histogram with vertical lines showing the recommended value point to bin the data into evenly distributed bins
  • Class Balance: see how the distribution of classes affects the model
  • Feature Correlation: display the correlation between features and dependent variables
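
The sketch below shows how these three target visualizers are typically invoked, assuming Yellowbrick 1.0 or later (which exposes `show()`); the synthetic dataset stands in for your own feature matrix `X` and target `y`.

```python
# Sketch of Yellowbrick's target visualizers on synthetic data.
from sklearn.datasets import make_classification
from yellowbrick.target import (
    BalancedBinningReference,
    ClassBalance,
    FeatureCorrelation,
)

X, y = make_classification(
    n_samples=500, n_features=6, n_informative=4, random_state=42
)

# Class Balance: bar chart of how many samples fall in each class.
viz = ClassBalance()
viz.fit(y)
viz.show()

# Feature Correlation: correlation of each feature with the target.
viz = FeatureCorrelation()
viz.fit(X, y)
viz.show()

# Balanced Binning Reference: histogram of a continuous variable with
# vertical lines marking suggested bin edges (using the first feature
# here purely for illustration).
viz = BalancedBinningReference(bins=4)
viz.fit(X[:, 0])
viz.show()
```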

Text Visualization

TabTransformer: Tabular Data Modeling Using Contextual Embeddings

The main idea in the paper is that the performance of a regular multi-layer perceptron (MLP) can be significantly improved if Transformers are used to transform regular categorical embeddings into contextual ones.

The TabTransformer is built upon self-attention-based Transformers. The Transformer layers transform the embeddings of categorical features into robust contextual embeddings to achieve higher prediction accuracy.
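
The PyTorch sketch below illustrates this architecture under stated assumptions: the class name, embedding size, head/layer counts, and MLP shape are illustrative choices, not the paper's exact configuration. Categorical columns are embedded, contextualized by a Transformer encoder, concatenated with the continuous features, and passed to an MLP head.

```python
import torch
import torch.nn as nn


class TabTransformerSketch(nn.Module):
    """Minimal sketch of the TabTransformer idea: Transformer layers turn
    per-column categorical embeddings into contextual embeddings that feed
    an MLP head. Hyperparameters here are illustrative only."""

    def __init__(self, cardinalities, n_continuous, d_model=32,
                 n_heads=8, n_layers=6, n_classes=2):
        super().__init__()
        # One embedding table per categorical column.
        self.embeddings = nn.ModuleList(
            [nn.Embedding(card, d_model) for card in cardinalities]
        )
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, batch_first=True
        )
        self.transformer = nn.TransformerEncoder(encoder_layer, n_layers)
        # Continuous features are layer-normalized before concatenation.
        self.norm = nn.LayerNorm(n_continuous)
        mlp_in = d_model * len(cardinalities) + n_continuous
        self.mlp = nn.Sequential(
            nn.Linear(mlp_in, 4 * mlp_in), nn.ReLU(),
            nn.Linear(4 * mlp_in, n_classes),
        )

    def forward(self, x_categ, x_cont):
        # x_categ: (batch, n_categ) integer codes; x_cont: (batch, n_cont).
        tokens = torch.stack(
            [emb(x_categ[:, i]) for i, emb in enumerate(self.embeddings)],
            dim=1,
        )                                        # (batch, n_categ, d_model)
        contextual = self.transformer(tokens)    # contextual embeddings
        flat = contextual.flatten(start_dim=1)   # (batch, n_categ * d_model)
        return self.mlp(torch.cat([flat, self.norm(x_cont)], dim=1))


# Tiny usage example with fake data: 3 categorical columns, 3 continuous.
model = TabTransformerSketch(cardinalities=[10, 5, 7], n_continuous=3)
logits = model(torch.randint(0, 5, (8, 3)), torch.randn(8, 3))
```

Note that, as in the paper, the continuous features bypass the Transformer and are only layer-normalized before being concatenated with the contextual categorical embeddings, which `self.norm` mirrors above.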