GPT in 60 Lines of NumPy

In this post, we implement a GPT from scratch in just 60 lines of NumPy. We’ll then load the trained GPT-2 model weights released by OpenAI into our implementation and generate some text.


  • This post assumes familiarity with Python, NumPy, and some basic experience training neural networks.
  • This implementation is missing tons of features on purpose to keep it as simple as possible while remaining complete. The goal is to provide a simple yet complete technical introduction to the GPT as an educational tool.
  • The GPT architecture is just one small part of what makes LLMs what they are today.[1]
  • All the code for this blog post can be found at
  • Hacker News thread
  • Chinese translation
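
To give a flavor of what such an implementation involves, here is a minimal sketch of one building block, causal self-attention, in NumPy. This is not the post's code: the function names, array shapes, and random inputs are assumptions chosen for illustration.

```python
import numpy as np

def softmax(x):
    # Subtract the row max before exponentiating for numerical stability.
    exp_x = np.exp(x - np.max(x, axis=-1, keepdims=True))
    return exp_x / np.sum(exp_x, axis=-1, keepdims=True)

def causal_self_attention(q, k, v):
    # q, k, v: arrays of shape [n_seq, d_head] (assumed shapes).
    n_seq, d_head = q.shape
    # Large negative values above the diagonal stop a position
    # from attending to positions that come after it.
    mask = (1 - np.tri(n_seq)) * -1e10
    scores = q @ k.T / np.sqrt(d_head) + mask
    return softmax(scores) @ v

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))  # 4 tokens, 8 features each (toy input)
out = causal_self_attention(x, x, x)
```

Because of the mask, the first position can only attend to itself, so its output row equals its own value row.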

Text splitting

Large language models (LLMs) can be used for many tasks, but their limited context size is often smaller than the documents you want to process. To work with longer documents, you usually have to split the text into chunks that fit within this context size.

This crate provides methods for splitting longer pieces of text into smaller chunks, aiming to maximize a desired chunk size, but still splitting at semantically sensible boundaries whenever possible.
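
The idea of filling chunks up to a size limit while preferring coarse boundaries can be sketched in a few lines of Python. This is a simplified illustration, not the crate's algorithm or API: the function name `split_text`, the separator list, and the character-based size limit are all assumptions.

```python
def split_text(text, max_chars, separators=("\n\n", "\n", ". ", " ")):
    """Split text into chunks of at most max_chars characters,
    trying coarse boundaries (paragraphs) before finer ones (words)."""
    if len(text) <= max_chars:
        return [text] if text else []
    for i, sep in enumerate(separators):
        if sep not in text:
            continue
        parts = text.split(sep)
        # Keep each separator attached to the piece before it so no text is lost.
        pieces = [p + sep for p in parts[:-1]] + [parts[-1]]
        chunks, current = [], ""
        for piece in pieces:
            if len(current) + len(piece) <= max_chars:
                current += piece  # greedily fill the current chunk
                continue
            if current:
                chunks.append(current)
                current = ""
            if len(piece) <= max_chars:
                current = piece
            else:
                # The piece alone is too large: retry with finer separators.
                chunks.extend(split_text(piece, max_chars, separators[i + 1:]))
        if current:
            chunks.append(current)
        return chunks
    # No separator left: fall back to a hard split on character count.
    return [text[j:j + max_chars] for j in range(0, len(text), max_chars)]

text = ("Large language models have a limited context size.\n\n"
        "Documents are often longer than that. They need to be split into chunks.")
chunks = split_text(text, max_chars=60)
```

Here the first paragraph fits in one chunk, while the second is too long and gets split at the sentence boundary. Joining the chunks back together reproduces the original text.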

Levels Of Text Splitting

Semantic text splitting library

Chunks Vizualizer

Renumics Spotlight

Spotlight helps you to understand unstructured datasets fast. You can create interactive visualizations from your dataframe with just a few lines of code. You can also leverage data enrichments (e.g. embeddings, prediction, uncertainties) to identify critical clusters in your data.

Revolutionizing AI Reading Comprehension: ReadAgent’s Breakthrough in Handling Documents with 20 Million Tokens

  • Introduction to ReadAgent by Google DeepMind
    • Development of ReadAgent, an AI capable of understanding long texts beyond the limits of its language model.
    • Utilizes a human-like reading strategy to comprehend complex documents.
  • Challenges Faced by Language Models
    • Context length limitation: Fixed token processing capacity leading to performance decline.
    • Ineffective context usage: Decreased comprehension with increasing text length.
  • Features of ReadAgent
    • Mimics human reading by forming and using “gist memories” of texts.
    • Breaks down texts into smaller “episodes” and generates gist memories for each.
    • Looks up relevant episodes when needed for answering questions.
  • Performance Enhancements
    • Capable of understanding documents “20 times longer” than its base language model.
    • Shows improved performance on long document question answering datasets:
      • QuALITY: Accuracy improved from 85.8% to 86.9%.
      • NarrativeQA: Rating increased by 13-32% over baselines.
      • QMSum: Rating improved from 44.96% to 49.58%.
  • Potential Applications
    • Legal contract review, scientific literature analysis, customer support, financial report summarization, automated online course creation.
    • Indicates the future potential of AI in mastering lengthy real-world documents through human-like reading strategies.
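
The episode, gist, and look-up steps listed under the features can be sketched as a toy pipeline. In ReadAgent each step is carried out by prompting a language model; the stand-ins below (`toy_gist` truncates text, `look_up` counts shared words) are hypothetical placeholders that only show how the pieces fit together.

```python
def make_episodes(paragraphs, max_words=10):
    """Group consecutive paragraphs into episodes of roughly max_words words."""
    episodes, current, count = [], [], 0
    for p in paragraphs:
        n = len(p.split())
        if current and count + n > max_words:
            episodes.append(" ".join(current))
            current, count = [], 0
        current.append(p)
        count += n
    if current:
        episodes.append(" ".join(current))
    return episodes

def toy_gist(episode, n_words=8):
    # Placeholder for an LLM summarization call: keep the first few words.
    return " ".join(episode.split()[:n_words])

def look_up(question, gists, top_k=1):
    # Placeholder for the LLM deciding which episodes to re-read:
    # rank gists by the number of words they share with the question.
    q = set(question.lower().split())
    ranked = sorted(range(len(gists)),
                    key=lambda i: len(q & set(gists[i].lower().split())),
                    reverse=True)
    return ranked[:top_k]

paragraphs = [
    "The contract starts on the first of March.",
    "Payment is due within thirty days of each invoice.",
    "Either party may terminate with written notice.",
]
episodes = make_episodes(paragraphs)
gists = [toy_gist(e) for e in episodes]
relevant = look_up("When is payment due", gists)
# The full text of the selected episodes, plus all gists, would then be
# handed to the model to answer the question.
context = [episodes[i] for i in relevant]
```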

Posted in LLM

DoRA: Weight-Decomposed Low-Rank Adaptation

  • Objective Exploration: Investigates the disparities between full fine-tuning (FT) and LoRA through a novel weight decomposition analysis.
  • Innovative Method: Introduces Weight-Decomposed Low-Rank Adaptation (DoRA), which splits pre-trained weights into magnitude and direction components for fine-tuning.
  • Strategic Approach: Employs LoRA for directional updates, significantly reducing the number of trainable parameters.
  • Enhanced Performance: DoRA improves the learning capacity and training stability of LoRA without adding inference cost.
  • Proven Superiority: Demonstrates that DoRA outperforms LoRA in fine-tuning LLaMA, LLaVA, and VL-BART on tasks like commonsense reasoning, visual instruction tuning, and image/video-text understanding.
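
The magnitude/direction split described above can be sketched in NumPy. This is a minimal illustration under assumptions, not the authors' code: the name `dora_weight`, the toy dimensions, and the random matrices are made up, and the column-wise norm follows the decomposition as it is usually written, W' = m · (W0 + BA) / ‖W0 + BA‖.

```python
import numpy as np

def dora_weight(W0, m, B, A):
    # Direction: frozen pre-trained weight plus the low-rank LoRA update B @ A.
    V = W0 + B @ A
    # Normalize each column to unit length, then rescale by the magnitude m.
    col_norm = np.linalg.norm(V, axis=0, keepdims=True)
    return m * V / col_norm

rng = np.random.default_rng(0)
d_out, d_in, r = 6, 4, 2                        # toy sizes (assumed)
W0 = rng.normal(size=(d_out, d_in))             # stands in for pre-trained weights
m = np.linalg.norm(W0, axis=0, keepdims=True)   # magnitude, initialized from W0
A = rng.normal(size=(r, d_in))
B = np.zeros((d_out, r))                        # zero init: no change at the start

W_start = dora_weight(W0, m, B, A)              # equals W0 before any training
B_trained = rng.normal(size=(d_out, r))         # pretend B has been updated
W_tuned = dora_weight(W0, m, B_trained, A)
```

With B initialized to zero the merged weight equals the pre-trained one, and after an update the column norms are still controlled by m alone, so the low-rank matrices only change direction. The merged matrix can be computed once after training, which is why no extra cost appears at inference time.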


Bunkatopics is a package designed for Data Cleaning, Topic Modeling Visualization and Frame Analysis. Its primary goal is to assist developers in gaining insights from unstructured data, potentially facilitating data cleaning and optimizing LLMs through fine-tuning processes. Bunkatopics is constructed using well-known libraries like langchain, chroma, and transformers, enabling seamless integration into various environments.


Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs

LoRAX (LoRA eXchange) is a framework that allows users to serve thousands of fine-tuned models on a single GPU, dramatically reducing the cost of serving without compromising on throughput or latency.


LiPO: Listwise Preference Optimization through Learning-to-Rank

  • Innovative Framework: LiPO treats language model alignment as a listwise ranking problem.
  • Cutting-Edge Techniques: Uses learning-to-rank (LTR) algorithms for a more refined optimization process.
  • Superior Performance: The LiPO-λ method surpasses traditional methods in aligning models with human preferences.
  • Enhanced Learning Efficiency: Offers a more effective learning paradigm from ranked response lists.
  • Scalable Solution: Shows promise for scaling up to larger language model policies across various applications.
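
To make "listwise" concrete, here is a sketch of a generic listwise ranking loss (a ListMLE-style negative log-likelihood) in plain Python. It is not the LiPO-λ objective. It only illustrates how a whole ranked list of responses, rather than a single pair, contributes to one loss value. The function name and the example scores are assumptions.

```python
import math

def listmle_loss(scores, labels):
    """Negative log-likelihood of the label-induced ordering
    under a Plackett-Luce model over the scores."""
    # Sort the model scores by label, best response first.
    order = sorted(range(len(scores)), key=lambda i: labels[i], reverse=True)
    s = [scores[i] for i in order]
    loss = 0.0
    for i in range(len(s)):
        tail = s[i:]
        mx = max(tail)
        # Stable log-sum-exp over the items not yet placed.
        lse = mx + math.log(sum(math.exp(x - mx) for x in tail))
        loss += lse - s[i]
    return loss

labels = [1.0, 0.5, 0.0]                           # preference labels for three responses
agree = listmle_loss([3.0, 2.0, 1.0], labels)      # scores match the ranking
disagree = listmle_loss([1.0, 2.0, 3.0], labels)   # scores reverse the ranking
```

The loss is lower when the model's scores order the responses the same way the labels do, which is the signal an optimizer would follow.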

PyOD, a versatile Python library for detecting anomalies in multivariate data.

Whether you’re tackling a small-scale project or large datasets, PyOD offers a range of algorithms to suit your needs.