If you want to get started with Llama, this is the definitive place. Just some of the areas covered:
Fine Tuning
Quantization
Prompting
Inferencing
Validation
Integration Guides
Code Llama
Integration with LangChain
Integration with LlamaIndex
This article explores dimensionality reduction, a valuable tool for machine learning practitioners who need to analyze vast, high-dimensional datasets. While t-SNE is a commonly used technique for visualization, its efficacy diminishes with large datasets, and mastering its application can be challenging.
UMAP, introduced by McInnes et al., offers several advantages over t-SNE, including faster runtimes and better preservation of a dataset’s global structure. The article delves into the theory behind UMAP, explaining how it works, how to use it effectively, and how its performance compares with t-SNE.
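As a rough illustration of the workflow the article describes, here is a minimal sketch using the umap-learn package; the data and parameter values are placeholders, not taken from the article.

```python
# Minimal UMAP sketch using the umap-learn package (`pip install umap-learn`);
# the dataset and hyperparameters below are illustrative placeholders.
import numpy as np
import umap

# Toy high-dimensional data standing in for a real dataset.
X = np.random.rand(1000, 50)

# n_neighbors balances local vs. global structure; min_dist controls how
# tightly points are packed together in the low-dimensional embedding.
reducer = umap.UMAP(n_neighbors=15, min_dist=0.1, n_components=2, random_state=42)
embedding = reducer.fit_transform(X)  # shape: (1000, 2)
```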
Summary:
https://docs.llamaindex.ai/en/stable/examples/agent/multi_document_agents.html
State-of-the-art large language models (LLMs) are pre-trained with billions of parameters. While pre-trained LLMs can perform many tasks, they can become much better once fine-tuned.
Thanks to LoRA, fine-tuning costs can be dramatically reduced. LoRA adds low-rank tensors, i.e., a small number of parameters (millions), on top of the frozen original parameters. Only the parameters in the added tensors are trained during fine-tuning.
LoRA still requires the model to be loaded in memory. To reduce the memory cost and speed up fine-tuning, a new approach proposes quantization-aware LoRA (QA-LoRA) fine-tuning.
In this article, I explain QA-LoRA and review its performance compared with previous work (especially QLoRA). I also show how to use QA-LoRA to fine-tune your own quantization-aware LoRA for Llama 2.
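For orientation, here is a minimal sketch of a plain LoRA setup with Hugging Face PEFT (not QA-LoRA itself, which the article covers); the model name and hyperparameters are illustrative assumptions.

```python
# Illustrative plain-LoRA setup with `transformers` + `peft`; model name and
# hyperparameters are assumptions, and this is not the QA-LoRA variant.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

# Low-rank adapter tensors are added on top of the frozen base weights;
# only these adapter parameters are updated during fine-tuning.
config = LoraConfig(
    r=16,                                   # rank of the added low-rank tensors
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, config)
model.print_trainable_parameters()  # typically a small fraction of the base model
```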
Meta-CoT: Generalizable Chain-of-Thought Prompting in Mixed-task Scenarios with Large Language Models
Meta-CoT is a generalizable CoT prompting method for mixed-task scenarios where the type of input question is unknown. It consists of three phases: (i) scenario identification, which categorizes the scenario of the input question; (ii) demonstration selection, which fetches the in-context learning (ICL) demonstrations for the categorized scenario; and (iii) answer derivation, which performs answer inference by feeding the LLM a prompt comprising the fetched ICL demonstrations and the input question.
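A rough sketch of those three phases in Python; the helper functions and demonstration store are hypothetical stand-ins, not code from the paper.

```python
# Rough sketch of the three Meta-CoT phases; `classify_scenario`, `DEMO_BANK`,
# and `call_llm` are hypothetical helpers, not part of any library or the paper.
def meta_cot(question: str) -> str:
    # (i) Scenario identification: categorize the input question.
    scenario = classify_scenario(question)   # e.g. "arithmetic", "commonsense"

    # (ii) Demonstration selection: fetch ICL demonstrations for that scenario.
    demonstrations = DEMO_BANK[scenario]

    # (iii) Answer derivation: prompt the LLM with demonstrations + question.
    prompt = "\n\n".join(demonstrations + [f"Q: {question}\nA: Let's think step by step."])
    return call_llm(prompt)
```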
Nixtla’s TimeGPT is a generative pre-trained forecasting model for time series data. TimeGPT can produce accurate forecasts for new time series without training, using only historical values as inputs. TimeGPT can be used across a plethora of tasks including demand forecasting, anomaly detection, financial forecasting, and more.
The TimeGPT model “reads” time series data much like the way humans read a sentence – from left to right. It looks at windows of past data, which we can think of as “tokens”, and predicts what comes next. This prediction is based on patterns the model identifies in past data and extrapolates into the future.
The API provides an interface to TimeGPT, allowing users to leverage its forecasting capabilities to predict future events. TimeGPT can also be used for other time series-related tasks, such as what-if scenarios, anomaly detection, and more.
https://nixtla.github.io/nixtla/docs/getting-started/getting_started_short.html
https://youtube.com/watch?v=POBcYr0sbcg&si=VD5odGEisjt2fzUi
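A minimal usage sketch, assuming the nixtla Python client; the column names, horizon, and API key are illustrative, and the linked getting-started docs have the authoritative interface.

```python
# Minimal sketch assuming the `nixtla` Python client (`pip install nixtla`);
# see the linked getting-started docs for the exact, current interface.
import pandas as pd
from nixtla import NixtlaClient

client = NixtlaClient(api_key="YOUR_API_KEY")  # placeholder key

# Historical values only: a timestamp column and the target series.
df = pd.DataFrame({
    "ds": pd.date_range("2023-01-01", periods=100, freq="D"),
    "y": range(100),
})

# Forecast the next 14 steps without any training on our side.
forecast = client.forecast(df=df, h=14, time_col="ds", target_col="y")
```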
DSPy is the framework for solving advanced tasks with language models (LMs) and retrieval models (RMs). DSPy unifies techniques for prompting and fine-tuning LMs — and approaches for reasoning, self-improvement, and augmentation with retrieval and tools. All of these are expressed through modules that compose and learn.
To make this possible:
The DSPy compiler bootstraps prompts and finetunes from minimal data without needing manual labels for the intermediate steps in your program. Instead of brittle "prompt engineering" with hacky string manipulation, you can explore a systematic space of modular and trainable pieces.
For complex tasks, DSPy can routinely teach powerful models like GPT-3.5 and local models like T5-base or Llama2-13b to be much more reliable. DSPy will compile the same program into different few-shot prompts and/or finetunes for each LM.
If you want to see DSPy in action, open our intro tutorial notebook.
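As a quick taste, here is a minimal DSPy module; the LM constructor and model name are assumptions that can vary across DSPy versions, so treat this as a sketch rather than the tutorial's exact code.

```python
# Minimal DSPy sketch; the LM constructor and model name may differ across
# DSPy versions (the intro tutorial notebook is the reference).
import dspy

lm = dspy.OpenAI(model="gpt-3.5-turbo")
dspy.settings.configure(lm=lm)

# A declarative module: the signature "question -> answer" describes the task;
# the compiler, not hand-written strings, decides the actual prompt.
qa = dspy.ChainOfThought("question -> answer")
prediction = qa(question="What are low-rank adapters used for?")
print(prediction.answer)
```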
https://amatriain.net/blog/hallucinations#advancedprompting
Ever been curious about the challenges of embedding large language models in products? A notable issue is 'hallucinations', where the model outputs misleading or fabricated information. This blog offers a guide to tackling these issues in user-facing products, giving a snapshot of current best practices.