If you want to get started with Llama, this is the definitive place. Just some of the areas covered:
Fine Tuning
Quantization
Prompting
Inferencing
Validation
Integration Guides
Code Llama
Integration with LangChain
Integration with LlamaIndex
This article explores dimensionality reduction, a valuable tool for machine learning practitioners who need to analyze vast, high-dimensional datasets. While t-SNE is a commonly used technique for visualization, its efficacy diminishes with large datasets, and mastering its application can be challenging.
UMAP, introduced by McInnes et al., offers several advantages over t-SNE, including faster runtimes and better preservation of a dataset’s global structure. The article delves into the theory behind UMAP, explaining how it works, how to use it effectively, and how its performance compares with t-SNE.
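As a rough illustration of the workflow the article describes, here is a minimal sketch using the umap-learn package; the data and parameter values are placeholders, not taken from the article.

```python
# Minimal UMAP sketch using the umap-learn package (`pip install umap-learn`);
# the dataset and hyperparameters below are illustrative placeholders.
import numpy as np
import umap

# Toy high-dimensional data standing in for a real dataset.
X = np.random.rand(1000, 50)

# n_neighbors balances local vs. global structure; min_dist controls how
# tightly points are packed together in the low-dimensional embedding.
reducer = umap.UMAP(n_neighbors=15, min_dist=0.1, n_components=2, random_state=42)
embedding = reducer.fit_transform(X)  # shape: (1000, 2)
```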
Summary:
https://docs.llamaindex.ai/en/stable/examples/agent/multi_document_agents.html
State-of-the-art large language models (LLMs) are pre-trained with billions of parameters. While pre-trained LLMs can perform many tasks, they can become much better once fine-tuned.
Thanks to LoRA, fine-tuning costs can be dramatically reduced. LoRA adds low-rank tensors, i.e., a small number of parameters (millions), on top of the frozen original parameters. Only the parameters in the added tensors are trained during fine-tuning.
LoRA still requires the model to be loaded in memory. To reduce the memory cost and speed up fine-tuning, a new approach proposes quantization-aware LoRA (QA-LoRA) fine-tuning.
In this article, I explain QA-LoRA and review its performance compared with previous work (especially QLoRA). I also show how to use QA-LoRA to fine-tune your own quantization-aware LoRA for Llama 2.
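For orientation, here is a minimal sketch of a plain LoRA setup with Hugging Face PEFT (not QA-LoRA itself, which the article covers); the model name and hyperparameters are illustrative assumptions.

```python
# Illustrative plain-LoRA setup with `transformers` + `peft`; model name and
# hyperparameters are assumptions, and this is not the QA-LoRA variant.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

# Low-rank adapter tensors are added on top of the frozen base weights;
# only these adapter parameters are updated during fine-tuning.
config = LoraConfig(
    r=16,                                   # rank of the added low-rank tensors
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, config)
model.print_trainable_parameters()  # typically a small fraction of the base model
```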
Meta-CoT: Generalizable Chain-of-Thought Prompting in Mixed-task Scenarios with Large Language Models
Meta-CoT is a generalizable CoT prompting method for mixed-task scenarios where the type of input question is unknown. It consists of three phases: (i) scenario identification, which categorizes the scenario of the input question; (ii) demonstration selection, which fetches the in-context learning (ICL) demonstrations for the categorized scenario; and (iii) answer derivation, which performs answer inference by feeding the LLM a prompt comprising the fetched ICL demonstrations and the input question.
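A rough sketch of those three phases in Python; the helper functions and demonstration store are hypothetical stand-ins, not code from the paper.

```python
# Rough sketch of the three Meta-CoT phases; `classify_scenario`, `DEMO_BANK`,
# and `call_llm` are hypothetical helpers, not part of any library or the paper.
def meta_cot(question: str) -> str:
    # (i) Scenario identification: categorize the input question.
    scenario = classify_scenario(question)   # e.g. "arithmetic", "commonsense"

    # (ii) Demonstration selection: fetch ICL demonstrations for that scenario.
    demonstrations = DEMO_BANK[scenario]

    # (iii) Answer derivation: prompt the LLM with demonstrations + question.
    prompt = "\n\n".join(demonstrations + [f"Q: {question}\nA: Let's think step by step."])
    return call_llm(prompt)
```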
Nixtla’s TimeGPT is a generative pre-trained forecasting model for time series data. TimeGPT can produce accurate forecasts for new time series without training, using only historical values as inputs. TimeGPT can be used across a plethora of tasks including demand forecasting, anomaly detection, financial forecasting, and more.
The TimeGPT model “reads” time series data much like the way humans read a sentence – from left to right. It looks at windows of past data, which we can think of as “tokens”, and predicts what comes next. This prediction is based on patterns the model identifies in past data and extrapolates into the future.
The API provides an interface to TimeGPT, allowing users to leverage its forecasting capabilities to predict future events. TimeGPT can also be used for other time series-related tasks, such as what-if scenarios, anomaly detection, and more.
https://nixtla.github.io/nixtla/docs/getting-started/getting_started_short.html
https://youtube.com/watch?v=POBcYr0sbcg&si=VD5odGEisjt2fzUi
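A minimal usage sketch, assuming the nixtla Python client; the column names, horizon, and API key are illustrative, and the linked getting-started docs have the authoritative interface.

```python
# Minimal sketch assuming the `nixtla` Python client (`pip install nixtla`);
# see the linked getting-started docs for the exact, current interface.
import pandas as pd
from nixtla import NixtlaClient

client = NixtlaClient(api_key="YOUR_API_KEY")  # placeholder key

# Historical values only: a timestamp column and the target series.
df = pd.DataFrame({
    "ds": pd.date_range("2023-01-01", periods=100, freq="D"),
    "y": range(100),
})

# Forecast the next 14 steps without any training on our side.
forecast = client.forecast(df=df, h=14, time_col="ds", target_col="y")
```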
DSPy is the framework for solving advanced tasks with language models (LMs) and retrieval models (RMs). DSPy unifies techniques for prompting and fine-tuning LMs — and approaches for reasoning, self-improvement, and augmentation with retrieval and tools. All of these are expressed through modules that compose and learn.
To make this possible:
The DSPy compiler bootstraps prompts and finetunes from minimal data without needing manual labels for the intermediate steps in your program. Instead of brittle "prompt engineering" with hacky string manipulation, you can explore a systematic space of modular and trainable pieces.
For complex tasks, DSPy can routinely teach powerful models like GPT-3.5 and local models like T5-base or Llama2-13b to be much more reliable. DSPy will compile the same program into different few-shot prompts and/or finetunes for each LM.
If you want to see DSPy in action, open our intro tutorial notebook.
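As a quick taste, here is a minimal DSPy module; the LM constructor and model name are assumptions that can vary across DSPy versions, so treat this as a sketch rather than the tutorial's exact code.

```python
# Minimal DSPy sketch; the LM constructor and model name may differ across
# DSPy versions (the intro tutorial notebook is the reference).
import dspy

lm = dspy.OpenAI(model="gpt-3.5-turbo")
dspy.settings.configure(lm=lm)

# A declarative module: the signature "question -> answer" describes the task;
# the compiler, not hand-written strings, decides the actual prompt.
qa = dspy.ChainOfThought("question -> answer")
prediction = qa(question="What are low-rank adapters used for?")
print(prediction.answer)
```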
https://amatriain.net/blog/hallucinations#advancedprompting
Ever been curious about the challenges of embedding large language models in products? A notable issue is 'hallucinations', where the model outputs misleading or fabricated information. This blog offers a guide to tackling these issues in user-facing products, giving a snapshot of current best practices.