Open Source LLM Tools
If you are looking for open-source LLM tools, this resource is a good starting point.
It groups tools into categories such as tutorials, AI engineering, and applications, and it also shows GitHub star counts.
Transforming RAG with LlamaIndex Multi-Agent System and Qdrant
Retrieval-Augmented Generation (RAG) systems have evolved significantly. Early RAG systems faced numerous limitations, but more sophisticated RAG applications have since emerged. Techniques such as Self-RAG, hybrid search, alternative prompting and chunking strategies, and Agentic RAG have addressed many of those initial limitations.
High-efficiency production-scale entity relationship extraction
Outperforming Claude 3.5 Sonnet with Phi-3-mini-4k for graph entity relationship extraction tasks
https://huggingface.co/spaces/EmergentMethods/Phi-3-mini-instruct-graph
PDF-Extract-Kit is a comprehensive toolkit for high-quality PDF content extraction, including layout detection, formula detection, formula recognition, and OCR.
PDF documents contain a wealth of knowledge, yet extracting high-quality content from PDFs is not an easy task. To address this, we have broken down the task of PDF content extraction into several components:
Layout detection: regions such as images, tables, titles, text, etc.;
Formula detection: inline formulas and isolated formulas;
https://github.com/opendatalab/PDF-Extract-Kit
https://www.perplexity.ai/search/look-at-this-github-https-gith-8ZVtYO.2SA6_q5Vg.VXy.g
BERTopic generates document embeddings with pre-trained transformer-based language models, clusters these embeddings, and finally generates topic representations with a class-based TF-IDF (c-TF-IDF) procedure.
https://ritvik19.medium.com/papers-explained-193-bertopic-f9aec10cd5a6
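The last step, class-based TF-IDF, can be illustrated with a small sketch. This is not BERTopic's own code: it assumes the cluster labels are already known (in BERTopic they would come from clustering the embeddings), uses naive whitespace tokenization and raw term counts, and the names `class_based_tfidf` and `top_terms` are hypothetical. The weight of a term in a cluster is its frequency in that cluster times log(1 + A / f), where A is the average number of words per cluster and f is the term's frequency across all clusters.

```python
import math
from collections import Counter


def class_based_tfidf(docs, labels):
    """Return {cluster_label: {term: weight}} using a c-TF-IDF style score."""
    # Treat all documents of one cluster as a single "class document".
    class_counts = {}
    for doc, label in zip(docs, labels):
        class_counts.setdefault(label, Counter()).update(doc.lower().split())

    # Term frequency across all clusters, and average word count per cluster.
    total_counts = Counter()
    for counts in class_counts.values():
        total_counts.update(counts)
    avg_words = sum(total_counts.values()) / len(class_counts)

    # weight(t, c) = tf(t, c) * log(1 + avg_words / tf(t))
    return {
        label: {
            term: tf * math.log(1 + avg_words / total_counts[term])
            for term, tf in counts.items()
        }
        for label, counts in class_counts.items()
    }


def top_terms(weights, label, n=3):
    """Highest-weighted terms for one cluster (ties broken alphabetically)."""
    ranked = sorted(weights[label].items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:n]]


# Toy data: the cluster labels are hard-coded here for illustration only.
docs = [
    "the cat chased the mouse",
    "the cat slept",
    "the stock market fell",
    "the market rallied",
]
labels = [0, 0, 1, 1]
weights = class_based_tfidf(docs, labels)
print(top_terms(weights, 0), top_terms(weights, 1))
```

A word like "the" is frequent in every cluster, so its logarithmic factor is small and it ranks below terms that are concentrated in a single cluster.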
Self-RAG is another form of Retrieval Augmented Generation (RAG). Unlike other RAG retrieval strategies, it doesn’t enhance a specific module within the RAG process. Instead, it optimizes various modules within the RAG framework to improve the overall RAG process. If you’re unfamiliar with Self-RAG or have only heard its name, join me today to understand the implementation principles of Self-RAG and better grasp its details through code.
https://ai.gopubby.com/advanced-rag-retrieval-strategies-self-rag-3e9a4cd422a1
https://llamahub.ai/l/llama-packs/llama-index-packs-self-rag?from=
This repository showcases various advanced techniques for Retrieval-Augmented Generation (RAG) systems. RAG systems combine information retrieval with generative models to provide accurate and contextually rich responses.
https://github.com/NirDiamant/RAG_Techniques
https://github.com/NirDiamant/RAG_Techniques/tree/main/all_rag_techniques
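The basic pattern that these techniques build on, retrieval followed by generation, can be sketched in a few lines. This is a minimal illustration and is not taken from the repository: the retriever is a naive keyword-overlap ranker, `generate` is a placeholder for whatever LLM call you use, and all function names are hypothetical.

```python
def retrieve(query, corpus, top_k=2):
    """Rank passages by how many query words they share (toy retriever)."""
    q_terms = set(query.lower().split())
    ranked = sorted(
        corpus,
        key=lambda passage: len(q_terms & set(passage.lower().split())),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(query, passages):
    """Put the retrieved passages in front of the question."""
    context = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )


def answer(query, corpus, generate):
    """Retrieve, build the prompt, then hand it to the generator."""
    passages = retrieve(query, corpus)
    return generate(build_prompt(query, passages))


corpus = [
    "Qdrant is a vector database.",
    "LlamaIndex is a framework for building LLM applications.",
    "SearxNG is a metasearch engine.",
]
query = "what is qdrant"

# Placeholder generator that echoes the prompt; swap in a real model call.
prompt_seen = answer(query, corpus, generate=lambda prompt: prompt)
print(prompt_seen)
```

Passing the generator in as a function keeps the sketch independent of any particular model or SDK; more advanced techniques mostly change what happens inside `retrieve` or add checks around `generate`.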
Perplexica is an open-source, AI-powered search engine that goes deep into the internet to find answers. Inspired by Perplexity AI, it not only searches the web but also understands your questions. It uses machine learning techniques such as similarity search and embeddings to refine results, and it provides clear answers with cited sources.
Using SearxNG to stay current and fully open source, Perplexica ensures you always get the most up-to-date information without compromising your privacy.
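To make "similarity search and embeddings" concrete, here is a rough sketch of reranking search results by cosine similarity to the query. It does not reflect Perplexica's actual implementation: `embed` is a bag-of-words stand-in for a real embedding model, the result dictionaries with `url` and `snippet` keys are a made-up shape, and the URLs are placeholders.

```python
import math
from collections import Counter


def embed(text):
    # Stand-in embedding: word counts. A real system would call an embedding model.
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rerank(query, results, top_k=2):
    """Keep the top_k results whose snippets are most similar to the query."""
    q_vec = embed(query)
    scored = [(cosine(q_vec, embed(r["snippet"])), r) for r in results]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [r for _, r in scored[:top_k]]


# Hypothetical raw results, e.g. as returned by a metasearch backend.
results = [
    {"url": "https://example.com/a",
     "snippet": "vector search compares embeddings to find similar text"},
    {"url": "https://example.com/b",
     "snippet": "best pasta recipes for a quick dinner"},
    {"url": "https://example.com/c",
     "snippet": "how vector search engines work in practice"},
]
ranked = rerank("how does vector search work", results)
print([r["url"] for r in ranked])
```

With a real embedding model in place of `embed`, the same ranking step would also match results that are semantically related to the query without sharing its exact words.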
RAG Foundry: A Framework for Enhancing LLMs for Retrieval Augmented Generation
RAG Foundry is a library designed to improve LLMs' ability to use external information by fine-tuning models on specially created RAG-augmented datasets. Given a RAG technique, the library helps create the training data, makes it easy to train models using parameter-efficient fine-tuning (PEFT), and helps users measure the improved performance using various RAG-specific metrics. The library is modular, and workflows are customizable using configuration files.