Scalable. Interactive. Interpretable Data Science

Safe, interpretable, trustworthy AI, through interactive intelligent visualization, with applications in adversarial machine learning (protecting AI from harm and from doing harm), scalable discoveries of deep learning models, and inclusive AI for everyone.

A Must.

https://poloclub.github.io/

Transformer Explainer

Learn How Transformer Models Work with Interactive Visualization

https://poloclub.github.io/transformer-explainer/

    Diffusion Explainer

    Learn how Stable Diffusion transforms your text prompt into an image.

    https://poloclub.github.io/diffusion-explainer/

    User-Centric RAG

    Transforming RAG with LlamaIndex Multi-Agent System and Qdrant

    Retrieval-Augmented Generation (RAG) models have evolved significantly over time. Initially, traditional RAG systems faced numerous limitations. However, with advancements in the field, we have seen the emergence of more sophisticated RAG applications. Techniques such as Self-RAG, Hybrid Search RAG, experimenting with different prompting and chunking strategies, and the evolution of Agentic RAG have addressed many of the initial limitations.

    https://medium.com/@pavannagula76/user-centric-rag-transforming-rag-with-llamaindex-multi-agent-system-and-qdrant-cf3c32cfe6f3

    PDF-Extract-Kit

    PDF-Extract-Kit is a comprehensive toolkit for high-quality PDF content extraction, including layout detection, formula detection, formula recognition, and OCR.

    PDF documents contain a wealth of knowledge, yet extracting high-quality content from PDFs is not an easy task. To address this, we have broken down the task of PDF content extraction into several components:

    • Layout Detection: Using the LayoutLMv3 model for region detection, such as images, tables, titles, text, etc.;
    • Formula Detection: Using YOLOv8 for detecting formulas, including inline formulas and isolated formulas;
    • Formula Recognition: Using UniMERNet for formula recognition;
    • Table Recognition: Using StructEqTable for table recognition;
    • Optical Character Recognition: Using PaddleOCR for text recognition;
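
One way to picture this decomposition is as a routing table from detected region type to the component that handles it. The sketch below is purely illustrative: `Region`, `ROUTES`, and `route` are hypothetical names and are not part of PDF-Extract-Kit's API, and sending titles to OCR is an assumption. Only the model names come from the list above.

```python
from dataclasses import dataclass

# Hypothetical routing table: region kind -> component named in the list above.
# This is NOT PDF-Extract-Kit's API, only an illustration of the decomposition.
ROUTES = {
    "formula": "UniMERNet",      # formula recognition
    "table": "StructEqTable",    # table recognition
    "text": "PaddleOCR",         # optical character recognition
    "title": "PaddleOCR",        # assumption: titles are read by OCR as well
}


@dataclass
class Region:
    kind: str      # e.g. "text", "table", "formula", "image"
    bbox: tuple    # (x0, y0, x1, y1) in page coordinates


def route(regions):
    """Group detected regions by the component that would process them."""
    plan = {}
    for region in regions:
        # Kinds with no recognizer in the table (e.g. images) pass through.
        component = ROUTES.get(region.kind, "passthrough")
        plan.setdefault(component, []).append(region)
    return plan


if __name__ == "__main__":
    page = [
        Region("title", (50, 40, 500, 80)),
        Region("text", (50, 100, 500, 300)),
        Region("formula", (120, 320, 400, 360)),
        Region("image", (50, 380, 500, 600)),
    ]
    for component, items in route(page).items():
        print(component, [r.kind for r in items])
```

In this picture, layout and formula detection produce the `Region` objects, and the recognizers consume them.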

    https://github.com/opendatalab/PDF-Extract-Kit

    https://www.perplexity.ai/search/look-at-this-github-https-gith-8ZVtYO.2SA6_q5Vg.VXy.g

    Self-RAG

    Self-RAG is another form of Retrieval Augmented Generation (RAG). Unlike other RAG retrieval strategies, it doesn’t enhance a specific module within the RAG process. Instead, it optimizes various modules within the RAG framework to improve the overall RAG process. If you’re unfamiliar with Self-RAG or have only heard its name, join me today to understand the implementation principles of Self-RAG and better grasp its details through code.
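
As a rough picture of the idea, Self-RAG decides whether retrieval is needed, generates a candidate answer per retrieved passage, critiques each candidate, and keeps the best one. The sketch below is a toy under stated assumptions: every function is a hypothetical stand-in (simple word-overlap heuristics instead of a trained model and its reflection tokens), and it is not the code from the linked article or the LlamaIndex pack.

```python
def needs_retrieval(question):
    # Stand-in for the model's "should I retrieve?" decision.
    # Assumption: questions containing a question word need outside facts.
    words = question.lower().split()
    return any(w in words for w in ("who", "what", "when", "where"))


def retrieve(question, corpus, k=2):
    # Toy retriever: rank passages by the number of shared lowercase words.
    q = set(question.lower().split())
    ranked = sorted(corpus, key=lambda p: len(q & set(p.lower().split())),
                    reverse=True)
    return ranked[:k]


def generate(question, passage=None):
    # Stand-in for an LLM call; echoes the passage so the flow is visible.
    return passage if passage is not None else "I am not sure."


def critique(question, passage, answer):
    # Stand-in for the reflection step: relevance is word overlap with the
    # question, support is whether the answer text appears in the passage.
    q = set(question.lower().split())
    relevance = len(q & set(passage.lower().split()))
    support = 1 if answer in passage else 0
    return relevance + support


def self_rag(question, corpus):
    if not needs_retrieval(question):
        return generate(question)
    candidates = []
    for passage in retrieve(question, corpus):
        answer = generate(question, passage)
        candidates.append((critique(question, passage, answer), answer))
    if not candidates:
        return generate(question)
    # Keep the candidate that the critique step scored highest.
    return max(candidates, key=lambda c: c[0])[1]


if __name__ == "__main__":
    corpus = ["Paris is the capital of France.", "Qdrant is a vector database."]
    print(self_rag("what is the capital of France?", corpus))
```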

    https://ai.gopubby.com/advanced-rag-retrieval-strategies-self-rag-3e9a4cd422a1

    https://llamahub.ai/l/llama-packs/llama-index-packs-self-rag?from=

    RAG Techniques notebook

    This repository showcases various advanced techniques for Retrieval-Augmented Generation (RAG) systems. RAG systems combine information retrieval with generative models to provide accurate and contextually rich responses.
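
To make "retrieval plus generation" concrete, here is a minimal sketch of the retrieval half: split documents into overlapping chunks, pick the chunks that best match the query, and assemble an augmented prompt for a generative model. The function names, the word-overlap scoring, and the prompt wording are all illustrative assumptions and are not taken from the repository.

```python
def chunk(text, size=8, overlap=2):
    """Split text into windows of `size` words that share `overlap` words."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    words = text.split()
    step = size - overlap
    stop = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, stop, step)]


def score(query, passage):
    # Toy relevance score: number of lowercase words shared with the query.
    # A real system would usually compare embeddings instead.
    q = set(query.lower().split())
    return len(q & set(passage.lower().split()))


def build_prompt(query, documents, k=2):
    """Retrieve the k best chunks and place them in front of the question."""
    chunks = [c for d in documents for c in chunk(d)]
    top = sorted(chunks, key=lambda c: score(query, c), reverse=True)[:k]
    context = "\n".join(f"- {c}" for c in top)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


if __name__ == "__main__":
    docs = ["Qdrant is a vector database", "Bananas are yellow"]
    print(build_prompt("vector database", docs, k=1))
```

The resulting string would then be sent to whichever generative model the system uses; that call is left out here because it depends on the provider.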

    https://github.com/NirDiamant/RAG_Techniques

    https://github.com/NirDiamant/RAG_Techniques/tree/main/all_rag_techniques


    Perplexica – An AI-powered search engine 

    Perplexica is an open-source, AI-powered search engine that goes deep into the internet to find answers. Inspired by Perplexity AI, it not only searches the web but also understands your questions. It uses machine learning techniques such as similarity search and embeddings to refine results, and it provides clear answers with cited sources.

    Using SearxNG to stay current and fully open source, Perplexica ensures you always get the most up-to-date information without compromising your privacy.
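
Refining results with embeddings and similarity search can be sketched as reranking by cosine similarity. The example below is not Perplexica's code: the vectors are hand-written toy values (a real system would obtain them from an embedding model), and `cosine` and `rerank` are hypothetical helper names.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rerank(query_vec, results):
    """Sort (title, embedding) pairs so the most similar result comes first."""
    return sorted(results, key=lambda r: cosine(query_vec, r[1]), reverse=True)


if __name__ == "__main__":
    query = [1.0, 0.0]
    results = [
        ("unrelated page", [0.0, 1.0]),
        ("exact match", [1.0, 0.0]),
        ("partial match", [1.0, 1.0]),
    ]
    for title, vec in rerank(query, results):
        print(f"{cosine(query, vec):.3f}  {title}")
```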

    https://github.com/ItzCrazyKns/Perplexica