Why use RAG?

More and more businesses are using AI to augment their organizations, and large language models (LLMs) are what power these opportunities.

However, optimizing LLMs with methods like retrieval-augmented generation (RAG) can be complex, which is why we'll walk you through everything you should consider before you get started.


Posted in RAG

Challenges of NLP in Dealing with Structured Documents: The Case of PDFs


  • As NLP is applied to more real-world documents, it runs into a practical hurdle.
  • Most NLP tasks assume clean, raw text data.
  • In practice, many documents, especially legal ones, are visually structured, like PDFs.
  • Visual Structured Documents (VSDs) pose challenges for content extraction.
  • The discussion primarily focuses on text-only layered PDFs.
  • Although extracting text from these PDFs is often considered a solved problem, they still present challenges for NLP.
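
To make the extraction problem concrete, here is a minimal sketch of one commonly cited difficulty: reading order on a multi-column page. This is an illustration under stated assumptions, not something taken from the post. The `Word` type, the coordinates, the sample page and the `split_x` column boundary are all hypothetical, and real PDF extractors expose positions in their own formats.

```python
from typing import NamedTuple


class Word(NamedTuple):
    text: str
    x: float  # left edge of the word on the page (hypothetical units)
    y: float  # distance from the top of the page


# Hypothetical two-column page: each column holds one sentence over two lines.
PAGE = [
    Word("The", 10, 10), Word("lessee", 40, 10),
    Word("shall", 10, 20), Word("pay.", 40, 20),
    Word("The", 200, 10), Word("lessor", 230, 10),
    Word("shall", 200, 20), Word("repair.", 230, 20),
]


def naive_order(words):
    """Read strictly top-to-bottom, then left-to-right, ignoring columns."""
    return " ".join(w.text for w in sorted(words, key=lambda w: (w.y, w.x)))


def column_aware_order(words, split_x):
    """Read the whole left column first, then the whole right column."""
    left = [w for w in words if w.x < split_x]
    right = [w for w in words if w.x >= split_x]
    return naive_order(left) + " " + naive_order(right)


print(naive_order(PAGE))
# The lessee The lessor shall pay. shall repair.
print(column_aware_order(PAGE, split_x=100))
# The lessee shall pay. The lessor shall repair.
```

Every word is extracted correctly in both cases, yet the naive ordering interleaves the two columns and produces text that no downstream NLP model can interpret as intended.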


RAG: Multi-Document Agents

  • The Multi-Document Agents guide explains how to set up an agent that can answer different types of questions over a larger set of documents.
  • The question types include QA over a specific document, QA comparing different documents, summaries of a specific document, and comparisons of summaries across documents.
  • The architecture sets up a "document agent" over each document, which can do QA and summarization within that document, and a top-level agent over the set of document agents, which first retrieves the relevant tools and then applies chain-of-thought (CoT) reasoning over them to answer the question.
  • The guide provides code examples using the LlamaIndex and OpenAI libraries.
  • The document agent can dynamically choose to perform semantic search or summarization within a given document.
  • In the guide's example, each document covers one city, so a separate document agent is created for each city.
  • The top-level agent can orchestrate across the different document agents to answer any user query.
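
The two-level architecture described above can be sketched in plain Python. This is a toy illustration, not the guide's code and not the LlamaIndex or OpenAI API: word overlap stands in for semantic search, the first sentence stands in for an LLM summary, a keyword rule stands in for the agent's tool choice, and the class names, city texts and queries are all hypothetical.

```python
import re
from dataclasses import dataclass


def tokens(text):
    """Lowercase word set, a crude stand-in for embeddings."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def sentences(text):
    """Split text into sentences on end-of-sentence punctuation."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


@dataclass
class DocumentAgent:
    """Answers questions about exactly one document."""
    name: str
    text: str

    def search(self, query):
        # Stand-in for semantic search: the sentence sharing the most words with the query.
        q = tokens(query)
        return max(sentences(self.text), key=lambda s: len(q & tokens(s)))

    def summarize(self):
        # Stand-in for LLM summarization: the first sentence of the document.
        return sentences(self.text)[0]

    def run(self, query):
        # Choose a tool per query; a keyword rule replaces the LLM's decision here.
        if "summar" in query.lower():
            return self.summarize()
        return self.search(query)


@dataclass
class TopLevelAgent:
    """Routes a query to the relevant document agents."""
    agents: list

    def retrieve(self, query):
        # Tool retrieval: keep the agents whose document name appears in the query.
        q = tokens(query)
        return [a for a in self.agents if tokens(a.name) <= q]

    def run(self, query):
        # Collect one answer per selected agent; an LLM would then reason over these.
        return {a.name: a.run(query) for a in self.retrieve(query)}


top = TopLevelAgent([
    DocumentAgent("Toronto", "Toronto is a city in Canada. Toronto has a large public transit system."),
    DocumentAgent("Boston", "Boston is a city in the United States. Boston has many universities."),
])

print(top.run("Summarize Toronto"))
print(top.run("Compare Boston universities with Toronto transit"))
```

The first query reaches only the Toronto agent, which picks its summary tool; the second reaches both agents, and each picks search. In a real system the final step would pass the collected answers to an LLM to compose a single response.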

