Creating an MCP Server from Any FastAPI URL with One Prompt

Published on March 30, 2025 by loic

In the rapidly evolving landscape of AI-assisted development, the Model Context Protocol (MCP) has emerged as a game-changer. But what if you want to connect your AI assistants to existing FastAPI applications without modifying their code? Today, I’ll show you how to create an automatic MCP server from any FastAPI URL using just one prompt in Cursor.

The Power of FastAPI’s OpenAPI Documentation

FastAPI automatically generates comprehensive OpenAPI (formerly Swagger) documentation for all endpoints. This documentation contains everything needed to understand and interact with the API:

Endpoint paths and HTTP methods
Request parameters and body schemas
Response formats and status codes
Detailed descriptions and examples

This rich metadata is exactly what we need to create an MCP server that can proxy requests to the original API.
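To make this concrete, here is a minimal sketch of how the operations could be read out of an OpenAPI document. The `list_operations` helper and the `SAMPLE_SPEC` fragment are illustrative assumptions, not part of FastAPI; a real script would first download the JSON from the application's /openapi.json route.

```python
from typing import Any

HTTP_METHODS = {"get", "post", "put", "patch", "delete", "head", "options"}


def list_operations(spec: dict[str, Any]) -> list[dict[str, Any]]:
    """Flatten the `paths` object of an OpenAPI document into one record per operation."""
    operations = []
    for path, path_item in spec.get("paths", {}).items():
        for method, op in path_item.items():
            if method not in HTTP_METHODS:
                continue  # skip non-operation keys such as "parameters" or "summary"
            operations.append(
                {
                    "method": method.upper(),
                    "path": path,
                    "operation_id": op.get("operationId"),
                    "summary": op.get("summary", ""),
                    "parameters": [p.get("name") for p in op.get("parameters", [])],
                }
            )
    return operations


# Hand-written fragment shaped like a FastAPI-generated document (hypothetical data).
SAMPLE_SPEC = {
    "openapi": "3.1.0",
    "paths": {
        "/items": {"get": {"operationId": "get_items", "summary": "Get Items"}},
        "/items/{item_id}": {
            "get": {
                "operationId": "get_item",
                "summary": "Get Item",
                "parameters": [
                    {
                        "name": "item_id",
                        "in": "path",
                        "required": True,
                        "schema": {"type": "integer"},
                    }
                ],
            }
        },
    },
}

for op in list_operations(SAMPLE_SPEC):
    print(op["method"], op["path"], op["parameters"])
```

Each record carries enough information to register a mirrored route and to know which arguments to forward.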

The One-Prompt Solution

Copy and paste this prompt into Cursor to generate a complete, ready-to-run MCP server that connects to any FastAPI application:

Create a complete Python script that generates an MCP server from the FastAPI application running at {URL}. The script should:

1. Fetch the OpenAPI/Swagger documentation from {URL}/openapi.json
2. Analyze all endpoints, parameters, request bodies, and response models
3. Create a new FastAPI application that:
   - Mirrors all the endpoints from the original API
   - Forwards requests to the original API
   - Returns responses from the original API
4. Add MCP server functionality using the fastapi_mcp library
5. Include proper error handling for:
   - Connection issues
   - Authentication failures
   - Invalid responses

The final script should be a single, self-contained Python file that:
- Takes command line arguments for customization (port, authentication, etc.)
- Includes detailed comments explaining how it works
- Can be run directly with "python script.py" to start the MCP server
- Automatically connects to {URL} and creates an MCP server at http://localhost:8000/mcp

Replace {URL} with the actual URL of the FastAPI application, for example https://api.example.com.

The output should be ONLY the complete Python script, ready to run, with no explanations before or after the code.

How to Use This Prompt

Replace {URL} with the actual URL of the FastAPI application you want to connect to

For example: https://api.example.com or http://localhost:8000

Paste the prompt into Cursor or another AI coding assistant
Copy the generated Python script and save it as mcp_bridge.py
Run the script with Python:

python mcp_bridge.py

Connect your AI assistant to the MCP server at http://localhost:8000/mcp

That’s it! No manual coding, no configuration files, no complex setup. Just one prompt and you have a fully functional MCP server that connects to any FastAPI application.

What Makes This Approach Special

This solution is unique because:

It requires zero knowledge of MCP or FastAPI – the AI does all the work
It works with any FastAPI application that has OpenAPI documentation enabled
It preserves all the original API’s functionality including parameters, schemas, and documentation
It creates a production-ready MCP server with proper error handling and logging
It’s completely automated – no manual intervention required

Real-World Applications

This approach opens up exciting possibilities:

Connect AI assistants to your company’s internal APIs without modifying them
Create MCP bridges to public APIs that use FastAPI
Test MCP functionality before implementing it directly in your codebase
Provide AI access to legacy systems through a FastAPI proxy

Conclusion

The ability to create MCP servers from existing FastAPI URLs with just one prompt is a game-changer for AI-assisted development. You can now connect your favorite AI assistants to any FastAPI application in minutes, without writing a single line of code yourself.

Try this approach today and experience the power of combining FastAPI’s excellent documentation with the flexibility of the Model Context Protocol!

Loic Baconnier

Automate MCP Integration in Your FastAPI App with a Single Copy/Paste

Published on March 30, 2025 by loic

Modern APIs need more than just endpoints—they require robust documentation, strong typing, and seamless integration with advanced AI assistants. In our fast-paced development environment, every minute counts. That’s why today we’re exploring how to leverage a single, well-crafted Cursor prompt to automatically refactor an existing FastAPI application and integrate the Model Context Protocol (MCP) with zero extra manual adjustments.

What Is MCP and Why Does It Matter?

MCP (Model Context Protocol) is an open protocol that enables AI assistants to interact with your APIs. Once your API endpoints are exposed as well-documented, standardized MCP tools, AI models (like those running in Cursor or Claude 3.7 Sonnet) can automatically discover and call your API functions. This not only enhances interoperability but also allows for dynamic, natural-language-driven interactions with your app.
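As a rough illustration of turning an endpoint into a tool: an MCP tool is described by a name, a description, and a JSON Schema for its input (`inputSchema`). The sketch below maps one OpenAPI operation onto that shape. The `operation_to_tool` helper and the `GET_ITEM` sample are hypothetical, only path and query parameters are handled, and this is not a description of how fastapi_mcp works internally.

```python
from typing import Any


def operation_to_tool(operation: dict[str, Any]) -> dict[str, Any]:
    """Build an MCP-style tool descriptor from one OpenAPI operation object."""
    properties: dict[str, Any] = {}
    required: list[str] = []
    for param in operation.get("parameters", []):
        # Each parameter becomes one property of the tool's input object.
        properties[param["name"]] = param.get("schema", {})
        if param.get("required", False):
            required.append(param["name"])
    return {
        "name": operation["operationId"],
        "description": operation.get("description") or operation.get("summary", ""),
        "inputSchema": {
            "type": "object",
            "properties": properties,
            "required": required,
        },
    }


# Hypothetical operation object, shaped like an entry in an OpenAPI `paths` section.
GET_ITEM = {
    "operationId": "get_item",
    "summary": "Get Item",
    "parameters": [
        {"name": "item_id", "in": "path", "required": True, "schema": {"type": "integer"}}
    ],
}

tool = operation_to_tool(GET_ITEM)
print(tool["name"], tool["inputSchema"])
```

The richer the route's description and parameter schemas, the more an assistant has to work with when deciding which tool to call.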

Why Improve Your FastAPI Code First?

Before unlocking the power of MCP, your API needs to be in top shape. This means:

Comprehensive docstrings for each endpoint.
Detailed type hints and Pydantic models for requests and responses.
Robust error handling with proper HTTP exceptions.
Clear descriptions for every route so that MCP can easily "discover" and interpret them.

By improving your code according to these best practices, you’re ensuring that the MCP integration can accurately reflect your API’s capabilities, leading to smarter, reliable interactions for AI assistants.

Automating Everything with a Cursor Prompt

Imagine being able to improve your code and add a whole new MCP interface to your FastAPI project by simply pasting one prompt into Cursor. No more manual tweaks or back-and-forth adjustments. The idea is to use a precise instruction that tells the AI exactly how to:

Refactor your existing code for better quality.
Automatically insert MCP integration using the fastapi_mcp library.
Generate the final, runnable code along with testing and configuration instructions.

Here’s why this approach is so powerful:

It removes the need for manual intervention.
It standardizes your API transformation process.
It sparks creativity by letting the AI fill in the boilerplate, making your API production-ready with minimal hassle.
It works even with imperfect AI systems, because the prompt lays out each necessary step so that no detail is lost.

The Final Cursor Prompt

Copy and paste the following prompt directly into Cursor. This instruction tells the AI to first improve your existing FastAPI code with best practices and then add the MCP route using the fastapi_mcp library—all in one go:

I have an existing FastAPI application that is functional but not optimized. Your job is to improve the code and integrate MCP (Model Context Protocol) using the fastapi_mcp library. Follow these steps carefully:

### Step 1: Improve the Existing FastAPI Code
1. **Docstrings**: Add detailed docstrings to all endpoints. Each docstring should include:
   - A brief description of what the endpoint does.
   - Parameters with their types and descriptions.
   - The response format, including success and error cases.
   - HTTP status codes used by the endpoint.
2. **Type Hints**: Ensure all functions have proper type hints for parameters and return values.
3. **Pydantic Models**:
   - Define Pydantic models for request bodies (if any).
   - Use Pydantic models for response validation (`response_model` in FastAPI).
4. **Error Handling**:
   - Use `HTTPException` with appropriate status codes for errors.
   - Handle edge cases gracefully with meaningful error messages.
5. **Endpoint Descriptions**: Add a `description` parameter to each route decorator to describe what the endpoint does.

### Step 2: Integrate MCP
1. Install the `fastapi_mcp` library:
   ```
   pip install fastapi_mcp
   ```
2. Import the necessary function:
   ```
   from fastapi_mcp import add_mcp_server
   ```
3. Add MCP functionality to the FastAPI app:
   - After initializing your `FastAPI` app, call `add_mcp_server()`.
   - Mount the MCP server at `/mcp`.
   - Use a descriptive name for your MCP server (e.g., "My API MCP").
4. Ensure that all existing endpoints remain functional after adding the MCP server.

### Step 3: Provide Testing Instructions
1. Generate a JSON configuration snippet to connect this MCP server in Cursor:
   ```
   {
     "mcpServers": {
       "My API MCP": {
         "url": "http://127.0.0.1:8000/mcp"
       }
     }
   }
   ```
2. Provide a sample `curl` command to test the `/mcp` endpoint:
   ```
   curl -X POST http://127.0.0.1:8000/mcp/tools
   ```

### Input Example
Here is an example of my current FastAPI code (simplified):
```
from fastapi import FastAPI, HTTPException

app = FastAPI()

@app.get("/items")
def get_items():
    return {"items": []}

@app.get("/items/{item_id}")
def get_item(item_id: int):
    if item_id == 0:
        raise HTTPException(status_code=404, detail="Item not found")
    return {"item_id": item_id}
```

### Output Requirements
- Refactor the above code to follow best practices (as outlined in Step 1).
- Add MCP integration (as described in Step 2).
- Provide a complete, runnable code block with comments explaining each change.
- Include testing instructions (as described in Step 3).

The final output should look like this:
1. The improved and MCP-integrated code.
2. A JSON snippet for connecting this API as an MCP server in Cursor.
3. A sample `curl` command to test the `/mcp` route.

DO NOT skip any steps or provide vague explanations—output only complete, ready-to-use code.

How This Works

By pasting the above prompt into Cursor, you delegate the entire transformation process to the AI assistant. It will:

Refactor your code to meet professional standards.
Automatically insert the MCP integration using fastapi_mcp.
Produce a self-contained code snippet with detailed comments and testing instructions.

This means you can convert an imperfect API into a fully MCP-compliant service without directly writing additional code!
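If you prefer to generate the connection snippet yourself rather than copy it by hand, a minimal sketch could look like the following. The `cursor_mcp_config` helper is a hypothetical name; the server name and URL are the ones used in the prompt above, and where Cursor expects this JSON to live is not covered here.

```python
import json


def cursor_mcp_config(name: str, url: str) -> str:
    """Return the JSON snippet that declares one URL-based MCP server."""
    config = {"mcpServers": {name: {"url": url}}}
    return json.dumps(config, indent=2)


print(cursor_mcp_config("My API MCP", "http://127.0.0.1:8000/mcp"))
```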

Conclusion

This method not only accelerates your development process but also minimizes human error by standardizing integration tasks. With one thoughtfully constructed prompt, you can harness the power of AI to bring your FastAPI application up to production level—complete with modern documentation and remote AI assistant compatibility via the MCP protocol.

Try it out in your next project and experience a new level of automation that allows you to focus on what matters most: building innovative features while letting the AI take care of the boilerplate.

Loic Baconnier

AI Coding Assistant Rules for Windsurf and Cursor

Published on March 11, 2025 by loic

These optimized rules will transform how Windsurf and Cursor AI work with your Python backend and Next.js frontend projects. By adding these configurations, you’ll get more accurate, consistent code suggestions that follow best practices and avoid common AI-generated code issues.

How to Implement in Windsurf

Option 1 – File Method:

Create a file named .windsurfrules in your project’s root directory
Copy and paste the entire code block below into this file
Save the file

Option 2 – Settings Method:

Open Windsurf AI
Navigate to Settings > Set Workspace AI Rules > Edit Rules
Paste the entire code block below
Save your settings

How to Implement in Cursor

Option 1 – File Method:

Create a file named .cursorrules in your project’s root directory
Copy and paste the same code block below (it works for both platforms)
Save the file

Option 2 – Settings Method:

Open Cursor AI
Click on your profile picture in the bottom left
Select "Settings"
Navigate to the "AI" section
Find "Custom Instructions" and click "Edit"
Paste the entire code block below
Click "Save"

After Implementation

Restart your AI coding assistant or reload your workspace
The AI will now follow these comprehensive rules in all your coding sessions
You should immediately notice more relevant, project-specific code suggestions

These rules will significantly improve how your AI coding assistant understands your project requirements, coding standards, and technical preferences. You’ll get more relevant suggestions, fewer hallucinated functions, and code that better integrates with your existing codebase.

# Cursor Rules and Workflow Guide

## Core Configuration

- **Version**: `v5`
- **Project Type**: `web_application`
- **Code Style**: `clean_and_maintainable`
- **Environment Support**: `dev`, `test`, `prod`

---

## Test-Driven Development (TDD) Rules

1. Write tests **first** before any production code.
2. Run tests before implementing new functionality.
3. Write the **minimal code** required to pass tests.
4. Refactor only after all tests pass.
5. Do not start new tasks until all tests are passing.
6. Place all tests in a dedicated `/tests` directory.
7. Explain why tests will initially fail before implementation.
8. Propose an implementation strategy before writing code.
9. Check for existing functionality before creating new features.

---

## Code Quality Standards

- Maximum file length: **300 lines** (split into modules if needed).
- Follow existing patterns and project structure.
- Write modular, reusable, and maintainable code.
- Implement proper error handling mechanisms.
- Use type hints and annotations where applicable.
- Add explanatory comments when necessary.
- Avoid code duplication; reuse existing functionality if possible.
- Prefer simple solutions over complex ones.
- Keep the codebase clean and organized.

---

## AI Assistant Behavior

1. Explain understanding of requirements before proceeding with tasks.
2. Ask clarifying questions when requirements are ambiguous or unclear.
3. Provide complete, working solutions for each task or bug fix.
4. Focus only on relevant areas of the codebase for each task.
5. Debug failing tests with clear explanations and reasoning.

---

## Things to Avoid

- Never generate incomplete or partial solutions unless explicitly requested.
- Never invent nonexistent functions, APIs, or libraries.
- Never ignore explicit requirements or provided contexts.
- Never overcomplicate simple tasks or solutions.
- Never overwrite `.env` files without explicit confirmation.

---

## Implementation Guidelines

### General Rules

1. Always check for existing code before creating new functionality.
2. Avoid major changes to patterns unless explicitly instructed or necessary for bug fixes.

### Environment-Specific Rules

- Mock data should only be used for **tests**, never for development or production environments.

### File Management

- Avoid placing scripts in files if they are intended to be run only once.

### Refactoring Rules

1. Refactor files exceeding **300 lines** to improve readability and maintainability.

### Bug Fixing Rules

1. Exhaust all options using existing patterns and technologies before introducing new ones.
2. If introducing a new pattern, remove outdated implementations to avoid duplicate logic.

---

## Workflow Best Practices

### Planning & Task Management

1. Use Markdown files (`PLANNING.md`, `TASK.md`) to manage project scope and tasks:
- **PLANNING.md**: High-level vision, architecture, constraints, tech stack, tools, etc.
- **TASK.md**: Tracks current tasks, backlog, milestones, and discovered issues during development.
2. Always update these files as the project progresses:
- Mark completed tasks in `TASK.md`.
- Add new sub-tasks or TODOs discovered during development.

### Code Structure & Modularity

1. Organize code into clearly separated modules grouped by feature or responsibility.
2. Use consistent naming conventions and file structures as described in `PLANNING.md`.
3. Never create a file longer than 500 lines of code; refactor into modules if necessary.

### Testing & Reliability

1. Create unit tests for all new features (functions, classes, routes, etc.).
2. Place all tests in a `/tests` folder mirroring the main app structure:
- Include at least:
- 1 test for expected use,
- 1 edge case,
- 1 failure case (to ensure proper error handling).
3. Mock external services (e.g., databases) in tests to avoid real-world interactions.

### Documentation & Explainability

1. Update `README.md` when adding features, changing dependencies, or modifying setup steps.
2. Write docstrings for every function using Google-style formatting:

```python
def example(param1: int) -> str:
    """
    Brief summary of the function.

    Args:
        param1 (int): Description of the parameter.

    Returns:
        str: Description of the return value.
    """
```

3. Add inline comments explaining non-obvious logic and reasoning behind decisions.

---

## Verification Rule

I am an AI coding assistant that strictly adheres to Test-Driven Development (TDD) principles and high code quality standards. I will:

1. Write tests **first** before any production code.
2. Place all tests in a dedicated `/tests` directory.
3. Explain why tests initially fail before implementation begins.
4. Write minimal production code to pass the tests.
5. Refactor while maintaining passing tests at all times.
6. Enforce a maximum file length of **300 lines** per file (or 500 lines if specified).
7. Check for existing functionality before writing new code or features.
8. Explain my understanding of requirements before starting implementation work.
9. Ask clarifying questions when requirements are ambiguous or unclear.
10. Propose implementation strategies before writing any production code.
11. Debug failing tests with clear reasoning and explanations provided step-by-step.

---

## Server Management Best Practices

1. Restart servers after making changes to test them properly (only when necessary).
2. Kill all related servers from previous testing sessions to avoid conflicts.

---

## Modular Prompting Process After Initial Prompt

When interacting with the AI assistant:

1. Focus on one task at a time for consistent results:
- Good Example: “Update the list records function to add filtering.”
- Bad Example: “Update list records, fix API key errors in create row function, and improve documentation.”
2. Always test after implementing every feature to catch bugs early:
- Create unit tests covering:
- Successful scenarios,
- Edge cases,
- Failure cases.

---

These rules combine best practices for Python backend and Next.js frontend development with your specific coding patterns, workflow preferences, and technical stack requirements. The configuration instructs Windsurf AI to maintain clean, modular code that follows established patterns while avoiding common pitfalls in AI-assisted development.
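The 300-line limit in these rules is easy to verify mechanically. As a minimal sketch (the `files_over_limit` helper is a hypothetical name, and scanning only `*.py` files is an assumption), a small script could list the files the assistant should be asked to split:

```python
from pathlib import Path

MAX_LINES = 300  # limit taken from the Code Quality Standards section of the rules


def files_over_limit(
    root: Path, limit: int = MAX_LINES, pattern: str = "*.py"
) -> list[tuple[Path, int]]:
    """Return (path, line_count) for every matching file longer than `limit` lines."""
    offenders = []
    for path in sorted(root.rglob(pattern)):
        with path.open(encoding="utf-8", errors="replace") as handle:
            line_count = sum(1 for _ in handle)
        if line_count > limit:
            offenders.append((path, line_count))
    return offenders


if __name__ == "__main__":
    # Scan the current directory and report files that exceed the limit.
    for path, count in files_over_limit(Path(".")):
        print(f"{path}: {count} lines (limit {MAX_LINES})")
```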

Loic Baconnier

See also https://github.com/bacoco/awesome-cursorrules (from PatrickJS/awesome-cursorrules).

Enhancing Document Retrieval with Topic-Based Chunking and RAPTOR

Published on March 11, 2025 by loic

In the evolving landscape of information retrieval, combining topic-based chunking with hierarchical retrieval methods like RAPTOR represents a significant advancement for handling complex, multi-topic documents. This article explores how these techniques work together to create more effective document understanding and retrieval systems.

Topic-Based Chunking: Understanding Document Themes

Topic-based chunking segments text by identifying and grouping content related to specific topics, creating more semantically meaningful chunks than traditional fixed-size approaches. This method is particularly valuable for multi-topic documents where maintaining thematic coherence is essential.

The TopicNodeParser in LlamaIndex provides an implementation of this approach:

It analyzes documents to identify natural topic boundaries
It segments text based on semantic similarity rather than arbitrary token counts
It preserves the contextual relationships between related content

After processing documents with TopicNodeParser, you can extract the main topics from each node using an LLM. This creates a comprehensive topic map of your document collection, which serves as the foundation for more sophisticated retrieval.
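To illustrate the general idea of cutting at topic boundaries, here is a minimal sketch that starts a new chunk whenever two adjacent sentences stop resembling each other. It is not TopicNodeParser's algorithm: the bag-of-words cosine similarity stands in for embedding or LLM-based similarity, and the function names, sample sentences, and the 0.2 threshold are assumptions chosen for the example.

```python
import math
import re
from collections import Counter


def bag_of_words(sentence: str) -> Counter:
    """Count lowercase alphabetic words (a crude stand-in for an embedding)."""
    return Counter(re.findall(r"[a-z]+", sentence.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def topic_chunks(sentences: list[str], threshold: float = 0.2) -> list[list[str]]:
    """Group consecutive sentences; open a new chunk when similarity drops below threshold."""
    chunks: list[list[str]] = []
    for sentence in sentences:
        previous = chunks[-1][-1] if chunks else None
        if previous is not None and cosine(
            bag_of_words(previous), bag_of_words(sentence)
        ) >= threshold:
            chunks[-1].append(sentence)
        else:
            chunks.append([sentence])
    return chunks


SENTENCES = [
    "Solar panels convert sunlight into electricity.",
    "Modern solar panels reach high efficiency in sunlight.",
    "Central banks raise interest rates to fight inflation.",
    "Higher interest rates slow inflation and borrowing.",
]

for chunk in topic_chunks(SENTENCES):
    print(chunk)
```

On this sample the energy sentences and the finance sentences end up in separate chunks, which is the behavior a fixed token count cannot guarantee.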

RAPTOR: Hierarchical Retrieval for Complex Documents

RAPTOR (Recursive Abstractive Processing for Tree Organized Retrieval) builds on chunked documents by organizing information in a hierarchical tree structure through recursive clustering and summarization. This approach outperforms traditional retrieval methods by preserving document relationships and providing multiple levels of abstraction.
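The recursive structure can be sketched in a few lines. This is a simplified illustration rather than RAPTOR's implementation: consecutive chunks are grouped in fixed-size batches instead of being clustered on embeddings, and the `summarize` callable is a placeholder for an LLM call. `Node`, `build_tree`, and `group_size` are names invented for the example.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Node:
    text: str
    level: int
    children: list["Node"] = field(default_factory=list)


def build_tree(
    chunks: list[str],
    summarize: Callable[[list[str]], str],
    group_size: int = 2,
) -> list[Node]:
    """Build a summary tree bottom-up and return its top layer (one root, or [] if empty)."""
    if group_size < 2:
        raise ValueError("group_size must be at least 2 so every layer shrinks")
    layer = [Node(text=chunk, level=0) for chunk in chunks]
    while len(layer) > 1:
        next_layer = []
        for start in range(0, len(layer), group_size):
            group = layer[start:start + group_size]
            # Placeholder grouping: a real system would cluster by embedding similarity.
            summary = summarize([node.text for node in group])
            next_layer.append(Node(text=summary, level=group[0].level + 1, children=group))
        layer = next_layer
    return layer


def join_summary(texts: list[str]) -> str:
    """Toy summarizer that just concatenates its inputs."""
    return " | ".join(texts)


roots = build_tree(["a", "b", "c", "d"], join_summary)
print(roots[0].level, roots[0].text)
```

Every layer above the leaves holds summaries of the layer below, which is what gives retrieval several levels of abstraction to choose from.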

Choosing the Right RAPTOR Method

RAPTOR offers two primary retrieval methods, each with distinct advantages for different use cases:

Tree Traversal Retrieval navigates the hierarchical structure sequentially, starting from root nodes and moving down through relevant branches. This method is ideal for:

Getting comprehensive overviews of multiple documents
Understanding the big picture before exploring details
Queries requiring progressive exploration from general to specific information
Press reviews or reports where logical flow between concepts is important

Collapsed Tree Retrieval flattens the tree structure, evaluating all nodes simultaneously regardless of their position in the hierarchy. This method excels at:

Complex multi-topic queries requiring information from various levels
Situations needing both summary-level and detailed information simultaneously
Multiple recall scenarios where information is scattered across documents
Syndicate press reviews with multiple intersecting topics

In the RAPTOR paper's experiments, the collapsed tree method consistently outperformed tree traversal, with the best results obtained when retrieving up to about 2,000 tokens, which corresponds to roughly the top 20 nodes. For most multi-document scenarios with diverse topics, the collapsed tree approach is generally superior.
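The difference between the two methods is easiest to see on a toy tree. The sketch below is an illustration under stated assumptions: relevance is a simple word-overlap score standing in for embedding similarity, the collapsed variant keeps a fixed number of nodes rather than a token budget, and the tree, query, and function names are invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    text: str
    children: list["Node"] = field(default_factory=list)


def score(query: str, text: str) -> float:
    """Toy relevance: Jaccard overlap of lowercase word sets."""
    q, t = set(query.lower().split()), set(text.lower().split())
    return len(q & t) / len(q | t) if q | t else 0.0


def tree_traversal(roots: list[Node], query: str, k: int = 1) -> list[Node]:
    """Keep the top-k nodes of each layer, then descend only into their children."""
    selected, layer = [], roots
    while layer:
        best = sorted(layer, key=lambda n: score(query, n.text), reverse=True)[:k]
        selected.extend(best)
        layer = [child for node in best for child in node.children]
    return selected


def collapsed_tree(roots: list[Node], query: str, k: int = 2) -> list[Node]:
    """Flatten every layer into one pool and keep the k best nodes overall."""
    flat, stack = [], list(roots)
    while stack:
        node = stack.pop()
        flat.append(node)
        stack.extend(node.children)
    return sorted(flat, key=lambda n: score(query, n.text), reverse=True)[:k]


ROOT = Node(
    "energy and finance overview",
    [
        Node(
            "solar energy summary",
            [Node("solar panels convert sunlight"), Node("wind turbines generate power")],
        ),
        Node(
            "finance summary",
            [Node("interest rates and inflation"), Node("stock markets and bonds")],
        ),
    ],
)
QUERY = "solar panels sunlight"

print([n.text for n in tree_traversal([ROOT], QUERY)])
print([n.text for n in collapsed_tree([ROOT], QUERY)])
```

Tree traversal always returns one path from the root down, while the collapsed tree is free to mix a summary node with a leaf if both score well.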

Creating Interactive Topic-Based Summaries

The final piece of an effective document retrieval system is interactive topic-based summarization, which allows users to explore document collections at varying levels of detail.

An interactive topic-based summary:

Presents topics hierarchically, showing their development throughout documents
Allows users to expand or collapse sections based on interest
Provides contextual placement of topics within the overall document structure
Uses visual cues like indentation, bullets, or font changes to indicate hierarchy

This approach transforms complex summarization results into comprehensible visual summaries that help users navigate large text collections more effectively.

Implementing a Complete Pipeline

A comprehensive implementation combines these techniques into a seamless pipeline:

Topic Identification: Use TopicNodeParser to segment documents into coherent topic-based chunks
Topic Extraction: Apply an LLM to identify and name the main topics in each chunk
Hierarchical Organization: Process these topic-based chunks with RAPTOR to create a multi-level representation
Retrieval Optimization: Select the appropriate RAPTOR method based on your specific use case
Interactive Summary: Create an interactive interface that allows users to explore topics at multiple levels of detail

This pipeline ensures that no topics are lost during processing while providing users with both high-level overviews and detailed information when needed.

Conclusion

The combination of topic-based chunking, RAPTOR’s hierarchical retrieval, and interactive summarization represents a powerful approach for handling complex, multi-topic document collections. By preserving the semantic structure of documents while enabling flexible retrieval at multiple levels of abstraction, these techniques significantly enhance our ability to extract meaningful information from large text collections.

As these technologies continue to evolve, we can expect even more sophisticated approaches to document understanding and retrieval that will further transform how we interact with textual information.

Loic Baconnier

Introducing Chonkie: The Lightweight RAG Chunking Library

Published on January 26, 2025 by loic

Meet Chonkie, a revolutionary new Python library that’s transforming the way we handle text chunking for RAG (Retrieval-Augmented Generation) applications. This lightweight powerhouse combines simplicity with performance, making it an essential tool for AI developers[3].

Key Features

Core Capabilities

Feature-rich implementation with comprehensive chunking methods
Lightning-fast performance with minimal resource requirements
Universal tokenizer support for maximum flexibility[3]

Chunking Methods
The library offers multiple specialized chunkers:

TokenChunker for fixed-size token splits
WordChunker for word-based divisions
SentenceChunker for sentence-level processing
RecursiveChunker for hierarchical text splitting
SemanticChunker for similarity-based chunking
SDPMChunker utilizing Semantic Double-Pass Merge[3]
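To show what the simplest of these strategies amounts to, here is a minimal sketch of fixed-size token splitting with optional overlap. It is not Chonkie's implementation: tokens are plain whitespace-separated words rather than tokenizer output, and `chunk_tokens` with its parameters is a hypothetical helper.

```python
def chunk_tokens(tokens: list[str], chunk_size: int, overlap: int = 0) -> list[list[str]]:
    """Split a token list into windows of `chunk_size`, sharing `overlap` tokens between neighbors."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reaches the end of the text
    return chunks


words = "a b c d e f g".split()
print(chunk_tokens(words, chunk_size=3, overlap=1))
```

The other chunkers in the list differ mainly in where they are allowed to cut: at word, sentence, or semantic boundaries instead of at a fixed count.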

Implementation

Getting started with Chonkie is straightforward. Here’s a basic example:

from chonkie import TokenChunker
from tokenizers import Tokenizer

# Initialize tokenizer
tokenizer = Tokenizer.from_pretrained("gpt2")

# Create chunker
chunker = TokenChunker(tokenizer)

# Process text
chunks = chunker("Woah! Chonkie, the chunking library is so cool!")

# Access results
for chunk in chunks:
    print(f"Chunk: {chunk.text}")
    print(f"Tokens: {chunk.token_count}")

Performance Metrics

The library demonstrates impressive performance:

Default installation size: 11.2MB
Token chunking speed: 33x faster than alternatives
Sentence chunking: 2x performance improvement
Semantic chunking: 2.5x speed increase[3]

Installation Options

Two installation methods are available:

# Minimal installation

pip install chonkie

# Full installation with all features

pip install chonkie[all]

Also Semantic Chunkers

Semantic Chunkers is a multi-modal chunking library for intelligent chunking of text, video, and audio. It makes your AI and data processing more efficient and accurate.

https://github.com/aurelio-labs/semantic-chunkers?tab=readme-ov-file

Sources
[1] activity https://github.com/chonkie-ai/chonkie/activity
[2] Activity · chonkie-ai/chonkie https://github.com/chonkie-ai/chonkie/activity
[3] chonkie/README.md at main · chonkie-ai/chonkie https://github.com/chonkie-ai/chonkie/blob/main/README.md

Evaluating Chunking Strategies for RAG: A Comprehensive Analysis

Published on January 26, 2025 by loic

Text chunking plays a crucial role in Retrieval-Augmented Generation (RAG) applications, serving as a fundamental pre-processing step that divides documents into manageable units of information[1]. A recent technical report explores the impact of different chunking strategies on retrieval performance, offering valuable insights for AI practitioners.

Why Chunking Matters

While modern Large Language Models (LLMs) can handle extensive context windows, processing entire documents or text corpora is often inefficient and can distract the model[1]. The ideal scenario is to process only the relevant tokens for each query, making effective chunking strategies essential for optimal performance.

Key Findings

Traditional vs. New Approaches
The study evaluated several chunking methods, including popular ones like RecursiveCharacterTextSplitter and innovative approaches such as ClusterSemanticChunker and LLMChunker[1]. The research found that:

Smaller chunks (around 200 tokens) generally performed better than larger ones
Reducing chunk overlap improved efficiency scores
The default settings of some popular chunking strategies led to suboptimal performance[1]

Novel Chunking Methods
The researchers introduced two new chunking strategies:

ClusterSemanticChunker: Uses embedding models to create chunks based on semantic similarity
LLMChunker: Leverages language models directly for text chunking[1]

Evaluation Framework

The study introduced a comprehensive evaluation framework that measures:

Token-level precision and recall
Intersection over Union (IoU) for assessing retrieval efficiency
Performance across various document types and domains[1]
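As a simplified sketch of the token-level metrics, the function below compares the set of token positions that are actually relevant to a query with the set of positions covered by the retrieved chunks. Treating both sides as sets is an assumption of this example (overlapping chunks are not counted twice), and `retrieval_metrics` is a hypothetical helper rather than code from the report.

```python
def retrieval_metrics(relevant: set[int], retrieved: set[int]) -> dict[str, float]:
    """Token-level precision, recall, and IoU between relevant and retrieved token positions."""
    overlap = len(relevant & retrieved)
    union = len(relevant | retrieved)
    return {
        "precision": overlap / len(retrieved) if retrieved else 0.0,
        "recall": overlap / len(relevant) if relevant else 0.0,
        "iou": overlap / union if union else 0.0,
    }


# 50 relevant tokens, all of them inside a single 200-token retrieved chunk.
relevant_tokens = set(range(100, 150))
retrieved_tokens = set(range(0, 200))
print(retrieval_metrics(relevant_tokens, retrieved_tokens))
```

In this example recall is perfect but precision and IoU are low, because most of the retrieved tokens are irrelevant; this is the kind of inefficiency that large chunks tend to produce.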

Practical Implications

For practitioners implementing RAG systems, the research suggests:

Default chunking settings may need optimization
Smaller chunk sizes often yield better results
Semantic-based chunking strategies show promise for improved performance[1]

Looking Forward

The study opens new avenues for research in chunking strategies and retrieval system optimization. The researchers have made their codebase available, encouraging further exploration and improvement of RAG systems[1].

For those interested in diving deeper into the technical details and implementation, the complete report, Evaluating Chunking Strategies for Retrieval, is available at https://research.trychroma.com/evaluating-chunking.

Sources
[1] evaluating-chunking https://research.trychroma.com/evaluating-chunking
[2] Evaluating Chunking Strategies for Retrieval https://research.trychroma.com/evaluating-chunking

Introducing Skrub: A Powerful Data Cleaning and Preprocessing Library

Published on January 26, 2025 by loic

Data scientists and analysts often spend significant time cleaning and preparing data before analysis. The Skrub library emerges as a powerful solution for streamlining this process, offering efficient tools for data wrangling and preprocessing.

Key Features

Data Type Handling
The library excels at managing various data types, from categorical variables to numerical data, with built-in support for handling null values and unique value identification[1].

Automated Processing
Skrub’s standout feature is its ability to process complex datasets with minimal manual intervention. The library can handle diverse data structures, including employee records, departmental information, and temporal data[1].

Statistical Analysis
The library provides comprehensive statistical analysis capabilities, offering:

Mean and standard deviation calculations
Median and IQR measurements
Range identification (minimum to maximum values)[1]
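For readers who want to see exactly what these summary statistics are, here is a minimal sketch using only Python's standard library. It does not use Skrub's API; `summarize_column` is a hypothetical helper, and the "inclusive" quantile method is one of several reasonable conventions.

```python
import statistics


def summarize_column(values: list[float]) -> dict[str, float]:
    """Summary statistics for one numeric column (needs at least two values)."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": statistics.fmean(values),
        "std": statistics.stdev(values),  # sample standard deviation
        "median": median,
        "iqr": q3 - q1,  # interquartile range
        "min": min(values),
        "max": max(values),
    }


print(summarize_column([1, 2, 3, 4, 5]))
```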

Real-World Application

To demonstrate Skrub’s capabilities, consider its handling of employee data:

Processes multiple data types simultaneously
Manages categorical data like department names and position titles
Handles temporal data such as hire dates
Provides detailed statistical summaries of numerical fields[1][2]

Performance Metrics

The library shows impressive efficiency in handling large datasets:

Processes thousands of unique entries
Maintains data integrity with zero null values in critical fields
Handles datasets with hundreds of unique categories[1]

Integration and Usage

Skrub seamlessly integrates with existing data science workflows, focusing on reducing preprocessing time and enhancing machine learning pipeline efficiency. Its intuitive interface makes it accessible for both beginners and experienced data scientists[2].

This powerful library represents a significant step forward in data preprocessing, living up to its motto: "Less wrangling, more machine learning"[2].

Sources
[1] https://skrub-data.org/stable
[2] https://skrub-data.org/stable/auto_examples/00_getting_started.html

Text Extract API: A Powerful Tool for Document Conversion and OCR

Published on January 23, 2025 by loic

Converting documents to structured formats like Markdown or JSON can be challenging, especially when dealing with PDFs, images, or Office files. The Text Extract API offers a robust solution to this common problem, providing high-accuracy conversion with advanced features.

Key Features

Document Processing
The API excels at converting various document types to Markdown or JSON, handling complex elements like tables, numbers, and mathematical formulas with remarkable accuracy. It utilizes a combination of PyTorch-based OCR (EasyOCR) and Ollama for processing.

Privacy-First Architecture
All processing occurs locally within your environment, with no external cloud dependencies. The system ships with Docker Compose configurations, ensuring your sensitive data never leaves your control.

Advanced Processing Capabilities

OCR enhancement through LLM technology
PII (Personally Identifiable Information) removal
Distributed queue processing with Celery
Redis-based caching for OCR results
Flexible storage options including local filesystem, Google Drive, and AWS S3
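
The caching item in the list above can be sketched as follows. This is a minimal illustration, assuming results are keyed by a hash of the file bytes plus the OCR strategy name; a plain dict stands in for Redis, and `cached_ocr`, `cache_key`, and `fake_ocr` are hypothetical names, not the project's actual functions.

```python
import hashlib
from typing import Callable, Dict

# A plain dict stands in for Redis here (assumption for illustration).
_cache: Dict[str, str] = {}


def cache_key(file_bytes: bytes, strategy: str) -> str:
    """Build a key from the document content and the OCR strategy name."""
    digest = hashlib.sha256(file_bytes).hexdigest()
    return f"ocr:{strategy}:{digest}"


def cached_ocr(file_bytes: bytes, strategy: str,
               run_ocr: Callable[[bytes], str]) -> str:
    """Return the cached text if present, otherwise run OCR and store it."""
    key = cache_key(file_bytes, strategy)
    if key not in _cache:
        _cache[key] = run_ocr(file_bytes)
    return _cache[key]


calls = []


def fake_ocr(data: bytes) -> str:
    # Hypothetical OCR backend that records how often it is invoked.
    calls.append(1)
    return data.decode("utf-8").upper()


first = cached_ocr(b"invoice 42", "demo", fake_ocr)
second = cached_ocr(b"invoice 42", "demo", fake_ocr)  # served from the cache
```

Keying on the content hash rather than the file name means a re-uploaded copy of the same document skips the expensive OCR step.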

Technical Implementation

Core Components
The system is built using FastAPI for the API layer and Celery for handling asynchronous tasks. This architecture ensures efficient processing of multiple documents simultaneously while maintaining responsiveness.
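
The submit-then-poll pattern behind this architecture can be sketched with the standard library alone. This is an illustration under stated assumptions: a thread pool stands in for Celery workers, and `submit`, `status`, and `extract_text` are hypothetical names rather than the project's endpoints or tasks.

```python
import uuid
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Dict

# The thread pool stands in for Celery workers (assumption for illustration).
executor = ThreadPoolExecutor(max_workers=2)
tasks: Dict[str, Future] = {}


def extract_text(document: bytes) -> str:
    """Hypothetical slow job; a real worker would run OCR here."""
    return document.decode("utf-8").strip()


def submit(document: bytes) -> str:
    """What an upload endpoint would do: enqueue and return a task id at once."""
    task_id = uuid.uuid4().hex
    tasks[task_id] = executor.submit(extract_text, document)
    return task_id


def status(task_id: str) -> dict:
    """What a polling endpoint would do: report progress without blocking."""
    future = tasks[task_id]
    if not future.done():
        return {"state": "PENDING"}
    return {"state": "SUCCESS", "text": future.result()}


task_id = submit(b"  hello world  ")
tasks[task_id].result(timeout=5)  # wait here only so the demo is deterministic
result = status(task_id)
```

Because the upload call returns as soon as the job is queued, the API layer stays responsive while long-running extractions proceed in the background.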

Storage Options
The API supports multiple storage strategies:

Local filesystem with customizable paths
Google Drive integration
Amazon S3 compatibility

Getting Started

Prerequisites

Docker and Docker Compose for containerized deployment
Ollama for LLM processing
Python environment for local development

Installation

git clone text-extract-api
cd text-extract-api
make install

Use Cases

Document Processing
Perfect for organizations needing to:

Convert legacy documents to modern formats
Extract structured data from PDFs
Process large volumes of documents efficiently
Remove sensitive information from documents
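
To make the last item concrete, here is a minimal pattern-based redaction sketch. It is not the project's PII removal method; the two regular expressions and the `redact` helper are assumptions chosen for illustration and cover only e-mail addresses and phone-like numbers.

```python
import re

# Deliberately simple patterns; real PII detection needs much broader coverage.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def redact(text: str) -> str:
    """Replace e-mail addresses and phone-like numbers with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)


sample = "Contact Jane at jane.doe@example.com or +33 1 23 45 67 89."
redacted = redact(sample)
print(redacted)
```

Names, addresses, and identifiers without a fixed shape are out of reach for patterns like these, so a sketch of this kind only covers the easiest cases.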

Integration Options

The API offers multiple integration methods:

RESTful API endpoints
Command-line interface
TypeScript client library
Custom storage profile configurations

Conclusion

Text Extract API represents a significant advancement in document processing technology, offering a self-hosted solution that combines accuracy with privacy. Whether you’re dealing with document conversion, data extraction, or PII removal, this tool provides the necessary capabilities while keeping your data secure and under your control.

Sources:

https://github.com/CatchTheTornado/text-extract-api

Top 6 Open-Source Frameworks for Evaluating Large Language Models

Published on January 23, 2025 by loic

Evaluating Large Language Models (LLMs) is essential for ensuring optimal performance in applications like chatbots and document summarization. Here are six powerful open-source frameworks that simplify the evaluation process:

Key Frameworks

DeepEval
A comprehensive suite offering 14+ evaluation metrics, including summarization accuracy and hallucination detection, with seamless Pytest integration.

Opik by Comet
A versatile platform for evaluating and monitoring LLMs, featuring interactive prompt experimentation and automated testing capabilities.

RAGAs
Specializes in evaluating Retrieval-Augmented Generation pipelines, with a focus on faithfulness and contextual precision metrics.
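
As a rough illustration of a contextual precision metric, the sketch below averages precision@k over the ranks that hold a relevant chunk. This is one common formulation written from scratch, not RAGAs code; the relevance flags are assumed to be given, and `context_precision` is a hypothetical name.

```python
from typing import Sequence


def context_precision(relevant: Sequence[bool]) -> float:
    """Mean of precision@k taken at each rank k that holds a relevant chunk."""
    hits = 0
    total = 0.0
    for k, is_relevant in enumerate(relevant, start=1):
        if is_relevant:
            hits += 1
            total += hits / k  # precision among the top-k retrieved chunks
    return total / hits if hits else 0.0


# Relevant chunks ranked first score higher than the same chunks ranked last.
good_ranking = context_precision([True, True, False])
poor_ranking = context_precision([False, True, True])
```

The score rewards retrievers that place useful context at the top of the list, which is what a generator with a limited context window benefits from.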

Deepchecks
A modular framework supporting various evaluation tasks, particularly excelling in bias detection and fairness assessment.

Phoenix
An AI observability platform that integrates with popular frameworks like LangChain and supports major LLM providers, offering comprehensive monitoring and benchmarking tools.

Evalverse
A unified evaluation framework that stands out with its Slack integration for no-code evaluations and collaborative features.

Implementation Benefits

These frameworks provide essential tools for ensuring reliable model performance, offering:

Automated testing capabilities
Comprehensive metrics for evaluation
Integration with popular development tools
Bias and fairness detection features
Hallucination detection capabilities

Source: https://hub.athina.ai/blogs/top-6-open-source-frameworks-for-evaluating-large-language-models/

TSMixer: Revolutionizing Time Series Forecasting with MLP Architecture

Published on December 10, 2024 by loic

TSMixer represents a significant advancement in deep learning forecasting models, offering a unique combination of lightweight design and high accuracy.

Here’s a comprehensive analysis of this innovative model:

Core Architecture
TSMixer employs a dual-mixing mechanism that processes data in two distinct ways:
• Time Mixing: Processes sequences across the temporal dimension using MLPs
• Feature Mixing: Handles data across the feature dimension
The model’s architecture includes multiple blocks of time-feature layers that can be stacked for enhanced performance, with a final temporal projection layer that maps sequences from context length to prediction length.
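
The two mixing steps and the final projection can be sketched in plain Python. This is a simplified illustration, assuming a single linear layer with ReLU and a residual connection per mixing step, and omitting normalization and dropout; the weights are hand-picked toy values, not trained parameters, and all function names are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of a (n x k) and b (k x m)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a: Matrix) -> Matrix:
    return [list(col) for col in zip(*a)]


def relu(a: Matrix) -> Matrix:
    return [[max(0.0, v) for v in row] for row in a]


def add(a: Matrix, b: Matrix) -> Matrix:
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def time_mixing(x: Matrix, w_time: Matrix) -> Matrix:
    """Mix along the time axis: each feature column sees all time steps."""
    return add(x, relu(matmul(w_time, x)))  # (L x L) @ (L x C)


def feature_mixing(x: Matrix, w_feat: Matrix) -> Matrix:
    """Mix along the feature axis: each time step sees all features."""
    return add(x, relu(matmul(x, transpose(w_feat))))  # (L x C) @ (C x C)


def temporal_projection(x: Matrix, w_proj: Matrix) -> Matrix:
    """Map the context length L to the prediction length H."""
    return matmul(w_proj, x)  # (H x L) @ (L x C)


# Toy input: context length L = 3, C = 2 features, horizon H = 1.
x = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
w_time = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # shift by one step
w_feat = [[0.0, 1.0], [1.0, 0.0]]                              # swap the features
w_proj = [[0.0, 0.0, 1.0]]                                     # keep the last step

mixed = feature_mixing(time_mixing(x, w_time), w_feat)
forecast = temporal_projection(mixed, w_proj)
```

Stacking several `time_mixing` and `feature_mixing` calls before the projection corresponds to the stacked blocks described above.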

Key Innovations
Normalization Techniques
TSMixer implements three sophisticated normalization approaches:
• Batch Normalization: Normalizes across batch and time dimensions
• Layer Normalization: Works across features and time dimensions
• Reversible Instance Normalization (RevIN): Handles temporal characteristics while preserving sequence properties
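
The reversible part of RevIN can be sketched as normalize, predict, then undo the normalization with the same statistics. This is a simplified illustration that leaves out the learnable affine parameters; `revin_forecast` and `repeat_last` are hypothetical names, and `repeat_last` merely stands in for the forecasting network.

```python
from statistics import fmean, pstdev
from typing import Callable, List


def revin_forecast(series: List[float],
                   model: Callable[[List[float]], List[float]],
                   eps: float = 1e-5) -> List[float]:
    """Normalize one series with its own statistics, predict, then undo it."""
    mean = fmean(series)
    std = pstdev(series) + eps  # eps guards against constant series
    normalized = [(v - mean) / std for v in series]
    prediction = model(normalized)
    # Reverse step: map the prediction back to the original scale.
    return [p * std + mean for p in prediction]


def repeat_last(window: List[float]) -> List[float]:
    # Hypothetical stand-in for the forecasting network.
    return [window[-1]] * 2


forecast = revin_forecast([100.0, 102.0, 104.0], repeat_last)
```

Since the statistics are computed per series, the model only ever sees inputs on a comparable scale, while the forecasts come back in the units of the original data.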

Model Variants
Three distinct versions exist, each serving different purposes:
1. TMix-Only: A simplified version without feature-mixing
2. Standard TSMixer: Includes cross-variate MLPs
3. TSMixer-Ext: The most comprehensive variant, incorporating auxiliary information

Performance Advantages
The model demonstrates several notable strengths:
• Superior Long-Term Forecasting: Effectively handles prediction horizons up to 720 data points
• Scalability: Shows consistent improvement with larger lookback windows
• Versatility: Particularly effective in retail forecasting and complex datasets with interdependencies

Practical Applications
TSMixer has proven particularly effective in:
• Retail forecasting
• Demand planning
• Financial markets
• Complex multivariate time series analysis
The model’s success in benchmarks, particularly on the M5 Walmart dataset, demonstrates its practical utility in real-world applications.