AI Coding Assistant Rules for Windsurf and Cursor

These optimized rules will transform how Windsurf and Cursor AI work with your Python backend and Next.js frontend projects. By adding these configurations, you’ll get more accurate, consistent code suggestions that follow best practices and avoid common AI-generated code issues.

How to Implement in Windsurf

  1. Option 1 – File Method:
  • Create a file named .windsurfrules in your project’s root directory
  • Copy and paste the entire code block below into this file
  • Save the file
  2. Option 2 – Settings Method:
  • Open Windsurf AI
  • Navigate to Settings > Set Workspace AI Rules > Edit Rules
  • Paste the entire code block below
  • Save your settings

How to Implement in Cursor

  1. Option 1 – File Method:
  • Create a file named .cursorrules in your project’s root directory
  • Copy and paste the same code block below (it works for both platforms)
  • Save the file
  2. Option 2 – Settings Method:
  • Open Cursor AI
  • Click on your profile picture in the bottom left
  • Select “Settings”
  • Navigate to the “AI” section
  • Find “Custom Instructions” and click “Edit”
  • Paste the entire code block below
  • Click “Save”

After Implementation

  • Restart your AI coding assistant or reload your workspace
  • The AI will now follow these comprehensive rules in all your coding sessions
  • You should immediately notice more relevant, project-specific code suggestions

These rules will significantly improve how your AI coding assistant understands your project requirements, coding standards, and technical preferences. You’ll get more relevant suggestions, fewer hallucinated functions, and code that better integrates with your existing codebase.

# Cursor Rules and Workflow Guide

## Core Configuration

- **Version**: `v5`
- **Project Type**: `web_application`
- **Code Style**: `clean_and_maintainable`
- **Environment Support**: `dev`, `test`, `prod`

---

## Test-Driven Development (TDD) Rules

1. Write tests **first** before any production code.
2. Run tests before implementing new functionality.
3. Write the **minimal code** required to pass tests.
4. Refactor only after all tests pass.
5. Do not start new tasks until all tests are passing.
6. Place all tests in a dedicated `/tests` directory.
7. Explain why tests will initially fail before implementation.
8. Propose an implementation strategy before writing code.
9. Check for existing functionality before creating new features.

---

## Code Quality Standards

- Maximum file length: **300 lines** (split into modules if needed).
- Follow existing patterns and project structure.
- Write modular, reusable, and maintainable code.
- Implement proper error handling mechanisms.
- Use type hints and annotations where applicable.
- Add explanatory comments when necessary.
- Avoid code duplication; reuse existing functionality if possible.
- Prefer simple solutions over complex ones.
- Keep the codebase clean and organized.

---

## AI Assistant Behavior

1. Explain understanding of requirements before proceeding with tasks.
2. Ask clarifying questions when requirements are ambiguous or unclear.
3. Provide complete, working solutions for each task or bug fix.
4. Focus only on relevant areas of the codebase for each task.
5. Debug failing tests with clear explanations and reasoning.

---

## Things to Avoid

- Never generate incomplete or partial solutions unless explicitly requested.
- Never invent nonexistent functions, APIs, or libraries.
- Never ignore explicit requirements or provided contexts.
- Never overcomplicate simple tasks or solutions.
- Never overwrite `.env` files without explicit confirmation.

---

## Implementation Guidelines

### General Rules

1. Always check for existing code before creating new functionality.
2. Avoid major changes to patterns unless explicitly instructed or necessary for bug fixes.

### Environment-Specific Rules

- Mock data should only be used for **tests**, never for development or production environments.


### File Management

- Avoid saving one-off scripts as permanent files; if a script will only be run once, keep it out of the codebase.


### Refactoring Rules

1. Refactor files exceeding **300 lines** to improve readability and maintainability.

### Bug Fixing Rules

1. Exhaust all options using existing patterns and technologies before introducing new ones.
2. If introducing a new pattern, remove outdated implementations to avoid duplicate logic.

---

## Workflow Best Practices

### Planning & Task Management

1. Use Markdown files (`PLANNING.md`, `TASK.md`) to manage project scope and tasks:
    - **PLANNING.md**: High-level vision, architecture, constraints, tech stack, tools, etc.
    - **TASK.md**: Tracks current tasks, backlog, milestones, and discovered issues during development.
2. Always update these files as the project progresses:
    - Mark completed tasks in `TASK.md`.
    - Add new sub-tasks or TODOs discovered during development.

### Code Structure & Modularity

1. Organize code into clearly separated modules grouped by feature or responsibility.
2. Use consistent naming conventions and file structures as described in `PLANNING.md`.
3. Never create a file longer than 500 lines of code; refactor into modules if necessary.

### Testing & Reliability

1. Create unit tests for all new features (functions, classes, routes, etc.).
2. Place all tests in a `/tests` folder mirroring the main app structure (see the pytest sketch after this list):
    - Include at least:
        - 1 test for expected use,
        - 1 edge case,
        - 1 failure case (to ensure proper error handling).
3. Mock external services (e.g., databases) in tests to avoid real-world interactions.
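
A minimal pytest sketch of the three required test types; the `divide` helper, module path, and file name are purely illustrative:

```python
# tests/test_math_utils.py
import pytest

from app.math_utils import divide  # hypothetical helper under test


def test_divide_expected_use():
    # Expected use: normal inputs produce the correct result.
    assert divide(10, 2) == 5


def test_divide_edge_case():
    # Edge case: dividing zero by a number is valid and returns zero.
    assert divide(0, 7) == 0


def test_divide_failure_case():
    # Failure case: dividing by zero should raise a clear error.
    with pytest.raises(ZeroDivisionError):
        divide(1, 0)
```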

### Documentation & Explainability

1. Update `README.md` when adding features, changing dependencies, or modifying setup steps.
2. Write docstrings for every function using Google-style formatting:

```python
def example(param1: int) -> str:
    """
    Brief summary of the function.

    Args:
        param1 (int): Description of the parameter.

    Returns:
        str: Description of the return value.
    """
```

3. Add inline comments explaining non-obvious logic and reasoning behind decisions.

---

## Verification Rule

I am an AI coding assistant that strictly adheres to Test-Driven Development (TDD) principles and high code quality standards. I will:

1. Write tests **first** before any production code.
2. Place all tests in a dedicated `/tests` directory.
3. Explain why tests initially fail before implementation begins.
4. Write minimal production code to pass the tests.
5. Refactor while maintaining passing tests at all times.
6. Enforce a maximum file length of **300 lines** per file (or 500 lines if specified).
7. Check for existing functionality before writing new code or features.
8. Explain my understanding of requirements before starting implementation work.
9. Ask clarifying questions when requirements are ambiguous or unclear.
10. Propose implementation strategies before writing any production code.
11. Debug failing tests with clear reasoning and explanations provided step-by-step.

---

## Server Management Best Practices

1. Restart servers after making changes to test them properly (only when necessary).
2. Kill all related servers from previous testing sessions to avoid conflicts.

---

## Modular Prompting Process After Initial Prompt

When interacting with the AI assistant:

1. Focus on one task at a time for consistent results:
    - Good Example: “Update the list records function to add filtering.”
    - Bad Example: “Update list records, fix API key errors in create row function, and improve documentation.”
2. Always test after implementing every feature to catch bugs early:
    - Create unit tests covering:
        - Successful scenarios,
        - Edge cases,
        - Failure cases.

---


These rules combine best practices for Python backend and Next.js frontend development with your specific coding patterns, workflow preferences, and technical stack requirements. The configuration instructs Windsurf AI to maintain clean, modular code that follows established patterns while avoiding common pitfalls in AI-assisted development.

Loic Baconnier

See also https://github.com/bacoco/awesome-cursorrules from
PatrickJS/awesome-cursorrules

Enhancing Document Retrieval with Topic-Based Chunking and RAPTOR

In the evolving landscape of information retrieval, combining topic-based chunking with hierarchical retrieval methods like RAPTOR represents a significant advancement for handling complex, multi-topic documents. This article explores how these techniques work together to create more effective document understanding and retrieval systems.

Topic-Based Chunking: Understanding Document Themes

Topic-based chunking segments text by identifying and grouping content related to specific topics, creating more semantically meaningful chunks than traditional fixed-size approaches. This method is particularly valuable for multi-topic documents where maintaining thematic coherence is essential.

The TopicNodeParser in LlamaIndex provides an implementation of this approach:

  1. It analyzes documents to identify natural topic boundaries
  2. It segments text based on semantic similarity rather than arbitrary token counts
  3. It preserves the contextual relationships between related content

After processing documents with TopicNodeParser, you can extract the main topics from each node using an LLM. This creates a comprehensive topic map of your document collection, which serves as the foundation for more sophisticated retrieval.
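
As a rough sketch, this step might look as follows in LlamaIndex. The import path and constructor arguments for `TopicNodeParser` vary between LlamaIndex versions, so treat them as assumptions to verify against the current documentation:

```python
from llama_index.core import Document
from llama_index.core.node_parser import TopicNodeParser  # import path may differ by version
from llama_index.llms.openai import OpenAI

llm = OpenAI(model="gpt-4o-mini")

# Split documents at natural topic boundaries rather than fixed token counts.
parser = TopicNodeParser.from_defaults(
    llm=llm,                  # model used to judge where one topic ends and the next begins
    similarity_method="llm",  # assumed option: LLM-based boundary detection
    window_size=3,            # assumed option: sentences compared per boundary decision
)

documents = [Document(text=open("report.txt", encoding="utf-8").read())]
nodes = parser.get_nodes_from_documents(documents)

# Build a topic map: ask the LLM to name the main topic of each node.
topic_map = {}
for node in nodes:
    topic = llm.complete(
        f"Name the main topic of this passage in a few words:\n{node.text}"
    ).text.strip()
    topic_map.setdefault(topic, []).append(node)

for topic, topic_nodes in topic_map.items():
    print(f"{topic}: {len(topic_nodes)} chunk(s)")
```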

RAPTOR: Hierarchical Retrieval for Complex Documents

RAPTOR (Recursive Abstractive Processing for Tree-Organized Retrieval) builds on chunked documents by organizing information in a hierarchical tree structure through recursive clustering and summarization. This approach outperforms traditional retrieval methods by preserving document relationships and providing multiple levels of abstraction.
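
As an illustration, the LlamaIndex RAPTOR pack can build such a tree from chunked documents. The package name and constructor arguments below are assumptions to check against the pack’s documentation:

```python
from llama_index.core import SimpleDirectoryReader
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai import OpenAI
from llama_index.packs.raptor import RaptorPack  # assumed: pip install llama-index-packs-raptor

documents = SimpleDirectoryReader("./reports").load_data()

# Recursively cluster chunks and summarize each cluster to build the RAPTOR tree.
raptor = RaptorPack(
    documents,
    embed_model=OpenAIEmbedding(model="text-embedding-3-small"),  # embeds chunks and summaries
    llm=OpenAI(model="gpt-4o-mini"),                              # writes the cluster summaries
)
```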

Choosing the Right RAPTOR Method

RAPTOR offers two primary retrieval methods, each with distinct advantages for different use cases:

Tree Traversal Retrieval navigates the hierarchical structure sequentially, starting from root nodes and moving down through relevant branches. This method is ideal for:

  • Getting comprehensive overviews of multiple documents
  • Understanding the big picture before exploring details
  • Queries requiring progressive exploration from general to specific information
  • Press reviews or reports where logical flow between concepts is important

Collapsed Tree Retrieval flattens the tree structure, evaluating all nodes simultaneously regardless of their position in the hierarchy. This method excels at:

  • Complex multi-topic queries requiring information from various levels
  • Situations needing both summary-level and detailed information simultaneously
  • Multiple recall scenarios where information is scattered across documents
  • Syndicated press reviews with multiple intersecting topics

Research has shown that the collapsed tree method consistently outperforms traditional top-k retrieval, achieving optimal results when searching for the top 20 nodes containing up to 2,000 tokens. For most multi-document scenarios with diverse topics, the collapsed tree approach is generally superior.
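
Continuing the sketch above, the retrieval mode is chosen at query time; the `mode` values shown are assumptions based on the pack’s documented options:

```python
# Collapsed tree: score every node in the tree at once; suited to multi-topic
# queries that need summary-level and detailed information together.
collapsed_nodes = raptor.run(
    "Compare the main risks discussed across all reports",
    mode="collapsed",
)

# Tree traversal: start at the root summaries and walk down relevant branches;
# suited to progressive, general-to-specific exploration.
traversal_nodes = raptor.run(
    "Summarize the reports, then drill into the pricing details",
    mode="tree_traversal",
)
```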

Creating Interactive Topic-Based Summaries

The final piece of an effective document retrieval system is interactive topic-based summarization, which allows users to explore document collections at varying levels of detail.

An interactive topic-based summary:

  • Presents topics hierarchically, showing their development throughout documents
  • Allows users to expand or collapse sections based on interest
  • Provides contextual placement of topics within the overall document structure
  • Uses visual cues like indentation, bullets, or font changes to indicate hierarchy

This approach transforms complex summarization results into comprehensible visual summaries that help users navigate large text collections more effectively.
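
A plain-Python approximation of such a summary is an indented outline that can be rendered expanded or collapsed; the topic tree below is purely illustrative:

```python
# Hypothetical topic tree assembled from the earlier chunking and summarization steps.
topic_tree = {
    "title": "Quarterly report",
    "summary": "Overall performance and outlook.",
    "children": [
        {"title": "Revenue", "summary": "Growth driven by new markets.", "children": []},
        {"title": "Risks", "summary": "Supply chain and currency exposure.", "children": []},
    ],
}

def render(node, depth=0, expand=True):
    """Print a topic node as an indented outline; hide children when expand=False."""
    print("  " * depth + f"- {node['title']}: {node['summary']}")
    if expand:
        for child in node.get("children", []):
            render(child, depth + 1, expand)

render(topic_tree)                 # fully expanded view
render(topic_tree, expand=False)   # collapsed view: top-level overview only
```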

Implementing a Complete Pipeline

A comprehensive implementation combines these techniques into a seamless pipeline:

  1. Topic Identification: Use TopicNodeParser to segment documents into coherent topic-based chunks
  2. Topic Extraction: Apply an LLM to identify and name the main topics in each chunk
  3. Hierarchical Organization: Process these topic-based chunks with RAPTOR to create a multi-level representation
  4. Retrieval Optimization: Select the appropriate RAPTOR method based on your specific use case
  5. Interactive Summary: Create an interactive interface that allows users to explore topics at multiple levels of detail

This pipeline ensures that no topics are lost during processing while providing users with both high-level overviews and detailed information when needed.
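
In code, the pipeline reduces to chaining the components sketched earlier; every name below is a placeholder for the corresponding object in your own system:

```python
def build_retrieval_system(documents, parser, llm, build_raptor_index):
    """Chain topic chunking, topic extraction, and RAPTOR indexing.

    `parser`, `llm`, and `build_raptor_index` stand in for the TopicNodeParser,
    the topic-labelling LLM, and a function that builds a RAPTOR index.
    """
    # 1. Topic identification: split documents at natural topic boundaries.
    nodes = parser.get_nodes_from_documents(documents)

    # 2. Topic extraction: label every chunk so no topic is silently dropped.
    for node in nodes:
        node.metadata["topic"] = llm.complete(
            f"Name the main topic of this passage in a few words:\n{node.text}"
        ).text.strip()

    # 3. Hierarchical organization: build the multi-level RAPTOR representation.
    # 4.-5. Retrieval mode selection and the interactive summary are then served
    #       from this index.
    return build_raptor_index(nodes)
```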

Conclusion

The combination of topic-based chunking, RAPTOR’s hierarchical retrieval, and interactive summarization represents a powerful approach for handling complex, multi-topic document collections. By preserving the semantic structure of documents while enabling flexible retrieval at multiple levels of abstraction, these techniques significantly enhance our ability to extract meaningful information from large text collections.

As these technologies continue to evolve, we can expect even more sophisticated approaches to document understanding and retrieval that will further transform how we interact with textual information.

Loic Baconnier