- SFR-Judge is a family of three judge models (8B, 12B, and 70B parameters) developed by Salesforce AI Research.
- These models are built on Meta Llama 3 and Mistral NeMo and are designed to evaluate outputs from large language models (LLMs).
- SFR-Judge can perform three types of evaluation tasks (a minimal pairwise-judging sketch follows this list):
- Pairwise comparisons
- Single ratings on a Likert scale
- Binary classification
- The models are trained to provide explanations for their judgments, enhancing transparency.
- SFR-Judge outperformed other open-source and proprietary judge models in 10 out of 13 benchmarks.
- The models demonstrated lower bias and higher consistency compared to competitive judge models.
- SFR-Judge models ranked first, second, and fourth on the RewardBench leaderboard for generative judge models.
- These models are the first to achieve over 90% accuracy on RewardBench.
- SFR-Judge can be used for auto-evaluation and as reward models for reinforcement learning from human feedback (RLHF).
- Downstream models improved with SFR-Judge showed better performance on the AlpacaEval-2 instruction following benchmark.
- The research paper is available, and the code is coming soon, for further exploration.
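Below is a minimal sketch of how a generative judge model such as SFR-Judge could be applied to a pairwise comparison. The model ID and prompt template are placeholders for illustration only (the official checkpoints and prompt format should be taken from the Salesforce release), so treat this as the general judging pattern rather than the exact API.

```python
# Minimal pairwise-judging sketch with a generative judge model.
# The model ID and prompt template below are placeholders, not the
# official SFR-Judge artifacts; adapt them to the released checkpoints.
from transformers import pipeline

judge = pipeline(
    "text-generation",
    model="Salesforce/SFR-Judge-8B",  # hypothetical ID for illustration
    device_map="auto",
)

def judge_pairwise(instruction: str, response_a: str, response_b: str) -> str:
    """Ask the judge which response better follows the instruction, with an explanation."""
    prompt = (
        "You are an impartial judge. Compare the two responses to the instruction, "
        "explain your reasoning, then give a final verdict of 'A' or 'B'.\n\n"
        f"Instruction: {instruction}\n\n"
        f"Response A: {response_a}\n\n"
        f"Response B: {response_b}\n\n"
        "Judgment:"
    )
    out = judge(prompt, max_new_tokens=256, do_sample=False, return_full_text=False)
    return out[0]["generated_text"]

print(judge_pairwise(
    "Summarize RLHF in one sentence.",
    "RLHF fine-tunes a model using human preference signals as a reward.",
    "RLHF is a type of database index.",
))
```

The same pattern extends to single Likert-scale ratings and binary classification by changing the instructions and the expected verdict format.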
Mastering the Art of Prompt Engineering: 20 Essential Tips
Prompt engineering has become a crucial skill in the era of advanced language models. Whether you’re a developer, researcher, or enthusiast working with AI, understanding how to effectively communicate with these models can significantly enhance your results. Here are 20 key tips to improve your prompt engineering skills:
Communication and Clarity
- Communicate clearly and concisely: Precision in your language is paramount when interacting with AI models.
- Give specific instructions: Provide clear, concise directions that are tailored to your particular task.
- Anticipate misinterpretations: Consider how the model might misunderstand your prompts and preemptively address potential issues.
Experimentation and Learning
- Iterate and experiment: Don’t be afraid to try different approaches with your prompts.
- Learn from mistakes: Carefully analyze the model’s outputs to understand where improvements can be made.
- Push boundaries: Challenge your assumptions about the model’s capabilities.
Understanding the Model
- Think of it as a knowledgeable temp: Imagine the model as a highly informed temporary worker who needs specific guidance.
- Provide context: Don’t hesitate to give more background information than you think is necessary.
- Avoid forcing personas: Let the model’s natural capabilities shine instead of trying to make it play a specific role.
Effective Prompting Techniques
- Use illustrative examples: Provide examples to clarify your task, but be mindful not to overwhelm the model.
- Diversify your examples: Use instances that differ from the data the model will actually work with.
- Mind your language: While good grammar and punctuation are helpful, they’re not strictly necessary for the model to understand you.
- Consider the model as an imitator: Remember that the AI will attempt to mimic your writing style.
- Leverage other models: Use different AI models to help craft your prompts.
Respecting the Model’s Nature
- Treat it with respect: Approach the model as if it were an intelligent and capable entity.
- Simulate the model’s perspective: Try to put yourself in the AI’s position to better understand its responses.
- Be creative with concepts: Don’t shy away from introducing new ideas to convey your intentions to the model.
- Explain as if to a layperson: Frame your prompts as if you’re explaining the topic to an educated person unfamiliar with the subject.
- Provide an “out”: Give the model a clear way to respond when it encounters unexpected inputs.
- Externalize your thinking: Try to transfer your thought process into the prompt for the model to follow.
By incorporating these tips into your prompt engineering practice, you can significantly improve your interactions with AI language models. Remember that the effectiveness of these strategies may vary depending on the specific task and model you’re working with. Continuous experimentation and refinement of your approach will lead to the best results in prompt engineering.
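To make a few of these tips concrete (specific instructions, an illustrative example, and giving the model an “out”), here is a minimal prompt template; the task, categories, and wording are invented purely for illustration.

```python
# Minimal prompt template illustrating several tips above: specific
# instructions, one clarifying example, and an explicit "out" for
# unexpected inputs. The task and categories are invented for illustration.
def build_prompt(ticket_text: str) -> str:
    return f"""You are a support assistant. Classify the ticket below into exactly one
category: "billing", "bug", or "feature_request".

Example:
Ticket: "I was charged twice this month."
Category: billing

If the ticket does not clearly fit any category, answer "unknown" instead of guessing.

Ticket: "{ticket_text}"
Category:"""


print(build_prompt("The export button crashes the app on Safari."))
```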
VisionTS: Revolutionizing Time Series Forecasting with Image-Based Models
Challenges in Time Series Forecasting Models
The biggest challenge in building a pre-trained model for time series is finding high-quality and diverse data. This difficulty is at the core of developing effective forecasting models.
Main Approaches
Two primary approaches are used to build a foundation forecasting model:
- Adapting an LLM: This method involves repurposing a pre-trained language model like GPT-4 or Llama by adapting it to time series tasks.
- Building from Scratch: This approach involves creating a vast time series dataset to pre-train a model, hoping it will generalize to new data.
Results and Limitations
The second approach has proven more effective, as evidenced by models such as MOIRAI, TimesFM, and TTM. However, these models follow scaling laws, and their performance heavily depends on the availability of extensive time series data, which brings us back to the initial challenge.
Innovation: Using Images
Faced with these limitations, an innovative approach was explored: using a different modality, namely images. Although counterintuitive, this method has produced groundbreaking results, opening new perspectives in the field of time series forecasting.
VisionTS: A New Paradigm
VisionTS represents a novel approach that leverages the power of image-based models for time series forecasting. This method transforms time series data into images, allowing the use of advanced computer vision techniques to predict future values.
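As a rough illustration of the series-to-image idea (a simplification, not the exact VisionTS preprocessing), a univariate series can be segmented by its seasonal period and the segments stacked into a grayscale image that a pre-trained vision model can consume:

```python
# Rough illustration of rendering a 1-D series as a 2-D grayscale "image"
# by stacking one seasonal period per row. A simplified sketch of the
# general idea, not the exact VisionTS preprocessing pipeline.
import numpy as np

def series_to_image(series: np.ndarray, period: int) -> np.ndarray:
    """Reshape a 1-D series into an (n_periods, period) matrix scaled to [0, 255]."""
    n_periods = len(series) // period
    trimmed = series[: n_periods * period]
    img = trimmed.reshape(n_periods, period).astype(float)
    img = (img - img.min()) / (img.max() - img.min() + 1e-8)  # normalize to [0, 1]
    return (img * 255).astype(np.uint8)

t = np.arange(24 * 30)  # 30 days of hourly data
demo = 10 + np.sin(2 * np.pi * t / 24) + 0.1 * np.random.randn(len(t))
image = series_to_image(demo, period=24)
print(image.shape)  # (30, 24): one day per row
```

Forecasting can then be framed as an image-completion problem: the vision model reconstructs the missing portion of the image, which is mapped back to future values.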
Advantages of Image-Based Forecasting
Using images for time series forecasting offers several advantages:
- Access to a vast pool of pre-trained image models
- Ability to capture complex patterns and relationships in data
- Potential for transfer learning from diverse image datasets
Future Implications
The success of VisionTS suggests a promising direction for future research in time series forecasting. It demonstrates the potential of cross-modal learning and opens up new possibilities for improving prediction accuracy and generalization in various domains.
Paper: https://arxiv.org/pdf/2408.17253
Code:
Time-MoE: Time Series
Billion-Scale Time Series Foundation Models with Mixture of Experts, from Princeton
Time-MoE is a scalable and unified architecture designed for pre-training large, capable forecasting foundation models while reducing inference costs. It addresses the limitations of current pre-trained time series models, which are often limited in scale and operate at high cost.
Key Features
- Sparse Mixture-of-Experts (MoE) Design: Enhances computational efficiency by activating only a subset of expert networks for each prediction (a minimal routing sketch follows this list).
- Scalability: Allows for effective scaling without a corresponding increase in inference costs.
- Flexibility: Supports flexible forecasting horizons with varying input context lengths.
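To make the sparse-routing idea concrete, here is a minimal sketch of top-k expert routing in a feed-forward MoE layer. It is a generic, simplified illustration, not the actual Time-MoE implementation.

```python
# Minimal sketch of sparse top-k expert routing in an MoE feed-forward layer.
# Generic and simplified for clarity; not the actual Time-MoE implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoE(nn.Module):
    def __init__(self, d_model: int, n_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        ])
        self.gate = nn.Linear(d_model, n_experts)
        self.top_k = top_k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, d_model); each token is routed to its top-k experts only
        scores = F.softmax(self.gate(x), dim=-1)               # (tokens, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)         # keep k experts per token
        weights = weights / weights.sum(dim=-1, keepdim=True)  # renormalize kept weights
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e
                if mask.any():
                    out[mask] += weights[mask, k].unsqueeze(-1) * expert(x[mask])
        return out

layer = SparseMoE(d_model=64)
print(layer(torch.randn(10, 64)).shape)  # torch.Size([10, 64])
```

Because only the top-k experts run for each token, compute per prediction grows with k rather than with the total number of experts, which is what lets parameter counts scale without a matching increase in inference cost.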
Architecture
- Decoder-only transformer models
- Operates in an autoregressive manner
- Family of models scaling up to 2.4 billion parameters
Training Data
- Pre-trained on the Time-300B dataset
- Spans over 9 domains
- Encompasses over 300 billion time points
Performance
- Achieves significantly improved forecasting precision
- Outperforms dense models with equivalent computation budgets or activated parameters
Applications
Time-MoE is positioned as a state-of-the-art solution for real-world time series forecasting challenges, offering superior capability, efficiency, and flexibility.
Paper: https://arxiv.org/pdf/2409.16040
Code:
Scalable. Interactive. Interpretable Data Science
Safe, interpretable, trustworthy AI through interactive, intelligent visualization, with applications in adversarial machine learning (protecting AI from harm, and from doing harm), scalable discovery of deep learning models, and inclusive AI for everyone.
A Must.
Transformer Explainer
Learn How Transformer Models Work with Interactive Visualization
https://poloclub.github.io/transformer-explainer/
Diffusion Explainer
Learn how Stable Diffusion transforms your text prompt into an image.
Open Source LLM Tools
If you are looking for useful open-source LLM tools, this is a really useful resource.
It includes different categories like tutorials, AI engineering, and applications, among others. You can also see the number of GitHub stars for each project.
User-Centric RAG
Transforming RAG with LlamaIndex Multi-Agent System and Qdrant
Retrieval-Augmented Generation (RAG) models have evolved significantly over time. Initially, traditional RAG systems faced numerous limitations. However, with advancements in the field, we have seen the emergence of more sophisticated RAG applications. Techniques such as Self-RAG, Hybrid Search RAG, experimenting with different prompting and chunking strategies, and the evolution of Agentic RAG have addressed many of the initial limitations.
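As a reference point, here is a minimal single-engine LlamaIndex + Qdrant retrieval setup; the multi-agent orchestration described in the article is built on top of engines like this. It assumes a recent llama-index release with the Qdrant integration installed (llama-index, llama-index-vector-stores-qdrant, qdrant-client) and an LLM/embedding backend configured (OpenAI by default).

```python
# Minimal LlamaIndex + Qdrant retrieval setup (single query engine).
# Assumes a recent llama-index release with the Qdrant integration installed
# and an LLM/embedding backend configured (OpenAI by default). The multi-agent
# layer described above would orchestrate several engines like this one.
import qdrant_client
from llama_index.core import SimpleDirectoryReader, StorageContext, VectorStoreIndex
from llama_index.vector_stores.qdrant import QdrantVectorStore

client = qdrant_client.QdrantClient(location=":memory:")   # in-memory Qdrant for the demo
vector_store = QdrantVectorStore(client=client, collection_name="docs")
storage_context = StorageContext.from_defaults(vector_store=vector_store)

documents = SimpleDirectoryReader("./data").load_data()    # any local folder of files
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)

query_engine = index.as_query_engine(similarity_top_k=3)
print(query_engine.query("What are the limitations of traditional RAG?"))
```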
Graph entity for RAG
High-efficiency production-scale entity relationship extraction
Outperforming Claude 3.5 Sonnet with Phi-3-mini-4k for graph entity relationship extraction tasks
https://huggingface.co/spaces/EmergentMethods/Phi-3-mini-instruct-graph
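A rough sketch of the task, using the base microsoft/Phi-3-mini-4k-instruct checkpoint and an ad-hoc JSON prompt (the Space above relies on a fine-tuned variant with its own prompt format, so this is only an approximation of the setup):

```python
# Sketch of prompting a small instruct model to extract an entity-relationship
# graph as JSON. Uses the base Phi-3-mini checkpoint and an ad-hoc prompt; the
# Hugging Face Space above uses a fine-tuned variant and its own format.
from transformers import pipeline

extractor = pipeline(
    "text-generation",
    model="microsoft/Phi-3-mini-4k-instruct",
    device_map="auto",
    trust_remote_code=True,
)

text = "Salesforce AI Research released SFR-Judge, a family of judge models built on Llama 3."
prompt = (
    "Extract entities and relationships from the text. Answer with JSON containing "
    '"nodes" (a list of {"id", "type"}) and "edges" (a list of '
    '{"source", "relation", "target"}).\n\n'
    f"Text: {text}\nJSON:"
)
result = extractor(prompt, max_new_tokens=256, do_sample=False, return_full_text=False)
print(result[0]["generated_text"])
```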
PDF-Extract-Kit
PDF-Extract-Kit is a comprehensive toolkit for high-quality PDF content extraction, including layout detection, formula detection, formula recognition, and OCR.
PDF documents contain a wealth of knowledge, yet extracting high-quality content from PDFs is not an easy task. To address this, we have broken down the task of PDF content extraction into several components:
- Layout Detection: Using the LayoutLMv3 model for region detection, such as images, tables, titles, and text;
- Formula Detection: Using YOLOv8 for detecting formulas, including inline formulas and isolated formulas;
- Formula Recognition: Using UniMERNet for formula recognition;
- Table Recognition: Using StructEqTable for table recognition;
- Optical Character Recognition: Using PaddleOCR for text recognition (a standalone PaddleOCR example follows the links below).
https://github.com/opendatalab/PDF-Extract-Kit
https://www.perplexity.ai/search/look-at-this-github-https-gith-8ZVtYO.2SA6_q5Vg.VXy.g
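As a concrete example of a single stage, the OCR step can be reproduced with PaddleOCR directly; this shows the underlying library on a rendered page image, not PDF-Extract-Kit's own API (the file name is a placeholder).

```python
# Standalone example of the OCR stage using PaddleOCR directly.
# This illustrates the underlying library, not PDF-Extract-Kit's own API;
# "page.png" is a placeholder for a rendered PDF page image.
from paddleocr import PaddleOCR

ocr = PaddleOCR(use_angle_cls=True, lang="en")   # loads detection + recognition models
result = ocr.ocr("page.png", cls=True)           # run text detection and recognition

for line in result[0]:                           # each line: [bounding box, (text, score)]
    box, (text, score) = line
    print(f"{score:.2f}  {text}")
```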
BERTopic
BERTopic generates document embeddings with pre-trained transformer-based language models, clusters these embeddings, and finally generates topic representations with the class-based TF-IDF procedure.
https://ritvik19.medium.com/papers-explained-193-bertopic-f9aec10cd5a6
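A minimal usage sketch with the bertopic package, using scikit-learn's 20 newsgroups corpus as stand-in data:

```python
# Minimal BERTopic usage: embed documents, cluster the embeddings, and build
# topic representations with class-based TF-IDF. Uses 20 newsgroups as demo data.
from sklearn.datasets import fetch_20newsgroups
from bertopic import BERTopic

docs = fetch_20newsgroups(subset="train", remove=("headers", "footers", "quotes")).data

topic_model = BERTopic(language="english", verbose=True)
topics, probs = topic_model.fit_transform(docs)   # embeddings -> clusters -> c-TF-IDF

print(topic_model.get_topic_info().head())        # one row per topic, with top words
print(topic_model.get_topic(0))                   # top terms of one topic
```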