- Large Language Models (LLMs) and prompt-based heuristics are increasingly used as off-the-shelf solutions for a variety of NLP problems.
- LLM-based few-shot methods have shown promise but lag in Named Entity Recognition (NER) compared to other methods.
- PromptNER is introduced as a new algorithm for few-shot and cross-domain NER.
- PromptNER needs entity definitions and few-shot examples for a new NER task.
- PromptNER uses an LLM to generate candidate entities along with explanations of their compatibility with the provided entity type definitions.
- PromptNER achieves state-of-the-art few-shot NER performance on the CoNLL, GENIA, and FewNERD datasets.
- It also outperforms previous methods in cross-domain NER, setting new records on 3 out of 5 CrossNER domains with an average F1 gain of 3%.
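To make the idea concrete, here is a hypothetical sketch of a definition-plus-few-shot prompt in the spirit of the paper; the entity types, wording, and the `build_prompt` helper are illustrative, not the authors' exact template.

```python
# Hypothetical PromptNER-style prompt builder: entity definitions, a worked
# example with candidate entities and explanations, then the new sentence.

entity_definitions = (
    "PERSON: names of people. ORG: companies and institutions. LOC: places."
)

few_shot_example = (
    "Sentence: Tim Cook visited Berlin.\n"
    "Candidates:\n"
    "1. Tim Cook | True | matches PERSON (a named individual)\n"
    "2. Berlin | True | matches LOC (a city)\n"
    "3. visited | False | a verb, fits no entity definition\n"
)

def build_prompt(sentence: str) -> str:
    """Compose definitions, the worked example, and the new sentence."""
    return (
        f"Entity definitions:\n{entity_definitions}\n\n"
        f"Example:\n{few_shot_example}\n"
        f"Sentence: {sentence}\nCandidates:"
    )

print(build_prompt("Angela Merkel spoke in Paris."))
```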
The History of Open-Source LLMs: Better Base Models (Part Two)
https://cameronrwolfe.substack.com/p/the-history-of-open-source-llms-better
- Value of Open-source LLM Research: Aims to democratize influential technology; despite initial struggles and criticism, open-source LLMs gained popularity and significance.
- Early Challenges: Initial open-source LLMs performed poorly and faced criticism, posing difficulties for advancement.
- Transformative Research Line: Focuses on enhancing open-source LLMs, leading to high-performing pre-trained models accessible to all.
- Significance of High-Performing Models: Creation of powerful, cost-effective pre-trained LLMs revolutionized research accessibility.
- Series Overview: Part two of a three-part series on open-source LLM history. The first part explored initial open-source LLM attempts.
- Study Focus: This overview delves into the most popular open-source base models, emphasizing pre-trained models not yet fine-tuned or aligned.
- Future Exploration: The next installment will discuss fine-tuning and alignment of these models for diverse practical applications.
Practical Prompt Engineering
https://cameronrwolfe.substack.com/p/practical-prompt-engineering-part
- Prompt engineering: An empirical science focused on optimizing LLM (Large Language Model) performance through various prompting strategies.
- Aims to understand prompting mechanics and employs techniques to enhance LLM capabilities.
- Zero/few-shot learning: A fundamental technique where LLMs perform tasks with minimal or no training examples, showcasing their remarkable adaptability.
- Instruction prompting: Another vital technique involving explicit instructions in prompts to guide LLM behavior.
- The overview imparts practical insights and strategies for effective prompt engineering and LLM utilization.
- Provides actionable tricks and takeaways for prompt engineers and LLM practitioners to enhance their effectiveness.
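To make the two techniques concrete, below is a minimal sketch contrasting zero-shot, few-shot, and instruction-style prompts for a toy sentiment task; the prompt wording and the `send()` placeholder are illustrative and not tied to any specific API.

```python
# Illustrative prompt strings only; send() is a hypothetical stand-in for
# whatever LLM API you use, not a real library call.

zero_shot = (
    "Classify the sentiment of this review as positive or negative.\n"
    "Review: The battery died after two days.\nSentiment:"
)

few_shot = (
    "Review: Loved the picture quality.\nSentiment: positive\n"
    "Review: It broke in a week.\nSentiment: negative\n"
    "Review: The battery died after two days.\nSentiment:"
)

instruction = (
    "You are a sentiment classifier. Answer with exactly one word, "
    "'positive' or 'negative'.\n"
    "Review: The battery died after two days."
)

def send(prompt: str) -> str:
    """Hypothetical placeholder for an LLM API call."""
    raise NotImplementedError

# for p in (zero_shot, few_shot, instruction):
#     print(send(p))
```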
The History of Open-Source LLMs: Early Days (Part One)
https://cameronrwolfe.substack.com/p/the-history-of-open-source-llms-early
- Language modeling research traces back to models like GPT, GPT-2, and pre-transformer methods such as ULMFit.
- GPT-3’s proposal marked the initial rise in popularity by showcasing impressive few-shot learning through self-supervised pre-training and in-context learning.
- The recognition of GPT-3 led to the creation of various large language models (LLMs), including InstructGPT and ChatGPT, sparking widespread interest in generative AI.
- Early LLMs often remained closed source, limiting researchers’ understanding and improvement of their workings.
- Open-source variants of popular language models began to emerge gradually, although they initially lagged behind proprietary models in performance.
- These early open-source models laid the groundwork for increased transparency in LLM research and inspired the development of more potent subsequent models like Falcon and LLaMA-2.
- The overview is part of a three-part series that delves into the history of open-source language models, exploring their beginnings, recent developments, and the application of imitation and alignment techniques to enhance their performance.
Cleaning labels: Cleanlab
cleanlab automatically detects problems in an ML dataset. This data-centric AI package facilitates machine learning with messy, real-world data by providing clean labels for robust training and flagging errors in your data.

Paper: https://arxiv.org/pdf/1911.00068.pdf
Code: https://github.com/cleanlab/cleanlab
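A minimal sketch of the typical label-issue workflow, assuming cleanlab's `find_label_issues` applied to out-of-sample predicted probabilities from any scikit-learn style classifier; the dataset and model below are arbitrary.

```python
# Flag likely label errors from out-of-sample predicted probabilities.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict
from cleanlab.filter import find_label_issues

X, y = load_iris(return_X_y=True)
pred_probs = cross_val_predict(
    LogisticRegression(max_iter=1000), X, y, cv=5, method="predict_proba"
)
issues = find_label_issues(labels=y, pred_probs=pred_probs)  # boolean mask
print(f"{issues.sum()} examples flagged as possible label errors")
```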
Yellowbrick: Machine Learning Visualization

Feature Visualization
- Rank Features: pairwise ranking of features to detect relationships
- Parallel Coordinates: horizontal visualization of instances
- Radial Visualization: separation of instances around a circular plot
- PCA Projection: projection of instances based on principal components
- Manifold Visualization: high dimensional visualization with manifold learning
- Joint Plots: direct data visualization with feature selection
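Most feature visualizers follow the same fit/transform/show pattern. A minimal sketch with Rank2D, assuming the standard Yellowbrick API; the dataset and settings are arbitrary.

```python
# Pairwise Pearson ranking of features with Rank2D.
from sklearn.datasets import load_iris
from yellowbrick.features import Rank2D

X, y = load_iris(return_X_y=True)
viz = Rank2D(algorithm="pearson")  # pairwise feature correlation heatmap
viz.fit(X, y)
viz.transform(X)
viz.show()
```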
Classification Visualization
- Class Prediction Error: shows error and support in classification
- Classification Report: visual representation of precision, recall, and F1
- ROC/AUC Curves: receiver operating characteristic and area under the curve
- Precision-Recall Curves: precision vs recall for different probability thresholds
- Confusion Matrices: visual description of class decision making
- Discrimination Threshold: find a threshold that best separates binary classes
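The classifier visualizers wrap a scikit-learn estimator and follow a fit/score/show pattern. A minimal ClassificationReport sketch on an arbitrary dataset:

```python
# Visual precision/recall/F1 report for a wrapped classifier.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from yellowbrick.classifier import ClassificationReport

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
viz = ClassificationReport(LogisticRegression(max_iter=1000))
viz.fit(X_tr, y_tr)    # fit the wrapped estimator
viz.score(X_te, y_te)  # compute metrics on the test split
viz.show()
```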
Regression Visualization
- Prediction Error Plot: find model breakdowns along the domain of the target
- Residuals Plot: show the difference in residuals of training and test data
- Alpha Selection: show how the choice of alpha influences regularization
- Cook’s Distance: show the influence of instances on linear regression
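Regressor visualizers work the same way; a short ResidualsPlot sketch on synthetic data:

```python
# Compare residuals on training vs. test data for a Ridge regressor.
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from yellowbrick.regressor import ResidualsPlot

X, y = make_regression(n_samples=500, noise=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
viz = ResidualsPlot(Ridge())
viz.fit(X_tr, y_tr)
viz.score(X_te, y_te)
viz.show()
```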
Clustering Visualization
- K-Elbow Plot: select k using the elbow method and various metrics
- Silhouette Plot: select k by visualizing silhouette coefficient values
- Intercluster Distance Maps: show relative distance and size/importance of clusters
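A short elbow-method sketch with KElbowVisualizer, using toy blobs and arbitrary settings:

```python
# Fit KMeans for a range of k and plot the elbow curve.
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from yellowbrick.cluster import KElbowVisualizer

X, _ = make_blobs(n_samples=500, centers=4, random_state=0)
viz = KElbowVisualizer(KMeans(n_init=10, random_state=0), k=(2, 10))
viz.fit(X)   # fits one model per k and scores each
viz.show()
```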
Model Selection Visualization
- Validation Curve: tune a model with respect to a single hyperparameter
- Learning Curve: show if a model might benefit from more data or less complexity
- Feature Importances: rank features by importance or linear coefficients for a specific model
- Recursive Feature Elimination: find the best subset of features based on importance
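A minimal LearningCurve sketch, again with an arbitrary estimator and dataset:

```python
# Plot cross-validated scores over increasing training-set sizes.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from yellowbrick.model_selection import LearningCurve

X, y = load_iris(return_X_y=True)
viz = LearningCurve(RandomForestClassifier(random_state=0), cv=5)
viz.fit(X, y)  # trains on increasing fractions of the data
viz.show()
```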
Target Visualization
- Balanced Binning Reference: generate a histogram with vertical lines showing the recommended value point to bin the data into evenly distributed bins
- Class Balance: see how the distribution of classes affects the model
- Feature Correlation: display the correlation between features and dependent variables
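A short ClassBalance sketch; the labels below are simply the Iris class names.

```python
# Bar chart of class frequencies in the target vector.
from sklearn.datasets import load_iris
from yellowbrick.target import ClassBalance

_, y = load_iris(return_X_y=True)
viz = ClassBalance(labels=["setosa", "versicolor", "virginica"])
viz.fit(y)
viz.show()
```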
Text Visualization
- Term Frequency: visualize the frequency distribution of terms in the corpus
- t-SNE Corpus Visualization: use stochastic neighbor embedding to project documents
- Dispersion Plot: visualize how key terms are dispersed throughout a corpus
- UMAP Corpus Visualization: plot similar documents closer together to discover clusters
- PosTag Visualization: plot the counts of different parts-of-speech throughout a tagged corpus
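A minimal term-frequency sketch with FreqDistVisualizer over a toy corpus:

```python
# Frequency distribution of terms in a vectorized corpus.
from sklearn.feature_extraction.text import CountVectorizer
from yellowbrick.text import FreqDistVisualizer

docs = ["the cat sat on the mat", "dogs and cats", "the dog barked at the cat"]
vect = CountVectorizer()
X = vect.fit_transform(docs)
viz = FreqDistVisualizer(features=list(vect.get_feature_names_out()))
viz.fit(X)
viz.show()
```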
AI Factory
Text generation with ChatGPT, image generation with DALL-E, and text-to-speech with D-ID.
TabTransformer: Tabular Data Modeling Using Contextual Embeddings
The main idea in the paper is that the performance of a regular multi-layer perceptron (MLP) can be significantly improved by using Transformers to transform regular categorical embeddings into contextual ones.
The TabTransformer is built upon self-attention-based Transformers. The Transformer layers transform the embeddings of categorical features into robust contextual embeddings to achieve higher prediction accuracy.
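A minimal PyTorch sketch of this idea (not the authors' reference implementation): per-column categorical embeddings are contextualized by a Transformer encoder, concatenated with the continuous features, and passed to an MLP head. The dimensions, layer counts, and cardinalities below are arbitrary.

```python
import torch
import torch.nn as nn

class TabTransformerSketch(nn.Module):
    """Toy TabTransformer-style model: embed categoricals, contextualize them
    with a Transformer encoder, concatenate with continuous features, MLP head."""

    def __init__(self, cat_cardinalities, num_continuous, dim=32, depth=2,
                 heads=4, n_classes=2):
        super().__init__()
        # one embedding table per categorical column
        self.embeds = nn.ModuleList(
            nn.Embedding(card, dim) for card in cat_cardinalities
        )
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads,
                                           batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=depth)
        self.mlp = nn.Sequential(
            nn.Linear(dim * len(cat_cardinalities) + num_continuous, 64),
            nn.ReLU(),
            nn.Linear(64, n_classes),
        )

    def forward(self, x_cat, x_cont):
        # x_cat: (batch, n_cat) integer-encoded categories
        # x_cont: (batch, num_continuous) numeric features
        tokens = torch.stack(
            [emb(x_cat[:, i]) for i, emb in enumerate(self.embeds)], dim=1
        )                                      # (batch, n_cat, dim)
        contextual = self.transformer(tokens)  # contextual embeddings
        flat = contextual.flatten(1)           # (batch, n_cat * dim)
        return self.mlp(torch.cat([flat, x_cont], dim=1))

# toy usage with made-up cardinalities and batch size
model = TabTransformerSketch(cat_cardinalities=[10, 5, 7], num_continuous=4)
logits = model(torch.randint(0, 5, (8, 3)), torch.randn(8, 4))
print(logits.shape)  # torch.Size([8, 2])
```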


