Visually Understanding UMAP

This article explores dimensionality reduction, a valuable tool for machine learning practitioners who need to analyze vast, high-dimensional datasets. While t-SNE is a commonly used visualization technique, its efficacy diminishes on large datasets, and mastering its application can be challenging.

UMAP, introduced by McInnes et al., presents several advantages over t-SNE, including enhanced speed and better preservation of a dataset's global structure. This article delves into the theory behind UMAP, providing insights into how it works, how to use it effectively, and how its performance compares with t-SNE.

https://pair-code.github.io/understanding-umap/
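
In practice, UMAP is typically used through the umap-learn package. Below is a minimal sketch of reducing a dataset to two dimensions, assuming umap-learn and scikit-learn are installed; the parameter values shown are common defaults, not recommendations from the article.

```python
# Minimal UMAP sketch, assuming `pip install umap-learn scikit-learn`.
import umap
from sklearn.datasets import load_digits

X, y = load_digits(return_X_y=True)  # 1797 samples, 64 dimensions

# n_neighbors trades off local vs. global structure preservation;
# min_dist controls how tightly points are packed in the embedding.
reducer = umap.UMAP(n_neighbors=15, min_dist=0.1, n_components=2, random_state=42)
embedding = reducer.fit_transform(X)  # shape: (1797, 2), ready to scatter-plot
```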

TabTransformer: Tabular Data Modeling Using Contextual Embeddings

The main idea in the paper is that the performance of a regular multi-layer perceptron (MLP) can be significantly improved by using Transformers to transform regular categorical embeddings into contextual ones.

The TabTransformer is built upon self-attention-based Transformers. The Transformer layers transform the embeddings of categorical features into robust contextual embeddings to achieve higher prediction accuracy.
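
To make the architecture concrete, here is an illustrative PyTorch sketch of the idea: each categorical column gets its own embedding, the embeddings are contextualized by Transformer layers, and the result is concatenated with the (normalized) continuous features and passed to an MLP head. All layer sizes and the class name are hypothetical, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class TabTransformerSketch(nn.Module):
    """Illustrative sketch of the TabTransformer idea (not the paper's exact model)."""

    def __init__(self, cardinalities, n_continuous, d_embed=32, n_layers=6, n_heads=8):
        super().__init__()
        # One embedding table per categorical column.
        self.embeds = nn.ModuleList(nn.Embedding(c, d_embed) for c in cardinalities)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_embed, nhead=n_heads, batch_first=True)
        self.transformer = nn.TransformerEncoder(encoder_layer, num_layers=n_layers)
        self.norm = nn.LayerNorm(n_continuous)
        mlp_in = d_embed * len(cardinalities) + n_continuous
        self.mlp = nn.Sequential(nn.Linear(mlp_in, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, x_cat, x_cont):
        # x_cat: (batch, n_categorical) integer codes; x_cont: (batch, n_continuous).
        tokens = torch.stack(
            [emb(x_cat[:, i]) for i, emb in enumerate(self.embeds)], dim=1)
        contextual = self.transformer(tokens)   # (batch, n_cat, d_embed), contextual embeddings
        flat = contextual.flatten(start_dim=1)  # concatenate per-column embeddings
        return self.mlp(torch.cat([flat, self.norm(x_cont)], dim=1))

# Toy usage: 3 categorical columns (cardinalities 10, 4, 7) and 5 continuous features.
model = TabTransformerSketch(cardinalities=[10, 4, 7], n_continuous=5)
logits = model(torch.randint(0, 4, (8, 3)), torch.randn(8, 5))  # (8, 1)
```

The key design choice this sketch tries to capture is that only the categorical features pass through the Transformer; the continuous features bypass it and join the contextual embeddings at the MLP input.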