Louis Ohl
@louisohl.bsky.social
16 followers 29 following 17 posts
Postdoc @ Linköping University, STIMA division oshillou.github.io
Reposted by Louis Ohl
pamattei.bsky.social
"Can you train a standard classifier without labels?" was the question we tried to investigate in this survey. Very happy to see @louisohl.bsky.social's final PhD project published in ACM CSUR!
louisohl.bsky.social
It is intended for a broad audience, starting from the basics and ending with an overview of some current deep clustering models. It also features multiple code snippets to get started, and even a package!
If you want a historical perspective on discriminative clustering, I hope you'll enjoy reading it.
louisohl.bsky.social
This paper explores multiple aspects of discriminative clustering: its global framework, the evolution of the field from the 90s to today, and how it is deeply intertwined with mutual information.
louisohl.bsky.social
What a pleasure #UAI2025 has been! Great researchers, great talks. I'm looking forward to coming back another time!

Thanks a lot for the organisation :-)
Reposted by Louis Ohl
euripsconf.bsky.social
EurIPS is coming! 📣 Mark your calendar for Dec. 2-7, 2025 in Copenhagen 📅

EurIPS is a community-organized conference where you can present accepted NeurIPS 2025 papers. It is endorsed by @neuripsconf.bsky.social and @nordicair.bsky.social, and co-developed by @ellis.eu

eurips.cc
louisohl.bsky.social
In addition to that historical journey, we provide examples of such milestones and snippets of code to reproduce them on the fly.
An example of blob clustering using discriminative methods in a few lines of Python
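Not the snippet from the paper itself, but a minimal sketch of the idea: clustering blobs by directly maximising the mutual information I(X; Y) of a linear softmax model, in the spirit of regularised information maximisation. All names, step sizes, and iteration counts here are my own choices.

import numpy as np
from sklearn.datasets import make_blobs

# Toy data: three Gaussian blobs, standardised
X, _ = make_blobs(n_samples=300, centers=3, random_state=0)
X = (X - X.mean(axis=0)) / X.std(axis=0)

rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(X.shape[1], 3))  # linear logits for 3 clusters
b = np.zeros(3)

for _ in range(500):
    logits = X @ W + b
    logits -= logits.max(axis=1, keepdims=True)      # numerical stability
    P = np.exp(logits)
    P /= P.sum(axis=1, keepdims=True)                # p(y|x) for each sample
    pi = P.mean(axis=0)                              # cluster proportions p(y)
    # Gradient ascent on I(X;Y) = H(Y) - H(Y|X), back through the softmax
    g = np.log(P + 1e-12) - np.log(pi + 1e-12)
    G = P * (g - (P * g).sum(axis=1, keepdims=True))
    W += 0.1 * X.T @ G / len(X)
    b += 0.1 * G.mean(axis=0)

labels = P.argmax(axis=1)  # recovered cluster assignments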
louisohl.bsky.social
So how do we deal with that? Our tutorial covers the history of the field from the early 90s to modern deep clustering. We show how mutual information played a crucial role in its development, and present the historical milestones we deem relevant.
louisohl.bsky.social
However, learning such a model is tricky, because common statistical tools do not apply when we assume nothing about the data distribution.
Bayes' theorem relating the cluster and data distributions. Only the cluster proportions can be estimated; the other components cannot be learnt.
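For reference, the decomposition in question is just Bayes' theorem (the notation here is mine, not the figure's):

p(y \mid x) = \frac{p(x \mid y)\, p(y)}{p(x)}, \qquad \hat{p}(y) = \frac{1}{n} \sum_{i=1}^{n} p_\theta(y \mid x_i)

A discriminative model parameterises p_\theta(y \mid x) directly; from it the cluster proportions p(y) can still be estimated as the empirical average above, but p(x \mid y) and p(x) stay out of reach unless we make assumptions about the data distribution.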
louisohl.bsky.social
When doing unsupervised learning, we have two different ways to build our model. One is discriminative: we assume nothing about the data distribution, and try to infer clusters straight from it. Implicit hypotheses are built within the model.
Bayesian graphs depicting generative vs discriminative models
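In symbols (a standard textbook contrast, not taken from the figure itself):

\text{generative: } p_\theta(x, y) = p_\theta(x \mid y)\, p_\theta(y), \qquad \text{discriminative: } p_\theta(y \mid x)

The generative route requires choosing an explicit form for p(x \mid y); the discriminative route does not, which is why its hypotheses end up implicit in the model.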
louisohl.bsky.social
This tutorial is intended both for curious readers who know nothing of the field and for a more informed audience.

We hope this tutorial will provide a comprehensive overview, and help develop future research directions for clustering.

So what is it about?
louisohl.bsky.social
In summary:

DISCOTEC is an easy-to-implement method that shows good ranking performance, and is compatible with essentially all clustering models. It does not require any hyperparameters. (5/5)
louisohl.bsky.social
Since DISCOTEC relies on an ensemble, its performance is tied to the number of models used to compute the consensus. This is even more pronounced for the binarised variant. (4/5)
A figure showing that, under similar conditions, increasing the number of clustering models improves the ranking correlation of the score
louisohl.bsky.social
An interesting advantage is that binarising the consensus matrix drastically improves the ranking of the clustering algorithms. (3/5)
A screenshot of a table of results where the binarised DISCOTEC exhibits stronger ranking correlation than baselines and competitors
louisohl.bsky.social
We introduce the DISCOTEC score.

It simply consists of two steps: (i) compute the consensus matrix for a set of clustering algorithms; (ii) compute the average distance between each model's connectivity matrix and the consensus matrix.

Bonus: must-link and cannot-link constraints are gracefully supported (2/5)
A screenshot of pseudocode for computing the DISCOTEC score
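A minimal sketch of the two steps described above, not the paper's exact pseudocode: the choice of distance (mean absolute difference) and the 0.5 binarisation threshold are assumptions on my part; see the preprint for the precise formulation.

import numpy as np

def connectivity(labels):
    # n x n matrix with 1 where two samples share a cluster, 0 elsewhere
    labels = np.asarray(labels)
    return (labels[:, None] == labels[None, :]).astype(float)

def discotec_scores(label_sets, binarise=False):
    # (i) consensus matrix: average connectivity over all clustering models
    conns = [connectivity(l) for l in label_sets]
    consensus = np.mean(conns, axis=0)
    if binarise:  # the binarised variant from post 3/5 above
        consensus = (consensus >= 0.5).astype(float)
    # (ii) score each model by its average distance to the consensus;
    # lower means stronger agreement with the ensemble
    return [np.abs(c - consensus).mean() for c in conns]

# Example: rank three toy partitions of five points
scores = discotec_scores([[0, 0, 1, 1, 2], [0, 0, 1, 1, 1], [0, 1, 2, 0, 1]])
ranking = np.argsort(scores)  # best-agreeing model first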
louisohl.bsky.social
I'm glad to announce that our paper titled "Discriminative ordering through ensemble consensus" was accepted for a poster presentation at #uai2025.

In collaboration with Fredrik Lindsten.

Preprint: arxiv.org/abs/2505.04464

How do you compare sets of very different clustering algorithms? (1/5)
louisohl.bsky.social
Had a blast today discussing and tracing the evolution of discriminative clustering methods at the Pioneer Centre for AI @aicentre.dk in Copenhagen. Lots of interesting talks and perspectives; thanks for the invitation!

📸: @jesfrellsen.bsky.social
louisohl.bsky.social
Happy to announce that we released a new version of GemClus: v1.1.0.

It now includes:
- Compatibility with the latest NumPy
- A novel GEMINI based on the χ² divergence
- Improved introductory documentation for anyone new to the concept

Check it out: gemini-clustering.github.io
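For context, the χ² divergence mentioned in the release notes has the standard definition (this is textbook material, not taken from the GemClus docs):

D_{\chi^2}(P \,\|\, Q) = \int \frac{(\mathrm{d}P - \mathrm{d}Q)^2}{\mathrm{d}Q}

As I understand the GEMINI family, the clustering objective takes an expectation of such a divergence between cluster-conditional and marginal data distributions, so this release presumably instantiates that divergence slot with χ².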