Lightnews — Scholar-powered news

Soledad Galli, PhD

@solegalli.bsky.social

Discover the latest thoughts on working with imbalanced data with our free booklet.

We discuss 3 recent articles that have changed the conversation on resampling and SMOTE👇

www.trainindata.com/p/7-takes-on...

October 27, 2025 at 12:30 PM

Soledad Galli, PhD

@solegalli.bsky.social

All our courses come with a 30-day money-back guarantee...

If you are unhappy for any reason, we give you your money back.

That's how confident we are that you'll ❤️ our courses.

#trainindata

October 24, 2025 at 11:28 PM

Soledad Galli, PhD

@solegalli.bsky.social

Next Monday on Data Bites: Six Cloud Platforms to Run Jupyter Notebooks for Free 🚀

Want to know more?

Click the link below to subscribe and stay tuned!👇
https://f.mtr.cool/bltkmoeitj

#machinelearning #datascience #jupyter #mlmodels #ML #mltools #notebooks #cloudplatforms

August 29, 2025 at 10:02 AM

Soledad Galli, PhD

@solegalli.bsky.social

👉MICE is a powerful method for datasets with missing data across multiple variables.

Let this slide guide you through how it works.

#machinelearning #MICE #mlmodels #datascience #dataengineering #imputation #featureengineering
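
To give a rough feel for the chained-equations idea behind MICE, here is a minimal sketch for just two numeric variables. It is an illustration under assumptions, not the slide's content: the helper names `fit_line` and `chained_imputation` are made up for this example, the per-variable model is a plain least-squares line, and the result is a single deterministic imputation rather than the multiple imputed datasets that full MICE produces.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y ~ x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return slope, mean_y - slope * mean_x


def chained_imputation(rows, n_iter=10):
    """Impute a two-column dataset where None marks a missing value."""
    missing = [[v is None for v in row] for row in rows]
    data = [list(row) for row in rows]

    # Step 1: start from a simple guess, the mean of the observed values.
    for j in range(2):
        observed = [row[j] for row in rows if row[j] is not None]
        mean = sum(observed) / len(observed)
        for row in data:
            if row[j] is None:
                row[j] = mean

    # Step 2: cycle over the variables, re-predicting each one's missing
    # entries from the other variable's current (observed or imputed) values.
    for _ in range(n_iter):
        for j in range(2):
            other = 1 - j
            train = [(r[other], r[j]) for r, m in zip(data, missing) if not m[j]]
            slope, intercept = fit_line([t[0] for t in train], [t[1] for t in train])
            for r, m in zip(data, missing):
                if m[j]:
                    r[j] = intercept + slope * r[other]
    return data


# Toy data in which the second column is twice the first.
rows = [[1.0, 2.0], [2.0, 4.0], [4.0, None], [None, 12.0], [5.0, 10.0]]
imputed = chained_imputation(rows, n_iter=10)
print(imputed[2], imputed[3])
```

On this toy data the imputed values settle close to 8 and 6, which is what the underlying relationship implies. For real projects, scikit-learn offers an experimental `IterativeImputer` built around the same round-robin idea.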

August 27, 2025 at 4:02 PM

Soledad Galli, PhD

@solegalli.bsky.social

How do you construct ensembles from thousands of models?

In this article, Caruana, a prominent figure in machine learning and ensemble methods, and his co-authors explain how they build ensembles from libraries of thousands of machine learning models.
📄 https://f.mtr.cool/fpaqqnqxms

August 26, 2025 at 4:02 PM

Soledad Galli, PhD

@solegalli.bsky.social

Clustering & Dimensionality Reduction: your toolkit for finding patterns, simplifying data, and solving real-world problems.

🔍 You’ll:
✅ Group data (K-means, DBSCAN & more)
✅ Reduce complexity (PCA, UMAP)
✅ Work on real cases like RNA profiling

📍 https://f.mtr.cool/hdjiwbbsbl

August 25, 2025 at 4:02 PM

Soledad Galli, PhD

@solegalli.bsky.social

Next Monday on Data Bites: Working with imbalanced data? Follow these 3 steps.

Want to know more?

Click the link below to subscribe and stay tuned!👇
https://f.mtr.cool/svpfklfpda

#machinelearning #datascience #CV #mlmodels #ML #MLCareer #MLresume

August 22, 2025 at 10:02 AM

Soledad Galli, PhD

@solegalli.bsky.social

Can we use statistical tests to select features? 🤔

Turns out, we can! 🎉

In the slides below, we’ll explore the most commonly used statistical tests for feature selection, along with their advantages and limitations. 👇

#machinelearning #datascience #featureselection
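
As one possible illustration of test-based selection, the sketch below computes the chi-square statistic between a categorical feature and a categorical target, so that features can be ranked by how strongly they are associated with the target. The function name and the toy data are assumptions made for this example, and the chi-square test is only one of several tests that could be used this way.

```python
from collections import Counter


def chi_square_statistic(feature, target):
    """Chi-square statistic of independence between two categorical lists."""
    n = len(feature)
    observed = Counter(zip(feature, target))
    feature_totals = Counter(feature)
    target_totals = Counter(target)
    statistic = 0.0
    for f, f_count in feature_totals.items():
        for t, t_count in target_totals.items():
            # Count we would expect in this cell if feature and target were independent.
            expected = f_count * t_count / n
            statistic += (observed[(f, t)] - expected) ** 2 / expected
    return statistic


target = [1, 1, 1, 1, 0, 0, 0, 0]
features = {
    "informative": ["a", "a", "a", "a", "b", "b", "b", "b"],
    "noise": ["a", "b", "a", "b", "a", "b", "a", "b"],
}
scores = {name: chi_square_statistic(values, target) for name, values in features.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
print(scores, ranking)
```

A larger statistic means a stronger association with the target. Turning the statistic into a p-value requires the chi-square distribution, which this stdlib-only sketch leaves out; scikit-learn's `chi2` scorer combined with `SelectKBest` handles that part.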

August 19, 2025 at 4:02 PM

Soledad Galli, PhD

@solegalli.bsky.social

🚨 It’s here! Our new course on Clustering & Dimensionality Reduction just dropped 🎉

Learn how to group data (K-Means, DBSCAN, Louvain) + simplify it with PCA & UMAP, no prior experience needed!

Hands-on & practical 👇
👉 https://f.mtr.cool/zshxexbrds

#MachineLearning #DataScience

August 18, 2025 at 4:02 PM

Soledad Galli, PhD

@solegalli.bsky.social

Next Monday on Data Bites: How to Write a Winning Data Science CV

Want to know more?

Click the link below to subscribe and stay tuned!👇
https://f.mtr.cool/nozrfuruar

#machinelearning #datascience #CV #mlmodels #ML #MLCareer #MLresume

August 15, 2025 at 10:02 AM

Soledad Galli, PhD

@solegalli.bsky.social

Deep learning has transformed our daily lives, but designing neural networks remains a challenge.

Automated hyperparameter optimization (HPO) streamlines the process. This paper reviews key techniques & tools for improving model accuracy & efficiency.
📃https://f.mtr.cool/wowjcrmwjg

August 14, 2025 at 4:02 PM

Soledad Galli, PhD

@solegalli.bsky.social

In case you were wondering 👇

#machinelearning #ai #datascience #dataengineering #mlmodels

August 13, 2025 at 4:02 PM

Soledad Galli, PhD

@solegalli.bsky.social

🚨 SMOTE has long been hailed as the go-to solution for imbalanced datasets, but it only works in specific scenarios.

In this article, we explore when SMOTE is truly effective & why it’s remained popular.

Check it out!
https://f.mtr.cool/medbbpfril
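
For context, the core step of SMOTE is interpolation: each synthetic sample lies on the line segment between a minority-class sample and one of its nearest minority-class neighbours. The sketch below shows only that step; `smote_sketch` and its parameters are hypothetical names chosen for this example, and it is not the implementation found in imbalanced-learn.

```python
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority samples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        # Pick a random position on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic


minority = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (2.5, 2.5)]
synthetic = smote_sketch(minority, n_new=5)
print(synthetic)
```

Because every new point sits between existing minority samples, the method adds no information from outside the region the minority class already occupies.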

August 12, 2025 at 4:01 PM

Soledad Galli, PhD

@solegalli.bsky.social

🚨 Just launched: our new course on Clustering & Dimensionality Reduction is live at Train in Data!

Learn to group data, reduce complexity with PCA & UMAP, and tackle real-world projects (no experience needed!)

🎓 Join us: https://f.mtr.cool/wlhxbboqkl

August 11, 2025 at 4:02 PM

Soledad Galli, PhD

@solegalli.bsky.social

Next Monday on Data Bites: Everybody says “SMOTE does not work”.

Want to know more?

Click the link below to subscribe and stay tuned!👇
https://f.mtr.cool/pinchbaedf

#machinelearning #datascience #smote #mlmodels #ML

August 8, 2025 at 10:01 AM

Soledad Galli, PhD

@solegalli.bsky.social

🐍 Python libraries that implement model-agnostic global explainability methods 👇

#python #machinelearning #MLModel #datascience #dataengineering

August 6, 2025 at 4:02 PM

Soledad Galli, PhD

@solegalli.bsky.social

Most commonly used encoding techniques ⬇️

1. OneHotEncoder
2. OrdinalEncoder
3. TargetEncoder

When one-hot encoding produces too many columns and ordinal encoding imposes an order the categories don't have, target encoding is often the better choice. Learn more at the link below.

#targetencoder #ML
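
A minimal sketch of the idea behind target encoding, assuming a binary target and a simple additive smoothing scheme: each category is replaced by a blend of its own target mean and the overall target mean. The function names and the `smoothing` parameter are illustrative choices, and this is not how scikit-learn's `TargetEncoder` is implemented internally.

```python
from collections import defaultdict


def fit_target_encoding(categories, targets, smoothing=10.0):
    """Map each category to a smoothed mean of the target."""
    global_mean = sum(targets) / len(targets)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cat, y in zip(categories, targets):
        sums[cat] += y
        counts[cat] += 1
    # Blend the category mean with the global mean; rare categories shrink more.
    encoding = {
        cat: (sums[cat] + smoothing * global_mean) / (counts[cat] + smoothing)
        for cat in counts
    }
    return encoding, global_mean


def transform(categories, encoding, global_mean):
    """Encode categories; unseen ones fall back to the global mean."""
    return [encoding.get(cat, global_mean) for cat in categories]


colours = ["red", "red", "blue", "blue", "blue", "green"]
bought = [1, 0, 1, 1, 1, 0]
encoding, prior = fit_target_encoding(colours, bought, smoothing=1.0)
encoded = transform(colours + ["purple"], encoding, prior)
print(encoding)
```

The mapping should be learned on the training set only and then applied to the test set, otherwise the target leaks into the features.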

August 5, 2025 at 4:02 PM

Soledad Galli, PhD

@solegalli.bsky.social

🚨 New Course - Clustering & Dimensionality Reduction at Train in Data

Learn to apply unsupervised ML in practice 👇
✅ K-Means, DBSCAN, HDBSCAN, Graph-based
✅ PCA & UMAP
✅ Real-world projects incl. RNA case study

Find out more: https://f.mtr.cool/cojxgkyhgq

August 4, 2025 at 4:02 PM

Soledad Galli, PhD

@solegalli.bsky.social

Next Monday on Data Bites: Probe Feature Selection

Want to know more?

Click the link below to subscribe and stay tuned!👇
https://f.mtr.cool/xefqrzzgeh

#machinelearning #datascience #imbalanceddata #undersampling #mlmodels #ML

August 1, 2025 at 10:02 AM

Soledad Galli, PhD

@solegalli.bsky.social

🤔 Have you used missing category imputation in your projects? Check out this reel 👇

💡 Want to dive deeper into feature engineering and data imputation? Check out our course
https://www.trainindata.com/p/feature-engineering-for-machine-learning

#machinelearning #featureengineering #dataimputation
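
For readers who have not met the technique, missing category imputation treats missingness in a categorical variable as a category of its own. A minimal sketch, assuming that `None`, empty strings and NaN all count as missing and that the new category is called "Missing" (both are choices made for this example):

```python
import math


def impute_missing_category(values, label="Missing"):
    """Replace missing entries of a categorical variable with an explicit label."""
    def is_missing(value):
        return (
            value is None
            or value == ""
            or (isinstance(value, float) and math.isnan(value))
        )
    return [label if is_missing(v) else v for v in values]


cities = ["London", None, "Madrid", float("nan"), "London"]
imputed = impute_missing_category(cities)
print(imputed)  # ['London', 'Missing', 'Madrid', 'Missing', 'London']
```

The model can then learn whether the fact that a value is missing carries information of its own.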

July 29, 2025 at 4:03 PM

Soledad Galli, PhD

@solegalli.bsky.social

In #ML, the accuracy of a classifier’s predictions is crucial. If your model's probabilities are off, probability calibration can correct that.✔️

Learn why calibration matters & how to do it in Python with scikit-learn 👇 https://www.blog.trainindata.com/probability-calibration-in-machine-learning/
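
As a rough illustration of what calibration does, the sketch below uses histogram binning: predicted probabilities are grouped into equal-width bins, and each bin is mapped to the fraction of positives actually observed in it. This is a simple stand-in written with the standard library, not the method from the linked post; in scikit-learn, `CalibratedClassifierCV` provides sigmoid and isotonic calibration. The function names and toy numbers are assumptions for the example.

```python
def fit_histogram_binning(probs, labels, n_bins=5):
    """Learn, per equal-width bin, the observed fraction of positives."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for p, y in zip(probs, labels):
        b = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes into the last bin
        sums[b] += y
        counts[b] += 1
    # Empty bins fall back to their midpoint, i.e. they are left uncorrected.
    return [
        sums[b] / counts[b] if counts[b] else (b + 0.5) / n_bins
        for b in range(n_bins)
    ]


def calibrate(probs, bin_values):
    """Replace each probability with the value learned for its bin."""
    n_bins = len(bin_values)
    return [bin_values[min(int(p * n_bins), n_bins - 1)] for p in probs]


# Held-out predictions from an overconfident model (toy numbers).
probs = [0.90, 0.95, 0.85, 0.92, 0.10, 0.15, 0.05, 0.12]
labels = [1, 1, 0, 0, 0, 0, 0, 1]
bin_values = fit_histogram_binning(probs, labels, n_bins=2)
calibrated = calibrate([0.9, 0.1], bin_values)
print(bin_values, calibrated)
```

In this toy example a predicted 0.9 is pulled down to 0.5 and a predicted 0.1 is pushed up to 0.25, because those are the positive rates observed in each bin. The mapping should be fitted on data the model was not trained on.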

July 28, 2025 at 4:02 PM

Soledad Galli, PhD

@solegalli.bsky.social

Next Monday on Data Bites: Tired of spending hours on data preprocessing?

Want to know more?

Click the link below to subscribe and stay tuned!👇
https://f.mtr.cool/lyojjydmkp

July 25, 2025 at 10:01 AM

Soledad Galli, PhD

@solegalli.bsky.social

Machine Learning is transforming insurance, but black-box models hurt trust and compliance. 🧐

Interpretability helps us:
✅ Spot biases
✅ Explain decisions
✅ Improve models

Understanding decisions = fairer, more transparent insurance. 💡

#MachineLearning #Insurance #AI

July 24, 2025 at 4:03 PM

Soledad Galli, PhD

@solegalli.bsky.social

📊 AUC-ROC analysis is a reliable metric for binary classification, helping to assess class differentiation, even with imbalanced datasets.

Check out this blog that breaks down its key concepts and shows how to evaluate #ML model performance.👇
https://f.mtr.cool/ravvrkjudz
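
One way to read the metric: the AUC equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as half. The sketch below computes it directly from that definition; the function name and the toy scores are assumptions for this example, and the pairwise loop is meant for illustration rather than large datasets.

```python
def roc_auc(y_true, scores):
    """AUC as the share of positive-negative pairs ranked correctly."""
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    negatives = [s for y, s in zip(y_true, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(positives) * len(negatives))


y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(y_true, scores)
print(auc)  # 0.75
```

Here three of the four positive-negative pairs are ranked correctly, hence 0.75. A value of 0.5 corresponds to random ranking and 1.0 to perfect separation.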

July 22, 2025 at 4:02 PM

Soledad Galli, PhD

@solegalli.bsky.social

🚀 Exciting news! Our new course on Clustering & Dimensionality Reduction is live at Train in Data! 🎉

Learn to group data & simplify datasets with hands-on projects—no experience needed. Let’s grow your ML skills together!

👉 https://f.mtr.cool/myghjmekoa

July 21, 2025 at 4:02 PM
