Lightnews — Scholar-powered news

Soledad Galli, PhD

@solegalli.bsky.social

Moving averages have long been used as a benchmark forecasting model.

Did you know that you can also use moving averages as input features?

If not, check out this blog to find out more, together with Python implementations:

www.blog.trainindata.com/master-movin...

Moving Average Forecasting: What You Need to Know - Train in Data's Blog

Learn moving average forecasting with clear examples, practical applications, and accuracy tips for better time series predictions.

www.blog.trainindata.com
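
A minimal sketch of the idea, not taken from the blog: the sales series below is made up, and the 3-day window is an arbitrary choice. It uses pandas to turn a moving average of past values into an input feature, shifting first so each row only sees earlier observations.

```python
import pandas as pd

# Hypothetical daily sales series, invented for illustration.
sales = pd.Series(
    [10.0, 12.0, 14.0, 16.0, 18.0, 20.0],
    index=pd.date_range("2025-01-01", periods=6, freq="D"),
    name="sales",
)

features = pd.DataFrame({"sales": sales})

# Shift by one step so the window holds only past values,
# then average the previous 3 observations.
features["ma_3"] = sales.shift(1).rolling(window=3).mean()

print(features)
```

The first three rows of `ma_3` are empty because there is not enough history yet. The same column can also serve as a naive benchmark forecast for each day.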

November 3, 2025 at 12:30 PM

Soledad Galli, PhD

@solegalli.bsky.social

Discover the latest thoughts on working with imbalanced data with our free booklet.

We discuss 3 recent articles that have changed the conversation on resampling and SMOTE👇

www.trainindata.com/p/7-takes-on...

October 27, 2025 at 12:30 PM

Soledad Galli, PhD

@solegalli.bsky.social

All our courses come with a 30-Day money back guarantee...

If you are unhappy for whatever reason, we give you the money back.

That's how confident we are that you'll ❤️ our courses.

#trainindata

October 24, 2025 at 11:28 PM

Soledad Galli, PhD

@solegalli.bsky.social

Next Monday on Data Bites: Six Cloud Platforms to Run Jupyter Notebooks for Free 🚀

Want to know more?

Click the link below to subscribe and stay tuned!👇
https://f.mtr.cool/bltkmoeitj

#machinelearning #datascience #jupyter #mlmodels #ML #mltools #notebooks #cloudplatforms

August 29, 2025 at 10:02 AM

Soledad Galli, PhD

@solegalli.bsky.social

Imbalanced datasets can mess with your ML models. 😬
ADASYN (Adaptive Synthetic Sampling) to the rescue! 🚀

Learn how it works + when to use it in our latest blog 👇
https://f.mtr.cool/rqstrumpnx

#MachineLearning #DataScience #ImbalancedData #ADASYN

ADASYN: Adaptive Synthetic Sampling for Imbalanced Datasets - Train in Data's Blog

ADASYN can be used to handle data imbalance by creating synthetic samples of the minority class and improve model performance. Really?

f.mtr.cool

August 28, 2025 at 4:02 PM

Soledad Galli, PhD

@solegalli.bsky.social

👉MICE is a powerful method for datasets with missing data across multiple variables.

Let this slide guide you through how it works.

#machinelearning #MICE #mlmodels #datascience #dataengineering #imputation #featureengineering
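
For readers who prefer code, here is a rough sketch using scikit-learn's `IterativeImputer`, which is inspired by MICE but returns a single imputation by default. The toy array and the settings are assumptions made for illustration, not the content of the slide.

```python
import numpy as np
# IterativeImputer is still experimental and needs this explicit opt-in import.
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

# Toy data with missing values spread across several variables.
X = np.array([
    [1.0, 2.0, 3.0],
    [2.0, np.nan, 6.0],
    [3.0, 6.0, np.nan],
    [np.nan, 8.0, 12.0],
    [5.0, 10.0, 15.0],
])

# Each variable with missing values is modelled from the others,
# cycling through the variables for up to max_iter rounds.
imputer = IterativeImputer(max_iter=10, random_state=0)
X_imputed = imputer.fit_transform(X)

print(X_imputed)
```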

August 27, 2025 at 4:02 PM

Soledad Galli, PhD

@solegalli.bsky.social

How to construct ensembles from a thousand models?

In this article, Caruana, a prominent figure in machine learning and ensemble methods, tells us more about how they create ensembles from libraries of 1000s of machine learning models.
📄 https://f.mtr.cool/fpaqqnqxms
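
A hedged sketch of one way to build an ensemble from a model library: greedy forward selection with replacement on a validation set. We assume this is the flavour of approach the article discusses. The function name, the simulated "library", and the squared-error criterion are all illustrative choices.

```python
import numpy as np

def greedy_ensemble_selection(val_preds, y_val, n_rounds=10):
    """Pick models one at a time (repeats allowed) to minimise validation MSE.

    val_preds: array of shape (n_models, n_samples) with each model's scores.
    """
    selected = []
    running_sum = np.zeros_like(y_val, dtype=float)
    for _ in range(n_rounds):
        # Averaged prediction if each candidate were added to the ensemble.
        candidate_avgs = (running_sum + val_preds) / (len(selected) + 1)
        errors = ((candidate_avgs - y_val) ** 2).mean(axis=1)
        best = int(np.argmin(errors))
        selected.append(best)
        running_sum += val_preds[best]
    return selected, running_sum / len(selected)

# Simulated library: 20 "models" that are the target plus noise of varying size.
rng = np.random.default_rng(0)
y_val = rng.integers(0, 2, size=200).astype(float)
noise_levels = np.linspace(0.2, 1.0, 20)
val_preds = np.array([y_val + rng.normal(scale=s, size=200) for s in noise_levels])

selected, ensemble_pred = greedy_ensemble_selection(val_preds, y_val, n_rounds=10)
single_mse = ((val_preds - y_val) ** 2).mean(axis=1)
ensemble_mse = ((ensemble_pred - y_val) ** 2).mean()
print(selected, single_mse.min(), ensemble_mse)
```

Because models can be picked more than once, strong models end up with a larger weight in the average.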

August 26, 2025 at 4:02 PM

Soledad Galli, PhD

@solegalli.bsky.social

Clustering & Dimensionality Reduction: your toolkit for finding patterns, simplifying data, and solving real-world problems.

🔍 You’ll:
✅ Group data (K-means, DBSCAN & more)
✅ Reduce complexity (PCA, UMAP)
✅ Work on real cases like RNA profiling

📍 https://f.mtr.cool/hdjiwbbsbl

August 25, 2025 at 4:02 PM

Soledad Galli, PhD

@solegalli.bsky.social

Next Monday on Data Bites: Working with imbalanced data? Follow these 3 steps.

Want to know more?

Click the link below to subscribe and stay tuned!👇
https://f.mtr.cool/svpfklfpda

#machinelearning #datascience #CV #mlmodels #ML #MLCareer #MLresume

August 22, 2025 at 10:02 AM

Soledad Galli, PhD

@solegalli.bsky.social

Model performance matters! 🎯

In this article, we break down essential evaluation metrics for classification models, starting with the Confusion Matrix. Perfect for anyone looking to build reliable #machinelearning systems!

Have a good read👇

Confusion Matrix, Precision, and Recall - Train in Data's Blog

Find out what the confusion matrix is and how it relates to other classification metrics like precision, recall and f1-score.

f.mtr.cool
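
As a quick illustration of how these metrics relate, here is a small pure-Python sketch. The labels and predictions are made up, and the helper name `confusion_counts` is our own.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    return tp, fp, fn, tn

# Made-up labels and predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)

precision = tp / (tp + fp)  # of the predicted positives, how many are right
recall = tp / (tp + fn)     # of the actual positives, how many were found
f1 = 2 * precision * recall / (precision + recall)

print(tp, fp, fn, tn, precision, recall, f1)
```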

August 21, 2025 at 4:02 PM

Soledad Galli, PhD

@solegalli.bsky.social

ELI5 now supports scikit-learn 1.6.0! 🎉 It wasn’t working with the latest version of scikit-learn, but that’s a thing of the past.

As of now, ELI5 has released a new version with full support for scikit-learn >=1.6.0 and Python >3.10.

Check it out 👇

GitHub - eli5-org/eli5: A library for debugging/inspecting machine learning classifiers and explaining their predictions

A library for debugging/inspecting machine learning classifiers and explaining their predictions - eli5-org/eli5

f.mtr.cool

August 20, 2025 at 4:02 PM

Soledad Galli, PhD

@solegalli.bsky.social

Can we use statistical tests to select features? 🤔

Turns out, we can! 🎉

In the slides below, we’ll explore the most commonly used statistical tests for feature selection, along with their advantages and limitations. 👇

#machinelearning #datascience #featureselection
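
One possible illustration, assuming a classification target and numeric features: scikit-learn's `SelectKBest` with the ANOVA F-test (`f_classif`). The simulated data and the choice of `k=1` are ours, and other tests (chi-square, for example) plug into the same pattern.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif

# Simulated data: column 0 depends on the target, columns 1-3 are pure noise.
rng = np.random.default_rng(0)
n = 200
y = rng.integers(0, 2, size=n)
informative = y * 2.0 + rng.normal(scale=0.5, size=n)
noise = rng.normal(size=(n, 3))
X = np.column_stack([informative, noise])

# Score every feature with the F-test and keep the single best one.
selector = SelectKBest(score_func=f_classif, k=1).fit(X, y)
selected = selector.get_support(indices=True)

print(selected, selector.pvalues_)
```

Keep in mind that univariate tests score each feature on its own, so they cannot detect interactions or redundancy between features.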

August 19, 2025 at 4:02 PM

Soledad Galli, PhD

@solegalli.bsky.social

🚨 It’s here! Our new course on Clustering & Dimensionality Reduction just dropped 🎉

Learn how to group data (K-Means, DBSCAN, Louvain) + simplify it with PCA & UMAP, no prior experience needed!

Hands-on & practical 👇
👉 https://f.mtr.cool/zshxexbrds

#MachineLearning #DataScience

August 18, 2025 at 4:02 PM

Soledad Galli, PhD

@solegalli.bsky.social

Next Monday on Data Bites: How to Write a Winning Data Science CV

Want to know more?

Click the link below to subscribe and stay tuned!👇
https://f.mtr.cool/nozrfuruar

#machinelearning #datascience #CV #mlmodels #ML #MLCareer #MLresume

August 15, 2025 at 10:02 AM

Soledad Galli, PhD

@solegalli.bsky.social

Deep learning has transformed our daily lives, but designing neural networks remains a challenge.

Automated hyperparameter optimization (HPO) streamlines the process. This paper reviews key techniques & tools for improving model accuracy & efficiency.
📃https://f.mtr.cool/wowjcrmwjg

August 14, 2025 at 4:02 PM

Soledad Galli, PhD

@solegalli.bsky.social

In case you were wondering 👇

#machinelearning #ai #datascience #dataengineering #mlmodels

August 13, 2025 at 4:02 PM

Soledad Galli, PhD

@solegalli.bsky.social

🚨 SMOTE has long been hailed as the go-to solution for imbalanced datasets, but it only works in specific scenarios.

In this article, we explore when SMOTE is truly effective & why it’s remained popular.

Check it out!
https://f.mtr.cool/medbbpfril

August 12, 2025 at 4:01 PM

Soledad Galli, PhD

@solegalli.bsky.social

🚨 Just launched: our new course on Clustering & Dimensionality Reduction is live at Train in Data!

Learn to group data, reduce complexity with PCA & UMAP, and tackle real-world projects (no experience needed!)

🎓 Join us: https://f.mtr.cool/wlhxbboqkl

August 11, 2025 at 4:02 PM

Soledad Galli, PhD

@solegalli.bsky.social

Next Monday on Data Bites: Everybody says “SMOTE does not work”.

Want to know more?

Click the link below to subscribe and stay tuned!👇
https://f.mtr.cool/pinchbaedf

#machinelearning #datascience #smote #mlmodels #ML

August 8, 2025 at 10:01 AM

Soledad Galli, PhD

@solegalli.bsky.social

In this video, I review hyperparameter optimization techniques like Grid Search, Random Search, & Bayesian methods.

Learn their pros, cons, and best applications for both low and high-dimensional spaces!

What techniques do you use?
📽️

f.mtr.cool
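
To make the difference between grid search and random search concrete, here is a toy sketch in plain Python. The `validation_score` function is a hypothetical stand-in for training and scoring a model, and the search ranges are arbitrary. Bayesian methods are left out because they need a surrogate model.

```python
import itertools
import random

def validation_score(learning_rate, depth):
    # Hypothetical objective that peaks at learning_rate=0.1, depth=4.
    return -((learning_rate - 0.1) ** 2) - 0.01 * (depth - 4) ** 2

# Grid search: evaluate every combination of a fixed set of values.
grid = {"learning_rate": [0.01, 0.1, 1.0], "depth": [2, 4, 6]}
grid_trials = [dict(zip(grid, combo)) for combo in itertools.product(*grid.values())]
best_grid = max(grid_trials, key=lambda p: validation_score(**p))

# Random search: same budget of 9 trials, but values are sampled
# (log-uniform for the learning rate), so each dimension gets 9 distinct values.
rng = random.Random(0)
random_trials = [
    {"learning_rate": 10 ** rng.uniform(-3, 0), "depth": rng.randint(2, 8)}
    for _ in range(9)
]
best_random = max(random_trials, key=lambda p: validation_score(**p))

print(best_grid, best_random)
```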

August 7, 2025 at 4:02 PM

Soledad Galli, PhD

@solegalli.bsky.social

🐍 Python libraries that implement model-agnostic global explainability methods 👇

#python #machinelearning #MLModel #datascience #dataengineering

August 6, 2025 at 4:02 PM

Soledad Galli, PhD

@solegalli.bsky.social

Most commonly used encoding techniques ⬇️

1. OneHotEncoder
2. OrdinalEncoder
3. TargetEncoder

When one-hot encoding gets too complex and ordinal encoding leads to inaccuracies, TargetEncoding often becomes the best choice. Learn more at the link below.

#targetencoder #ML
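
A simplified sketch of the idea behind target encoding, using plain pandas on made-up data. Unlike scikit-learn's `TargetEncoder`, this version applies no smoothing or cross-fitting, so treat it as an illustration only.

```python
import pandas as pd

# Made-up training data with a categorical feature and a binary target.
train = pd.DataFrame({
    "city": ["London", "London", "Paris", "Paris", "Paris", "Rome"],
    "target": [1, 0, 1, 1, 0, 1],
})

# Learn the mean target per category from the training set only.
global_mean = train["target"].mean()
category_means = train.groupby("city")["target"].mean()

train["city_encoded"] = train["city"].map(category_means)

# Categories not seen during training fall back to the global mean.
test = pd.DataFrame({"city": ["Paris", "Berlin"]})
test["city_encoded"] = test["city"].map(category_means).fillna(global_mean)

print(train)
print(test)
```

Encoding the training rows with means computed from those same rows can leak the target, which is one reason library implementations add smoothing and cross-fitting.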

August 5, 2025 at 4:02 PM

Soledad Galli, PhD

@solegalli.bsky.social

🚨 New Course - Clustering & Dimensionality Reduction at Train in Data

Learn to apply unsupervised ML in practice 👇
✅ K-Means, DBSCAN, HDBSCAN, Graph-based
✅ PCA & UMAP
✅ Real-world projects incl. RNA case study

Find out more: https://f.mtr.cool/cojxgkyhgq

August 4, 2025 at 4:02 PM

Soledad Galli, PhD

@solegalli.bsky.social

Next Monday on Data Bites: Probe Feature Selection

Want to know more?

Click the link below to subscribe and stay tuned!👇
https://f.mtr.cool/xefqrzzgeh

#machinelearning #datascience #imbalanceddata #undersampling #mlmodels #ML
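
A rough sketch of how probe feature selection is commonly described: add a random "probe" feature, fit a model, and keep only the features that beat the probe's importance. The simulated data, the random forest, and the single probe are assumptions made for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Simulated data: column 0 carries signal, column 1 is noise.
rng = np.random.default_rng(0)
n = 300
y = rng.integers(0, 2, size=n)
X = np.column_stack([
    y + rng.normal(scale=0.3, size=n),  # informative feature
    rng.normal(size=n),                 # uninformative feature
])

# Append a random probe that, by construction, has no relation to the target.
probe = rng.normal(size=n)
X_with_probe = np.column_stack([X, probe])

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_with_probe, y)

importances = model.feature_importances_
probe_importance = importances[-1]

# Keep the original features whose importance exceeds the probe's.
keep = [i for i in range(X.shape[1]) if importances[i] > probe_importance]
print(importances, keep)
```

In practice, averaging over several probes and cross-validation folds makes the threshold less noisy than a single random column.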

August 1, 2025 at 10:02 AM

Soledad Galli, PhD

@solegalli.bsky.social

The most crucial component of any machine learning project is data!

▶️ 90% of the time is spent on data preprocessing
▶️ 10% of the time is spent on model building, tuning and evaluation.

#machinelearning #ML #MLmodels #preprocessing #modelbuilding #datascience

July 31, 2025 at 4:02 PM
