Norm Matloff (你有冇諗清楚呀?)
@matloff.bsky.social
Em. Prof., UC Davis. Various awards, incl. book, teaching, public service. Many books, latest The Art of Machine Learning (uses qeML pkg). Former Editor in Chief, the R Journal. Views mine. heather.cs.ucdavis.edu/matloff.html
Reposted by Norm Matloff (你有冇諗清楚呀?)
The Indian government released an MCP server that lets you query their survey data using plain English and get back aggregate statistics from national surveys. I explore some of its possibilities and limitations. #data #stats

aman.bh/blog/2026/qu...
Querying India's MoSPI Data with Claude and MCP | Aman Bhargava
The Ministry of Statistics and Programme Implementation released a new tool for LLM-assisted querying of their survey data. I explore some of its possibilities.
aman.bh
February 9, 2026 at 10:11 AM
Reposted by Norm Matloff (你有冇諗清楚呀?)
startup idea: wearable device app for automatic medication adherence with machine learning based reminders
February 3, 2026 at 5:12 AM
Reposted by Norm Matloff (你有冇諗清楚呀?)
What’s one data tool or concept you rediscovered in 2025 that surprised you with its usefulness? For me: data.table in R. Lightning-fast joins. Elegant logic.

#RStats #DataScience #MachineLearning
January 16, 2026 at 9:48 PM
Very disappointing: the FDA has now approved Bayesian methods for clinical trials. Someone put pressure on them, likely the drug industry. 🧵 1/
January 16, 2026 at 3:49 AM
Reposted by Norm Matloff (你有冇諗清楚呀?)
Around sunset, Outer Sunset 🌞
December 27, 2025 at 6:58 PM
Reposted by Norm Matloff (你有冇諗清楚呀?)
In all seriousness, it's not til you start reading the utterly bullshit attributions in the papers that cite your work that you realise scientists legitimately cannot read above a grade 6 level
#statsky
My paper: Our validation shows that you should not use this tool as a predictive model.
Papers citing me: The tool has been shown to be a validated predictive model.
#statsky #episky #medsky
December 18, 2025 at 11:11 PM
A post on X marveled at the fact that in very high dimensions, all (mean 0) random vectors are approximately orthogonal. I added the well-known fact that this means they are also approximately equidistant, e.g. length-1 vectors are about sqrt(2) apart. 🧵 1/
December 5, 2025 at 6:54 AM
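The claim in the post above can be checked numerically. For length-1 vectors u and v, ||u - v||^2 = 2 - 2(u . v), so near-zero dot products force distances near sqrt(2). The sketch below is only an illustration, not code from the thread: it assumes i.i.d. standard Gaussian coordinates rescaled to unit length, and the helper names (random_unit_vector, pairwise_summary) are made up for this example.

```python
import math
import random


def random_unit_vector(dim, rng):
    """Draw a mean-0 Gaussian vector and rescale it to length 1 (assumed setup)."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(u, v):
    """Ordinary inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def distance(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def pairwise_summary(dim, n_vectors=20, seed=0):
    """Return (largest |dot product|, smallest distance, largest distance) over all pairs."""
    rng = random.Random(seed)
    vecs = [random_unit_vector(dim, rng) for _ in range(n_vectors)]
    pairs = [
        (vecs[i], vecs[j])
        for i in range(n_vectors)
        for j in range(i + 1, n_vectors)
    ]
    max_abs_dot = max(abs(dot(u, v)) for u, v in pairs)
    dists = [distance(u, v) for u, v in pairs]
    return max_abs_dot, min(dists), max(dists)


if __name__ == "__main__":
    # Compare a low, a moderate, and a high dimension side by side.
    for dim in (3, 100, 10000):
        max_abs_dot, dmin, dmax = pairwise_summary(dim)
        print(
            f"dim={dim:>6}  max|u.v|={max_abs_dot:.3f}  "
            f"distances in [{dmin:.3f}, {dmax:.3f}]  sqrt(2)={math.sqrt(2):.3f}"
        )
```

As the dimension grows, the largest pairwise dot product should shrink toward 0 and the range of pairwise distances should tighten around sqrt(2). In dimension 3 neither holds, which is the contrast the post is pointing at.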
Reposted by Norm Matloff (你有冇諗清楚呀?)
Logistic regression with 5 million observations.

glm() - about 2 hours

glm(start = ) - less than 2 minutes.

Almost feels too good...

#rstats
December 3, 2025 at 8:12 AM
Very sorry to hear that, one of the giants in #rstats.
#rstats
It is with profound sadness that I heard my long-time friend and colleague, John Fox, passed away this week.
He was the author of {car}, {effects}, {Rcmdr}, ... and numerous influential books. I will miss him greatly.
www.john-fox.ca
John Fox: Books and Software
November 29, 2025 at 11:13 PM
#RStats. Pop quiz:

> x <- matrix(0,nrow=10,ncol=2)
> x[c(2,5,9),] <- 1:2
> x

What is printed out? Why?
November 29, 2025 at 4:16 AM
Reposted by Norm Matloff (你有冇諗清楚呀?)
Applied Machine Learning Using mlr3 in R by Bernd Bischl, Raphael Sonabend, Lars Kotthoff and Michel Lang
#RStats
https://bigbookofr.com/chapters/packages.html#applied-machine-learning-using-mlr3-in-r
November 27, 2025 at 4:30 PM
Reposted by Norm Matloff (你有冇諗清楚呀?)
A few weeks ago, a friend of mine was hired at a company that uses its own self-hosted AI tools.
Just today they told me: "Dude, I think I'm only gonna last 2 months here. Their energy bills make it financially insolvent."
October 8, 2025 at 4:06 PM
Reposted by Norm Matloff (你有冇諗清楚呀?)
Just published my new R article: 'Mapply: When You Need to Iterate Over Multiple Inputs'! 🚀 If `sapply` doesn't quite cut it for your multi-variable iterations, `mapply` is your friend. Learn to pair inputs beautifully. #RStats #Mapply
https://drmo.site/bhXeDb
October 2, 2025 at 1:02 PM
This really is an excellent book.
October 8, 2025 at 2:23 AM
Reposted by Norm Matloff (你有冇諗清楚呀?)
October 2, 2025 at 12:43 PM
Excellent app!
Dr. Who solves the "should we lump or should we split" dilemma.
Compare lots of stat methods in this interactive app.
#rshiny #rstats #statistics #rstudio #bayesian #medical #treatment
Lumping&splitting.
Bias&variance.
Individuals&groups.
Borrowing strength
🧪
trials.shinyapps.io/Bias-varianc...
September 26, 2025 at 5:03 PM
Reposted by Norm Matloff (你有冇諗清楚呀?)
Hey

Bike lights

It's fall now
September 23, 2025 at 1:12 AM
Reposted by Norm Matloff (你有冇諗清楚呀?)
Uranus and the Pleiades. 22 September 2025. 🔭 🧪 🎨 #astrophotography #SciArt #photography #StormHour #ThePhotoHour
September 22, 2025 at 10:44 AM
Reposted by Norm Matloff (你有冇諗清楚呀?)
I tell students: You don’t need to memorize every ML algorithm. But you do need to understand:
— loss functions
— overfitting
— regularization
Master the “why,” not just the “how.”

#DataScience #MachineLearning #Statistics #AI #RStats
September 18, 2025 at 6:08 PM
Reposted by Norm Matloff (你有冇諗清楚呀?)
Most companies don’t need deep learning. They need clean joins, honest metrics, and interpretable results.
Startups often chase GPU clusters before fixing their CSVs.

#DataScience #MachineLearning #AI #RStats
September 17, 2025 at 10:17 PM
Reposted by Norm Matloff (你有冇諗清楚呀?)
It’s neat to discover that all these #rstats pocket friends and zoom friends I’ve known for years actually *do* exist in real life #PositConf2025
September 16, 2025 at 11:51 PM
A reminder: when you are debugging code and it finally runs and produces output, that doesn't mean the code is now debugged. I myself fell victim to this thinking last week.

I am updating my qeML package, with one change involving a nice aspect of k-Nearest Neighbors: 🧵 1/
September 16, 2025 at 9:28 PM
Reposted by Norm Matloff (你有冇諗清楚呀?)
You can learn more about this project here, only the second of its kind in the United States.

grist.org/energy/calif...

How this is not being done with much more frequency — in light of our fresh water and drought challenges — is baffling.

#energysky #greensky #energytransition
California’s first solar-covered canal is now fully online
The 1.6-megawatt pilot system is among a growing number of initiatives to put solar over waterways. The approach could generate gigawatts of power nationwide.
grist.org
September 13, 2025 at 4:57 PM