Donald Szlosek
@dszlosek.bsky.social
840 followers 4.2K following 170 posts
Biostatistician @IDEXX formerly at harvardmed, @BIDMChealth, @nasa. Big data, clinical trials, and medical diagnostics. Mainer. Opinions are my own. he/him
Reposted by Donald Szlosek
pwgtennant.bsky.social
Just because an LLM can produce a report with various figures & charts doesn't mean it is good at statistics.

Because good statistics is not about producing code.

It's about deep knowledge of study design & conduct. In my opinion, 95% of all data science problems come from poor questions & design.
hormiga.bsky.social
Y'all. I just got ChatGPT to do everything in R for this manuscript. I mean EVERYTHING. And it's all legit and reproducible. I'm shook.

How are we mentoring our trainees in statistics now? Who needs to learn coding in R line by line, and who doesn't?

scienceforeveryone.science/statistics-i...
Statistics in the era of AI
How do we mentor, teach, and do stats when AI can do so much of the work?
scienceforeveryone.science
Reposted by Donald Szlosek
tslumley.bsky.social
Oh look. X is strongly correlated with rank(X)
#AxesOfEvil
Reposted by Donald Szlosek
jessicahullman.bsky.social
Think of how much better off we'd be if every established researcher got in the habit of writing papers entitled "Second thoughts on [thing I'm famous for]"
dszlosek.bsky.social
Excellent piece by Miryam Naddaf discussing a surge in papers that likely use LLMs on open data. This is concerning, since I work with some of these datasets (NHANES, CDC WONDER, BRFSS). www.nature.com/articles/d41...
Reposted by Donald Szlosek
statsepi.bsky.social
An 8 year-old blog post on causal thinking in epidemiology that I'm sharing for no particular reason (ICYMI).

darrendahly.github.io/post/2017-02...
Cause vs. Consequence
Principal Statistician | Senior Lecturer
darrendahly.github.io
dszlosek.bsky.social
#academics #AcademicSky
dszlosek.bsky.social
Excellent advice on paper review:

1. Peer reviewers are volunteers.

2. Map all comments to actions.

3. Address all comments.

4. Focus on improving your paper, instead of arguing.

5. Rarely, and only with strong defense, say no.

6. Don’t take things personally.

7. Avoid recreational revisions.
dszlosek.bsky.social
S-Values are much more interpretable than P-values, yet adoption seems near impossible. I wonder what it would take to make the leap? #statssky #episky #rstats #statistics
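For context, a sketch of the usual definition, which the post itself does not spell out: the S-value (surprisal) is the base-2 log transform of a p-value, measured in bits. The worked value for p = 0.05 is only an illustration.

```latex
\[
  s = -\log_2(p)
\]
% Illustration: p = 0.05 gives s = \log_2(20) \approx 4.32 bits,
% roughly as surprising as seeing 4 heads in a row from a fair coin.
\[
  s = -\log_2(0.05) \approx 4.32
\]
```

Read this way, a p-value is restated as "how many consecutive fair coin flips landing heads would be this surprising", which is the interpretability argument usually made for S-values.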
Reposted by Donald Szlosek
tylermw.com
"Man, I really wish RStudio respected hierarchy in code-folded section headers... I wonder how easy it would be to..."

(inner voice: DON'T DO IT! IT'S NOT WORTH IT! JUST GET BACK TO WORK! THE YAK IS BEST LEFT UNSHORN!)

"... I'm gonna do it."

#RStats #RStudio
Reposted by Donald Szlosek
georgiatomova.bsky.social
We should do a study on how much of the funded applied research suffers from problems that the unfunded methods research could have helped prevent or resolve
dszlosek.bsky.social
My personal favorite seed setup I've used is set.seed(666) # \m/ rock on. Any others out there? #rstats #databs
swampthingpaul.bsky.social
While digging through some code from a manuscript I recently read (yes, that rabbit hole), I came across this line, and I think I just found my new favorite set.seed(...) 🤣

set.seed(i+42) # Don’t Panic. “What is the meaning of life, the universe, and everything?”

#Rstats
Reposted by Donald Szlosek
pwgtennant.bsky.social
"Uncooperative statistician": the term used (typically by a senior clinician) to describe a well-trained and knowledgeable statistician who refuses to conduct flawed or fraudulent research.
Reposted by Donald Szlosek
andrew.heiss.phd
If you've ever wanted to learn how to make beautiful websites with #QuartoPub and #rstats , check out this workshop I'm giving in a couple weeks! It'll be a blast (and we're covering Quarto's brand new _brand dot yaml system!)
stathorizons.bsky.social
Learn to create and publish a professional, data-focused website in “Create an Online Presence with Quarto Websites” on October 16-17, with @andrew.heiss.phd! Discover how to use #Quarto to build a variety of websites like personal portfolios, research compendiums, and interactive dashboards.
Quarto Websites | Online Seminar | Code Horizons
This online course taught by Andrew Heiss, Ph.D., teaches you how to use Quarto to build a variety of data-focused websites.
codehorizons.com
Reposted by Donald Szlosek
hetanshah.bsky.social
Nice chart from @ourworldindata.org showing the contrast between what Americans die of (heart disease and cancer) vs. what the US media reports on (homicide and terrorism). This naturally makes it trickier to build a fact-based worldview.
ourworldindata.org/does-the-new...
[Chart: "What Americans die from and the causes of death the US media reports on". Panels: causes of death in the US in 2023; media coverage of these causes of death in 2023 in The New York Times, The Washington Post, and Fox News.]
Note: Based on the share of causes of death in the US and the share of mentions for each of the causes in the New York Times, the Washington Post and Fox News. All values are normalized to 100%, so the shares are relative to all deaths caused by the 12 most common causes + drug overdoses, homicides and terrorism. These causes account for more than 75% of deaths in the US.
A "media mention" is a published article in one of the outlets that mentions the cause (e.g. "influenza") or related keywords (e.g. "flu") at least twice.
Data sources: Media mentions from Media Cloud (2025); deaths data from the US CDC (2025) and the Global Terrorism Index.
CC BY
dszlosek.bsky.social
If I'm doing a lot of written work, I would get tired of writing the sigma notation or the triangular numbers you suggested. I see this as an easy and fast way of saving my wrists from cramping!
Reposted by Donald Szlosek
solomonkurz.bsky.social
In Ch 19 (nyu-cdsc.github.io/learningr/as...) of his 2nd edition, Kruschke used *residual* SD as a standardizer for group differences from a multilevel ANCOVA. Is there any precedent for using a *residual* SD as a standardizer for a standardized mean difference effect size? #RStats
nyu-cdsc.github.io
dszlosek.bsky.social
I always wondered if there was a shorthand for summation similar to factorials #math #mathsky #statssky #statistics
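One candidate, assuming the post means the additive analogue of n!: the sum of the first n positive integers is the n-th triangular number, and Knuth has suggested the name "termial" with the notation n? for it. The notation is rarely used in practice, so treat it as a curiosity rather than a standard.

```latex
\[
  n? \;=\; T_n \;=\; \sum_{k=1}^{n} k \;=\; \frac{n(n+1)}{2}
\]
% Illustration: 4? = 1 + 2 + 3 + 4 = 10, compared with 4! = 1 \cdot 2 \cdot 3 \cdot 4 = 24.
```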
Reposted by Donald Szlosek
noahgreifer.bsky.social
Thinking about odds ratios...

An odds is a ratio of events to non-events. For example, if the event is survival, the odds of survival is the number of survivors per death. If the event is getting a disease, the odds is the number of diseased individuals per healthy individual.
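A small worked version of that definition, using hypothetical counts (80 survivors and 20 deaths are made up for illustration and do not come from the post):

```latex
\[
  \text{odds} \;=\; \frac{\text{events}}{\text{non-events}} \;=\; \frac{p}{1-p}
\]
% Hypothetical counts: 80 survivors, 20 deaths, so p = 80/100 = 0.8.
\[
  \text{odds of survival} \;=\; \frac{80}{20} \;=\; \frac{0.8}{1-0.8} \;=\; 4
\]
```

That is 4 survivors per death, even though the probability of survival is 0.8.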