Tom Smith
@analyst42.bsky.social
290 followers 810 following 84 posts
Code-first data analyst, mostly #rstats. Good information --> good decisions. Head of Activity Analysis & Forecasting at Nottingham University Hospitals NHS Trust. Personal account, views my own. https://github.com/ThomUK
analyst42.bsky.social
Version 0.2.2 of the {NHSRplotthedots} package has today been accepted by CRAN. I have taken over the baton as maintainer, and I'm looking forward to helping more NHS data analysts produce XmR plots of their data, quickly and without fuss. nhsrplotthedots.nhsrcommunity.com
Draw XmR Charts for NHS Making Data Count Programme
Provides tools for drawing Statistical Process Control (SPC) charts. This package supports the NHS Making Data Count programme, and allows users to draw XmR charts, use change points and apply rules w...
nhsrplotthedots.nhsrcommunity.com
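For readers new to XmR charts, the limits such a chart draws can be sketched in base R. This is a minimal illustration and not the {NHSRplotthedots} API: the function name `xmr_limits` and the sample values are hypothetical, and 2.66 and 3.267 are the usual XmR scaling constants for a two-point moving range.

```r
# Hypothetical weekly counts, invented purely for illustration
values <- c(52, 48, 50, 55, 47, 51, 49, 53, 50, 46)

# Compute XmR limits: centre line plus/minus 2.66 times the mean moving range
xmr_limits <- function(x) {
  moving_range <- abs(diff(x))   # absolute difference between consecutive points
  centre <- mean(x)              # centre line of the X (individuals) chart
  mean_mr <- mean(moving_range)  # centre line of the mR (moving range) chart
  list(
    centre = centre,
    lower = centre - 2.66 * mean_mr,
    upper = centre + 2.66 * mean_mr,
    mr_upper = 3.267 * mean_mr   # upper limit for the moving range chart
  )
}

limits <- xmr_limits(values)
str(limits)
```

Points falling outside `lower` and `upper` would be candidates for special-cause variation; the package itself applies further rules, which this sketch does not attempt.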
Reposted by Tom Smith
bjpaddy.bsky.social
If now that #TDF2025 has ended you don't feel your summer has yet included enough 'people doing impossible things on bikes' then may I recommend the 4000km unsupported nonstop #TCRno11 from the Atlantic to the Black Sea, which started last night...
Reposted by Tom Smith
andycallow.bsky.social
I set out some personal goals for 2025 and have just completed a review of progress at Q2.

TL;DR: Nailing the cycling. Need to get back on the books and get the research complete.

andy-callow.medium.com/personal-goa...
Personal Goals Review 2025 #2
Here we go again. I set out some personal goals for 2025. It’s now time to review progress at the end of Q2
andy-callow.medium.com
analyst42.bsky.social
Thanks, this is really helpful. I had followed links but not recognised the significance of the gcc-ASAN one. I think the C++ code puts this out of my reach, but it is a learning opportunity! I agree it does look like the maintainer has moved on to other things. Thank you!
analyst42.bsky.social
Hi, thanks - I just checked rio out but unfortunately it doesn't support xlsb. The recommended workflow of manually re-saving as xlsx in excel won't work for me because I'm batch-processing about 100 separate files. I'm thankful the readxlsb code is still available on github!
analyst42.bsky.social
Do any #rstats folks know how I can find why {readxlsb} was removed from CRAN? I've forked the github repo, but we use the code at work to read finance spreadsheets over which I have no file format control. Interested in what would be needed to get it back onto CRAN
cranberriesfeed.bsky.social
CRAN removals: aPEAR AssetAllocation CauchyCP copulaedas dartR depower EmbedSOM epiCo fedmatch IP LTRCforests msBP optpart readxlsb rego RFpredInterval rineq rlibkriging stopdetection vines #rstats
analyst42.bsky.social
This applies to the NHS too. Improvement can only be delivered to patients by the front line. It's the reason I choose to work and make a difference in a provider trust, not anywhere more remote from patients.
Reposted by Tom Smith
emollick.bsky.social
I wrote a history of recent AI development in 32 images of otters using wifi on airplanes, from images to video to code.

It shows two big trends: rapid improvements in AI models of all types and the growth of open weights AI models. www.oneusefulthing.org/p/the-recent...
The recent history of AI in 32 otters
Three years of progress as shown by marine mammals
www.oneusefulthing.org
analyst42.bsky.social
The ragnar R package helps you build RAG systems for document information retrieval. You'll need API access to an LLM and an API to make embeddings. Still unfortunately difficult in the NHS, but it will become possible as orgs find and share the news of benefit in specific use-cases.
#databs #nhs
Reposted by Tom Smith
analyst42.bsky.social
I can't read the article, but hopefully they mentioned the improvement context too. I wonder what those in the 1950s would think if we told them that pit stops in 2025 are routinely done in 2 to 3 seconds, not minutes. youtu.be/n_esmAYxE40?...
The Evolution Of F1 Pit-Stops! | DHL
YouTube video by FORMULA 1
youtu.be
Reposted by Tom Smith
erictopol.bsky.social
An exceptional new @nejm.org review on cancer of unknown origin
www.nejm.org/doi/full/10....
Reposted by Tom Smith
alexselbyb.bsky.social
Can you spot
What this chart has got
That modern poems have not?
www.economist.com/culture/2025...
Chart from The Economist showing a drop over time
In poems that rhyme
Reposted by Tom Smith
lawtontri.bsky.social
A thread about using AI to summarise documents; but it could really be about almost any use of AI.

Right now - in healthcare for example - it's a great timesaver for people who don't *need* it, and potentially troublesome for those who may try to use it outside their existing expertise.
craig.nikolic.co.uk
A thread on AI use at work. ChatGPT, summarisers, and so on.

In short, they're really good and powerful tools but many folk are missing the point and harming their own development by over-use.

A thread for both senior folk and those with ambitions to get there.

1/
analyst42.bsky.social
I'm busy reviewing a large number of applications for a position in my team. I'm reflecting on the huge amount of human experience and expertise that is embedded in the applications. Pretty humbling.
Reposted by Tom Smith
dingdingpeng.the100.ci
Thanks to everybody who chimed in!

I arrived at the conclusion that (1) there's a lot of interesting stuff about interactions and (2) the figure I was looking for does not exist.

So, I made it myself! Here's a simple illustration of how to control for confounding in interactions:
Reposted by Tom Smith
spsanderson.com
🎯 Level up your R functions! Discover best practices for returning multiple values - from simple vectors to structured outputs. Perfect for data pipeline development.

🔗 https://www.spsanderson.com/steveondata/posts/2025-05-05/

#rstats #Rprog #Rcode #DataSci #R4ds #blog #function #RProgramming
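One common pattern for this, sketched here as an assumption about the kind of approach the linked post covers rather than a summary of it, is to return a named list. The function name `summarise_values` is hypothetical.

```r
# Return several results from one function by bundling them in a named list
summarise_values <- function(x) {
  list(
    mean = mean(x),    # single numeric value
    range = range(x),  # length-two vector: minimum and maximum
    n = length(x)      # number of observations
  )
}

result <- summarise_values(c(2, 4, 6, 8))

# Each element is then retrieved by name
result$mean
result$range
```

A named list keeps each result addressable by name, so callers do not depend on the order in which values are returned.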
analyst42.bsky.social
Git was 20 years old this week. Written by the creator of Linux as a tool to streamline his workflow - it shows the importance of making tools. If you have a better way of doing something, write and share it. It's possible that few others understand the problem, let alone have the solution you do!
Reposted by Tom Smith
datavisfriendly.bsky.social
#TodayinHistory #dataviz #Onthisday #OTD 📊
💀May 3, 2010 Jacques Bertin died in Paris, France 🇫🇷

In 1967 his Semiology of Graphics became the first
comprehensive theory of graphical symbols and modes of graphics representation --> Grammar of Graphics
Image from Bertin, 1967: A semi-graphic table showing how different kinds of variables can be encoded by various visual aspects (size, value, texture, color, orientation and shape). Another representation of the relationship between the nature of variables and visual variables.
analyst42.bsky.social
Good advice. "Customer first" cuts across almost every domain. I think Adam Wathan called this "programming by wishful thinking" in one of his TDD talks, and that language has always stuck with me.
joe.codes
If I'm ever stuck when adding a new feature or enhancement to a package, I just start writing documentation for it as if it already exists

The outside → in approach helps me refine the DX and find the weird corners that need to be addressed, then I can start writing the feature based on the docs
Reposted by Tom Smith
t-kalinowski.bsky.social
I really enjoyed chatting with Karin about bridging R and Python. This post is a deep dive into reticulate, rpy2, and what great interoperability really looks like.
#rstats #python
khrovatin.bsky.social
There is no reason to stay bound to one programming language. I discussed ways to ease R-Python interoperability with Luke Zappia, Philipp Angerer, Tomasz Kalinowski.
Their tips and tricks are collected in this blog: hrovatin.github.io/posts/r_pyth...
@lazappi.bsky.social @t-kalinowski.bsky.social
From R to Python with minimal baggage
Getting the best of both worlds.
hrovatin.github.io
Reposted by Tom Smith
andrew.heiss.phd
Here it is with real data! A map of 550,000 foreign aid projects in just a few lines of #rstats code
library(tidyverse)
library(sf)
library(rnaturalearth)
library(here)

# GODAD data from https://godad.uni-goettingen.de/data/
project_level <- read_csv(
  unz(
    here("data", "data-raw", "GODAD.csv.zip"),
    "projectlevel_china_wb.csv"
  )
)

world <- ne_countries(scale = 110, type = "map_units") |> 
  filter(admin != "Antarctica")

projects <- project_level |> 
  st_as_sf(coords = c("longitude", "latitude"), crs = st_crs("EPSG:4326"))

ggplot() +
  geom_sf(data = world) +
  geom_sf(data = projects, size = 0.01, alpha = 0.05) +
  theme_void()
A world map of foreign aid projects