yimmy
yimmymcbill.bsky.social
33 followers 150 following 29 posts
trying to learn stats unfortunately through hockey. https://drydan.github.io/
When forming a draft list: the more limited your prospect evaluations, the more "factoring in availability to maximize EV" becomes code for "regularizing towards consensus".
Lastly testing sigma_s evolving over time suggests typical development begins to stabilize after draft year + 1. While I haven't tackled any common criticisms, I have also made no progress understanding what's going on under the hood.
An attempt including interactions between league & age in the linear predictor. Notice the contrast between euro & NA leagues. Could be an interesting topic regarding how selection bias across leagues affects estimates. Maybe most of this is alleviated by including deployment info.
I model age curves for F & D using second order random walks. The distinction between a RW1 & RW2 is cute and helpful for smoothing out some fault lines caused by 20 somethings stuck in juniors. Peak is around 26-27 years old.
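A rough sketch of why the RW1/RW2 distinction helps with smoothing (this is not the model code behind the posts, and the toy curves are made up). An RW1 prior penalizes first differences of the curve, while an RW2 prior penalizes second differences, so a steady climb costs nothing under RW2 but a kink gets punished hard.

```python
def diff(values, order=1):
    """Return the order-th differences of a sequence."""
    out = list(values)
    for _ in range(order):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def rw_penalty(values, order):
    """Sum of squared order-th differences (the RW prior's quadratic form, up to scale)."""
    return sum(d * d for d in diff(values, order))


# Hypothetical age effects for seven consecutive ages: a steady climb
# vs. the same climb with one kink in the middle.
steady = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
kinked = [0.0, 0.1, 0.2, 0.5, 0.4, 0.5, 0.6]

for name, curve in [("steady", steady), ("kinked", kinked)]:
    print(name, "RW1:", round(rw_penalty(curve, 1), 4), "RW2:", round(rw_penalty(curve, 2), 4))
```

The RW2 penalty ignores the overall slope and only reacts to changes in slope, which is what lets it smooth over local fault lines.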
Here I produce NHLe? by taking the quotient of the exponentiated league coefficients. Some out of sample eval suggests near constant estimates over seasons (likely due to inappropriate choices on my part).
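A minimal sketch of that quotient, with assumptions flagged: the coefficients below are invented, and the direction of the ratio (NHL in the numerator, so a league with a larger log-scale coefficient gets a factor below 1) is my guess at the convention rather than something stated in the post.

```python
import math


def nhle(league_coef, nhl_coef):
    """Quotient of exponentiated log-scale league coefficients.

    Assumption: NHL goes in the numerator, so a league with a larger
    coefficient (easier scoring) converts at less than 1.
    """
    return math.exp(nhl_coef) / math.exp(league_coef)


# Hypothetical coefficients, not estimates from the model.
coefs = {"NHL": 0.0, "LeagueA": 0.8, "LeagueB": 1.5}

for league, coef in coefs.items():
    print(league, round(nhle(coef, coefs["NHL"]), 3))
```

Since it is a ratio of exponentials, this is the same as `exp(nhl_coef - league_coef)`, so only differences between league coefficients matter.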
Sealing off this work for now. Popular NHLe models split up estimating league strength and predicting player outcomes. I attempt to do this under one roof in a way that I think is principled… but out of my depth. Am I contributing anything new in the space? Nah.
x_i,u makes more sense! I'll have to correct that. I'm too lazy to show everything this early, but I'll update w/ league coefficients on 2nd draft
NHL Draft model v2. Wrote a rough draft about what I think was going wrong the first go. Not there yet.

drydan.github.io/posts-hockey...
Oh I'm just using it as a cover to avoid some statistical jargon I forgot. It's a coefficient for a player's effect on point production given the league, their draft age and position. The coefficients are assumed to update each season as a random walk, which smooths their estimates year to year.
Results seem not great via smell test, wouldn't take it over other point based methods. Uncertainty around talent feels off w/ flat prior. League estimates very sensitive to cutoff choices. Likes the Q a bit too much. Surely some self inflicted issues but also some road ahead beyond plug & chuggin.
First go at a Poisson SSM for men's hockey. League str, F/D age curves & individuals modelled as RW1s. 14 leagues dating back to 2013 w/ no skater cutoffs. Schaefer an interesting case this year. Would bump to 3rd in draft given a full season and 1st if last season was punched up reasonably.
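For anyone wanting the pieces spelled out, here is a bare-bones sketch of a Poisson model with RW1 components. It is not the actual model: the additive log-rate (league + age + talent), the use of games played as exposure, and every function name are assumptions for illustration.

```python
import math


def expected_points(league, age, talent, games):
    """Poisson mean: exposure (games, an assumption) times exp of the linear predictor."""
    return games * math.exp(league + age + talent)


def poisson_logpmf(k, mu):
    """Log probability of observing k points when the Poisson mean is mu."""
    return k * math.log(mu) - mu - math.lgamma(k + 1)


def rw1_logprior(path, sigma):
    """Gaussian log density of the steps of a first-order random walk.

    The first value is left unpenalized (flat prior on the starting level).
    """
    return sum(
        -0.5 * math.log(2 * math.pi * sigma ** 2) - (b - a) ** 2 / (2 * sigma ** 2)
        for a, b in zip(path, path[1:])
    )


# Hypothetical player: talent drifting over three seasons, 60 games each.
talent_path = [-0.6, -0.5, -0.45]
points = [30, 41, 38]
loglik = sum(
    poisson_logpmf(k, expected_points(0.0, 0.0, t, 60))
    for k, t in zip(points, talent_path)
)
print(round(loglik + rw1_logprior(talent_path, sigma=0.2), 3))
```

The random-walk term is what ties a coefficient to its value in the previous season, so estimates move smoothly from year to year instead of being fit independently.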
Top 20 IN-eligible NCAA D1 Skaters by point per game adjusted for opponent's defence. Lot of exceptional players returning.
Top 10 eligible NCAA D1 Defenders by point per game adjusted for opponent's defence. Teammates Winn and Gosling top the charts.
Top 10 eligible NCAA D1 Forwards by point per game adjusted for opponent's defence [very much a WIP result]. O'Brien had a monster year.
PWHL released the 2025 draft eligibility list. Here's some coeffs from a pois reg for an NCAA D1 team level model. Strength of schedule a focus for points based analysis, but most top scorers play tougher games. Wisconsin the kind of super team I wish would declare all at once.
My goal is to build a RAPM model using weights instead of binary indicators for players. At this point a sensitivity analysis w/ synthetic shift data would have to convince me to invest more time into cleaning. I'll be shelving this for now, let me know if you make an attempt as I'm very error prone
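To make the "weights instead of binary indicators" idea concrete, a small sketch of how one design-matrix row could be built. The function name, the capping at 1.0, and the example numbers are all hypothetical; the ridge regression that RAPM fits on top of these rows is left out.

```python
def design_row(segment_seconds, toi_by_player, players):
    """One row per segment: each entry is the share of the segment a player was on ice.

    A standard RAPM row would hold 1 if the player was on the ice and 0 otherwise;
    here the entry is a fraction in [0, 1]. Shares are capped at 1.0 in case the
    recorded TOI overshoots the segment length.
    """
    return [
        min(toi_by_player.get(player, 0.0) / segment_seconds, 1.0)
        for player in players
    ]


players = ["A", "B", "C"]
# Hypothetical 20-second segment: A played all of it, B five seconds, C none.
row = design_row(20, {"A": 20, "B": 5}, players)
print(row)
```

With coarse snapshots the on-ice combination within a segment is unknown, so a fractional weight is a way to spread credit by time share rather than pretending to know exact shifts.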
The lagging player info can create some potentially unrecoverable issues. Sometimes a player will accumulate more TOI in a segment than possible. At the moment I simply let the excess flow into the prior segment. I also end up with roughly 10min of ice time unaccounted for.
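The "let the excess flow into the prior segment" step could look something like this. It is a sketch under my own assumptions (per-segment TOI and segment lengths in seconds, names made up), not the code actually used.

```python
def push_excess_back(toi, lengths):
    """Cap each segment's TOI at the segment length, moving excess to the previous segment.

    Walking from the last segment to the first lets excess cascade backwards
    through several segments. Excess in the very first segment has nowhere
    to go and is left as-is.
    """
    fixed = list(toi)
    for i in range(len(fixed) - 1, 0, -1):
        excess = fixed[i] - lengths[i]
        if excess > 0:
            fixed[i] = lengths[i]
            fixed[i - 1] += excess
    return fixed


# Hypothetical player: 28s recorded in a 20s segment because the update lagged.
print(push_excess_back([5, 28, 10], [20, 20, 20]))
```

Total TOI is preserved, which makes it easy to check the correction against the player's final reported ice time.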
The source provides a game clock but it doesn't seem to update in sync w/ the player info. I ended up with 76 unique game times. 32 of them had multiple TOIs for each player. I filtered conflicting snapshots by comparing them to goalie TOIs calculated from goalie change events in the pbp.
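One way the snapshot filter could work, sketched with made-up data. The "keep the closest match" rule, the dict layout, and the player ids are assumptions; the post only says conflicting snapshots were compared against goalie TOIs from the pbp.

```python
def pick_snapshot(candidates, goalie_id, expected_goalie_toi):
    """Among snapshots sharing a game time, keep the one whose goalie TOI best matches the pbp.

    candidates: list of dicts mapping player id -> cumulative TOI in seconds.
    expected_goalie_toi: goalie TOI at that game time, derived from goalie change events.
    """
    return min(
        candidates,
        key=lambda snap: abs(snap[goalie_id] - expected_goalie_toi),
    )


# Hypothetical conflict: two snapshots claim the same game time.
snapshots = [
    {"G1": 600, "S1": 200},
    {"G1": 620, "S1": 215},
]
print(pick_snapshot(snapshots, "G1", expected_goalie_toi=620))
```

The goalie works as a reference clock here because, barring a goalie change, their TOI should track elapsed game time exactly.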
Siren's shift data from yesterday's PWHL game derived by recording live game summary updates for each player's time on ice every 20 seconds.
Haven't checked manually yet, hopefully the lag exhibits some predictable patterns.
The chart is just early diagnostics at this point, dots are when the player appears in an event, thickness is player's TOI / elapsed time in that segment. Not too promising so far on my end at least. I'm curious if others have made similar attempts.
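The thickness value is simple to compute from consecutive snapshots. A sketch, assuming each snapshot is a (game seconds, cumulative TOI seconds) pair for one player; the function name and numbers are hypothetical.

```python
def segment_shares(snapshots):
    """Share of each segment a player spent on ice.

    snapshots: list of (game_seconds, cumulative_toi_seconds) in time order.
    Returns one value per pair of consecutive snapshots: TOI gained / elapsed time.
    """
    shares = []
    for (t0, toi0), (t1, toi1) in zip(snapshots, snapshots[1:]):
        elapsed = t1 - t0
        # Guard against duplicate game times, which would divide by zero.
        shares.append((toi1 - toi0) / elapsed if elapsed > 0 else 0.0)
    return shares


# Hypothetical player sampled every 20 seconds of game time.
print(segment_shares([(0, 0), (20, 20), (40, 25), (60, 25)]))
```

Values above 1 flag segments where the recorded TOI gain is larger than the elapsed time, which is the lag problem described in the other posts.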
Source is from: lscluster.hockeytech.com/feed/index.p...
I'm hitting it every 30 seconds during the game. I think this might be the only way to get some info otherwise destroyed once aggregated.
@mikemurphyhky.bsky.social do you know what's up with PWHL TOI data? I scraped last game's live summary updates to try and create a substitute shift chart but it did not turn out well for me. The final reported TOIs don't appear to add up either. Any work out there on this?
Yeah the nerds are gunna feast on this. Awesome find!
Maybe Reaves for me. Whoever leveraged the PR machine to make themselves completely infallible to enough people. Probably a good locker room guy too