grand theft eigenvalue 🔆
@akhilrao.bsky.social
personal account he/him
akhilrao.bsky.social
IV is maybe a counterexample to "the answer is more data more better" as a heuristic but it's only relevant for a particular class of problem. modern ML seems able to select out of that problem class and run more experiments, change UIs to generate better data (w/e that means), and so on
akhilrao.bsky.social
I mean fair, *I* think p-hacking and HARKing and so on are problems in social sciences and the root issue is theory doesn't constrain or guide analyses enough. but that's my opinion
akhilrao.bsky.social
excess brevity penalty, sorry

~most scientists collect small amounts of data relative to the size of the data they could have collected. the DGP also does not then generate new data specifically to reinforce or support those types of analysis. both seem very poor fits for modern ML work
Reposted by grand theft eigenvalue 🔆
beenwrekt.bsky.social
Almost a decade ago, I coauthored a paper asking us to rethink our theory of generalization in machine learning. Today, I’m fine putting the theory back on the shelf.
Reshelving generalization
You don't need a theorem to argue more data is better than less data
www.argmin.net
akhilrao.bsky.social
if it "works well enough now" then the DGPs will adapt such that it is less of a problem over time.

but this seems less about generalization theory per se than about ML not being in the "collecting small data from a non-adaptive system" regime, no?
akhilrao.bsky.social
good as usual, tho i don't quite follow this:

> It’s very easy to fool yourself into thinking that stuff like overfitting and p-hacking is a problem if you cook up iid models ... and run simulations

in most stat sciences these are very much problems; in ML it seems like whether it is or not,
akhilrao.bsky.social
Vermont has had some controversy over this lately too but it doesn't seem nearly as systemically messed up as California
akhilrao.bsky.social
underappreciated-by-me part of getting older is my fingers get stiff and tired. consistently reducing 1000 characters of typing to even 20(prompt)+900(fixes) is nice
akhilrao.bsky.social
execution, when possible, seems to address nondeterminism pretty well ime. at this point my agents write most of my shell scripts
Reposted by grand theft eigenvalue 🔆
angelamczhou.bsky.social
Hey teaching folks are you seeing this trend to:
- anything submitted online is fair game for AI usage and quizzes online might as well be hws (aka ai)
- only in person exams will get folks to look at course materials-> 4 midterms in a course
- too many midterms -> always overwhelmed -> use AI
Reposted by grand theft eigenvalue 🔆
abeaujon.bsky.social
Today's music pick is by Protect-U, whose Aaron Leitko plays Rhizome tonight with Sensor Ghost and Luke Stewart to kick off D.C.'s Speaking in Tongues festival: linktr.ee/blackeyesdis...
Lunar Note, by Protect-U
from the album Protect-U - In Harmony Of An Interior World
u-udios.bandcamp.com
akhilrao.bsky.social
I suppose this is why we want the ad sellers to also act like they sell products
akhilrao.bsky.social
tbf some longtermists reject discounting on moral grounds. they're wrong, but at least they're explicit
Reposted by grand theft eigenvalue 🔆
coelliptic.bsky.social
We didn't know it at the time but heckin pupperino was the last work to come out of a tradition that began with I can has cheezburger. In this essay I will
akhilrao.bsky.social
we're so good at math we have to really work to convince you we can read too
akhilrao.bsky.social
(DMing a longtime mutual) you’re real. you’re a bot. i'm a bot. we’re friends. this is real
sksksk00.bsky.social
(DMing a longtime mutual) you’re real. you’re not a bot. we’re friends. this is real
akhilrao.bsky.social
very satisfying to hit token limits
akhilrao.bsky.social
the subtitle is wrong: it should read "in earth orbit", not "active". oops.
line of R code that reads `filter(ExpandedStatus == "In Earth orbit") %>%`
akhilrao.bsky.social
yeah. some of it is that they use lower staging orbits to check systems, some of it is the direct to device satellites being lower than the broadband-only
akhilrao.bsky.social
anyway all the data here is freely available online thanks to @planet4589.bsky.social. took me a few minutes to make this. kinda interesting

planet4589.org/space/gcat/i...
A scatter plot titled "Starlink Satellite Altitude by Launch Date" shows the mean orbital altitude (in kilometers) of active Starlink satellites against their launch date, ranging from early 2020 to mid-2025. There are 8229 satellites represented. The Y-axis, "Mean Altitude (km)", ranges from approximately 150 km to 550 km. The X-axis, "Launch Date", spans from 2020 to 2025.

The data points are clustered at several distinct altitude bands. Early launches in 2020 show satellites at various altitudes between 300 km and 550 km. From late 2020 to mid-2021, a prominent cluster appears around 550 km. A denser band of satellites also emerges just below 550 km starting in late 2021 and continuing through 2022.

Beginning in late 2023 and especially into 2024 and 2025, a significant shift is visible. While a cluster of satellites remains around 550 km, many newer launches are concentrated at lower altitudes, specifically around 400 km and then a very dense band appearing around 350 km. There are also many satellites launched in 2024 and 2025 that are at altitudes around 275 km. Some data points are scattered below 200 km, particularly for early 2020 launches and a few in late 2024 and 2025, indicating either deorbiting or initial deployment orbits.

The data is from Jonathan McDowell's General Catalog of Artificial Space Objects (GCAT).
akhilrao.bsky.social
if i could teach an LLM to properly interpret one phrase in context it would be "do the needful"
akhilrao.bsky.social
~7000 satellites on orbit now, ~5 year life, they're also pretty good about immediate post-mission disposal... ~2 burning up every day is about what's expected?

tbf it's not yet clear how bad using the atmosphere as a sink is vs leaving it up there. but they follow or exceed current best practice.
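
A back-of-envelope check of the turnover arithmetic in the post above, as a minimal Python sketch. It assumes a fleet at steady state, where every satellite is retired after exactly its design life; the post does not claim this, and the function name and the 365.25-day year are illustrative choices rather than anything from the source.

```python
def steady_state_retirements_per_day(fleet_size, lifetime_years, days_per_year=365.25):
    """Average retirements per day if the fleet size is constant and
    every satellite is retired after exactly `lifetime_years`.
    (Hypothetical helper: a steady-state approximation only.)"""
    return fleet_size / (lifetime_years * days_per_year)


# Rough figures from the post: ~7000 satellites, ~5 year life.
rate = steady_state_retirements_per_day(7000, 5)
print(f"{rate:.1f} per day")
```

Under those assumptions the result is within a factor of two of the ~2 per day mentioned in the post. A constellation that is still growing would be expected to retire fewer satellites than its steady-state rate, since much of the fleet is younger than its design life.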