Courtney Nash
@courtneynash.bsky.social
400 followers 300 following 100 posts
Internet Incident Librarian. Founder of the VOID. VP of the Resilience in Software Foundation
courtneynash.bsky.social
I tried. They have to want the help. 🤷‍♀️
Reposted by Courtney Nash
courtneynash.bsky.social
Software incidents are painful, and we're trying to help change that. If you deal with incidents in your work, please help us help you!

Take the survey here: www.thevoid.community/survey
VOID Survey
The VOID is a community-contributed collection of software-related incident reports.
www.thevoid.community
courtneynash.bsky.social
If you have anything to do with software incidents in your job and wish things were less chaotic, stressful and burnout-inducing, please help us help you by taking this survey!

www.thevoid.community/survey
courtneynash.bsky.social
As someone who spends a ton of time reading public incident reports, I know full well that we never get a complete or accurate picture of what *really* happened. Keyboard quarterbacking based off that info doesn’t help us as an industry—it certainly isn’t going to push in the direction of more info.
courtneynash.bsky.social
There are certainly a lot of questions one could ask about this incident, but it feels to me like you already had a frame that led to this set. Alternate options include: “What could have made this more difficult for this team?” and “What details aren’t in the RCA for reasons we can’t know?”
courtneynash.bsky.social
This sounds like a joke but I watched a keynote at SREcon this very week where I learned this is how MS is handling the testing for their LLM prompts in Azure… 🤦‍♀️
courtneynash.bsky.social
It will generate its own test cases of course!
courtneynash.bsky.social
People worshipping execs was one of the strangest things from when I worked @ Amazon. People regarded Jeff like a fucking rock star—he said jump they didn’t even ask how high. If I ever pushed back on vague edicts from above, people acted like I was shit-talking the Pope. It was WEIRD. Very cultish.
charity.wtf
No real surprises here, I guess; still, it's shocking to see it up close.

I tend to feel like founders, CEOs and execs are overvalued and overrated. I loathe the genre of reporting that talks about entire companies like the extension of the CEO, the embodiment of one man's will to power. Gross.
Reposted by Courtney Nash
norootcause.surfingcomplexity.com
Using MTTR trends to evaluate if your incident response is improving is like using stock price trends to evaluate if your software development productivity is improving. Sure, it’s an input, but hoo-boy are there a bunch of other factors that move those numbers!
Reposted by Courtney Nash
infoq.com
InfoQ @infoq.com · Mar 5
🎧New on the #InfoQ #podcast!

@courtneynash.bsky.social explores the unintended consequences of automation in software systems, the importance of learning from incidents, and why human expertise remains essential in complex systems.

Listen now: bit.ly/3F5zGlo

#Observability #Monitoring #Resilience
courtneynash.bsky.social
Thank you for sharing it! Glad you enjoyed it, I really do think this is one of the more unexamined (and very important) aspects of incident response (and the later analysis).
courtneynash.bsky.social
People throw this word “algorithm” around like it’s just something someone can create in a snap if they’re smart/tech-savvy. But creating a proper, useful algorithm means you have to understand the way the entire system works in more detail than anyone looking in from the outside possibly could.
atrupar.com
Mike Johnson on Elon Musk: "We meet late into the night in his office and we've looked at that. What he's finding with his algorithms crawling through the data of Social Security system is enormous amounts of fraud, waste, and abuse."
courtneynash.bsky.social
Episode 8 of the VOID podcast is out! Nick and I discuss near misses, what an incident is, and why understanding "normal" work helps us better understand how our systems really work so we're better prepared when the next incident rolls around.

podcast.thevoid.community/1793843/epis...
courtneynash.bsky.social
"What IS an incident?" Tune in tomorrow to find out.
courtneynash.bsky.social
Coming tomorrow, the next episode of the @voidincidents.bsky.social podcast.
courtneynash.bsky.social
Resilience ≠ Reliability
courtneynash.bsky.social
Near misses are also a valuable source of information about how things *actually* work in your org/team, versus how people *think* things work.
courtneynash.bsky.social
Friend or foe? If friend, probably 2 or 3 (because I'll tell you when you're being a dingus). If foe, go straight to 10, do not pass go: I'm the person people call when they need to scare someone.
letsgoyvie.bsky.social
quote this with how intimidating YOU think you are on a scale of 1 - 10 and let other ppl reply to it with how intimidating THEY think you are >:)
Reposted by Courtney Nash
alexhanna.bsky.social
Spent all day correcting a machine-generated interview transcript. Feel like doing it from scratch may have taken less time?? What are you qualitative people doing these days...
courtneynash.bsky.social
I agree 💯 but at least this is an in-industry confirmation of what we’ve all been saying!
courtneynash.bsky.social
“[A] key irony of automation is that…you deprive the user of the routine opportunities to practice their judgement and strengthen their cognitive musculature, leaving them atrophied and unprepared when the exceptions do arise”

www.404media.co/microsoft-st...
Microsoft Study Finds AI Makes Human Cognition “Atrophied and Unprepared”
Researchers find that the more people use AI at their job, the less critical thinking they use.
www.404media.co