Deb Raji
@rajiinio.bsky.social
10K followers 66 following 130 posts
AI accountability, audits & eval. Keen on participation & practical outcomes. CS PhDing @UCBerkeley.
rajiinio.bsky.social
There is so much about navigating the Internet in a low resourced language that makes one unnecessarily vulnerable to malicious actors. It's not just a quality of experience difference, but literally the soft belly through which misinformation spreaders attack.
hellinanigatu.bsky.social
Very excited for our upcoming #AIES paper Into the Void: Understanding Online Health Information in Low-Web Data Languages.

Link: arxiv.org/pdf/2509.20245

1/n
rajiinio.bsky.social
More of this kind of reporting, please!
Reposted by Deb Raji
leonieclaude.bsky.social
So inspiring to see @karenhao.bsky.social in conversation with @rajiinio.bsky.social at the AI and Society event @ Berkeley. We need more formats like this in the Bay Area! (and beyond)
rajiinio.bsky.social
😂😂 lol why is this interaction so on brand
rajiinio.bsky.social
I get similar emails as well! Honestly, very strange ...
rajiinio.bsky.social
Hm, that's interesting. What's the scientific hypothesis being tested with, for example, ImageNet? The hold-out method?
rajiinio.bsky.social
I think there's an interesting normative difference revealed in this spat - namely, that there's an unfairness to infinite versioning and selective disclosure, which impacts *leaderboard placement* (i.e., which models we see on top as number 1) but might still be statistically sound evaluation practice
Reposted by Deb Raji
emmharv.bsky.social
Towards AI Accountability Infrastructure: Gaps and Opportunities in AI Audit Tooling by @victorojewale.bsky.social @rbsteed.com @briana-v.bsky.social @abeba.bsky.social @rajiinio.bsky.social compares the landscape of AI audit tools (tools.auditing-ai.com) to the actual needs of AI auditors.
Screenshot of paper title and author list: 

Towards AI Accountability Infrastructure: Gaps and Opportunities in AI Audit Tooling
Victor Ojewale, Ryan Steed, Briana Vecchione, Abeba Birhane, Inioluwa Deborah Raji
rajiinio.bsky.social
Emma has such good research taste :)

Given the sheer scale of these events, it's really helpful to see what caught people's eye at these conferences...
emmharv.bsky.social
After having such a great time at #CHI2025 and #FAccT2025, I wanted to share some of my favorite recent papers here!

I'll aim to post new ones throughout the summer and will tag all the authors I can find on Bsky. Please feel welcome to chime in with thoughts / paper recs / etc.!!

🧵⬇️:
rajiinio.bsky.social
Was beyond disappointed to see this in the AI Action Plan. Messing with the NIST RMF (which many private & public institutions currently rely on) feels like a cheap shot
rajiinio.bsky.social
The way this was predictable from the start...
thomasfuchs.at
The purpose of a system is what it does
Headline about FDA “AI” making up studies for drug approvals
rajiinio.bsky.social
This group (+ @leonyin.org ) makes this team like the Avengers of data journalists lol

Congrats to Bloomberg!
dmehro.bsky.social
Friday was my last day at WIRED. Today I started a new job at Bloomberg on a dream desk with @suryamattu.com and @jeffykao.bsky.social.

Got a tip? A dataset? Something we should look at? Find me on Signal at dmehro.89

In no particular order, here’s some stuff I’d like to continue to dig into:
Reposted by Deb Raji
himself.bsky.social
The genius of the Wired people, as I see it from outside, is that they very quickly saw how a specific model of _tech_ reporting could be much more easily adapted into a closely related model of _political_ reporting, in a world where actual and organizational technology is urgently relevant.
rajiinio.bsky.social
After my participation in the AI Senate forum, Andreessen went out of his way to find me on X and block me (I had never interacted with him).

Just a terrifyingly hateful person. I remain deeply unsettled the more I learn about how deep his prejudice actually goes.
rajiinio.bsky.social
I've always felt somewhat uncomfortable with the framing of AI risk around the actions of "malicious actors". Because sometimes the malicious actor is the company that built the thing. And the model is causing harm because it was successfully steered into doing what its creators wanted it to do.
rajiinio.bsky.social
The distinction between this situation & the 2016 Tay Twitter chatbot fiasco by Microsoft feels like a sign of the times - now, it's the companies actively trying to steer the chatbot towards extremist views, rather than the interactions of a malicious public pushing the chatbot in that direction...
cwarzel.bsky.social
the 🐐 @matteowong.bsky.social and i tried to explain some of technical reasons for why grok went nazi as well as some of the less technical ones (prompting Grok to use X posts as a primary source and rhetorical inspiration). It's all quite awful and illuminating www.theatlantic.com/technology/a...
Grok, as Musk and xAI have designed it, is fertile ground for showcasing the worst that chatbots have to offer. Musk has made it no secret that he wants his large language model to parrot a specific, anti-woke ideological and rhetorical style that, while not always explicitly racist, is something of a gateway to the fringes. By asking Grok to use X posts as a primary source and rhetorical inspiration, xAI is sending the large language model into a toxic corpus where trolls, political propagandists, and outright racists are some of the loudest voices. Musk himself seems to abhor guardrails generally—except in cases where guardrails help him personally—preferring to hurriedly ship products, rapid unscheduled disassemblies be damned. That may be fine for an uncrewed rocket, but X has hundreds of millions of users aboard.

For all its awfulness, the Grok debacle is also clarifying. It is a look into the beating heart of a platform that appears to be collapsing under the weight of its worst and loudest users. Musk and xAI have designed their chatbot to be a mascot of sorts for X—an anthropomorphic layer that reflects the platform’s ethos. They’ve communicated their values and given it clear instructions. That the machine has read them and responded by turning into a neo-Nazi speaks volumes.
rajiinio.bsky.social
Instead of single words or phrases, users will immediately attempt multi-hop searches, incorporate literally paragraphs of context, etc. lol - completely unprecedented search behavior afaik

Interestingly, current benchmarks like SimpleQA don't reflect this & still feature the one-liner format
rajiinio.bsky.social
One of the first things I noticed looking through SearchArena logs was the very stark differences between queries for search LLMs (eg. Perplexity AI, etc) and regular search engines.

It really is quite remarkable how different people's expectations are!
emollick.bsky.social
An example of the type of search (would require reading multiple sites, balancing multiple constraints) where o3/Gemini 2.5 Pro has completely replaced Google for me.
Reposted by Deb Raji
jessica.bsky.social
individual reporting for post-deployment evals — a little manifesto (& new preprints!)

tldr: end users have unique insights about how deployed systems are failing; we should figure out how to translate their experiences into formal evaluations of those systems.
Reposted by Deb Raji
willoremus.com
In a stunning reversal, the Senate voted 99-1 this morning to strip from Trump's big bill a 10-year moratorium on state-level AI regulations.

Gift link to my story on how it happened and who's celebrating: wapo.st/3TOyiaG
In dramatic reversal, Senate votes to kill AI-law moratorium
A GOP-led bid to stop states from regulating AI collapsed after a deal to save it fell through.
www.washingtonpost.com