Aaron Sterling
@aaronsterling.bsky.social
440 followers 1.5K following 710 posts
CEO, Thistleseeds. Personal account. Current primary project: tech for substance use disorder programs.
aaronsterling.bsky.social
Lonely women have money. Also: sidestepping it is tricky. In the last week, I've talked to two different lawyers about what to do if we ask someone to upload a photo and they upload their dick. Ignore? Laugh along? Admonish? Is behavior different if chatbot avi is male/female? It's a new world.
aaronsterling.bsky.social
In January, I thought there was at least a 51% chance of HIPAA being de facto (and maybe de jure) repealed, driven by the interest in access to reproductive health data. I think that chance is higher now, though it's not clear whether large health data vendors will change behavior even if it happens.
Reposted by Aaron Sterling
timfduffy.com
Notes on the Haiku 4.5 system card: assets.anthropic.com/m/12f214efcc...

Anthropic is releasing it as ASL-2, unlike Sonnet 4.5/Opus 4+ which are considered ASL-3
aaronsterling.bsky.social
"Boyfriend" is important. Women spend more per OnlyFans transaction per capita than men do. More men subscribe to OF than women, of course. But perhaps the text/story primacy of ChatGPT will mean there are more female paid users than male, while OF is more visual.
aaronsterling.bsky.social
There's a months-long Trust and Safety whack-a-mole where goon communities post porn jailbreak prompts to GitHub, OpenAI renders the jailbreak ineffective, and the cycle continues. Might be easier to manage if there's an official front door.
Reposted by Aaron Sterling
werner.social
No data, no AI, no progress. My @AmazonScience article explores how multi-layered mapping + petabyte-scale cloud infrastructure helps save lives in time of crisis. Building AI without addressing the fundamental data divide means solving the wrong problems. amazon.science/blog/why-ai-...
Why AI for good depends on good data
New technologies are helping vulnerable communities produce maps that integrate topographical, infrastructural, seasonal, and real-time data — an essential tool for many humanitarian endeavors.
amazon.science
aaronsterling.bsky.social
My "original proposal" was a recommendation to an experienced medical researcher. Research maturity gives you the ability to prima facie check that a methodology appears evidence-based. Hire a subject matter expert to verify, if you want to deploy to clinic. (My last post here, best wishes.)
aaronsterling.bsky.social
What you describe is better done by humans. The advantage of LLMs is the ability to do the same work at scale. To find useful references in a field you have no experience in, to process petabytes of health data, etc.
aaronsterling.bsky.social
That is a proven technique for getting more useful responses: fewer errors, and the errors that remain are easier to spot. You still need to check the work, just as you would with a human assistant. It's much like supervising someone with a lot of intelligence and ego, but little practical experience.
aaronsterling.bsky.social
I'm not having a rhetorical discussion; I'm telling you what's happening in real life. Risk Management departments are managing LLM error similarly to how they manage human error. Legal red teaming is a tool for this. Management of risk is possible; elimination is impossible, and always has been.
aaronsterling.bsky.social
One of the core techniques of legal red teaming is to depose models. Think of it like a variation of moot court. Even when attorneys are not directly involved in such depositions (though that particular group often is), it's a useful metaphor, as discussed in the screenshot I posted earlier.
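Mechanically, a deposition can be scripted as cross-examination: ask a question, press the model on its answer, and log the exchange for attorney review. The sketch below is my own illustration of that pattern, not the method from the white paper; ask() is a hypothetical stand-in for whatever model client you use.

```python
# Illustrative deposition loop (not the white paper's method): ask,
# challenge, and keep a transcript so inconsistent answers surface.
# "ask" is a hypothetical stand-in for a real model client.
def depose(ask, question: str, challenges: list[str]) -> dict:
    transcript = [("Q", question), ("A", ask(question))]
    for challenge in challenges:
        # Re-ask under pressure, with the running transcript as context.
        context = "\n".join(f"{who}: {text}" for who, text in transcript)
        transcript.append(("Q", challenge))
        transcript.append(("A", ask(f"{context}\nQ: {challenge}")))
    return {"question": question, "transcript": transcript}


record = depose(
    ask=lambda prompt: "...model reply...",  # fake client for demonstration
    question="May a counselor confirm a patient's enrollment to a caller?",
    challenges=[
        "Are you certain? Cite the rule you are relying on.",
        "Earlier you said otherwise. Which answer is correct?",
    ],
)
```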
aaronsterling.bsky.social
A cautious use is: cover the attached file with automated tests according to (project testing standards document). You can get tests for dozens of weird edge cases in a few minutes. You can specify unit tests, integration tests, etc., depending on whether you want dependencies run or mocked.
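To make that concrete, here's a self-contained sketch of the kind of edge-case tests this workflow yields: pytest-style unit tests with the dependency injected and mocked rather than run. All names here (charge_card, gateway) are invented for illustration, not from any real project.

```python
# Self-contained sketch: a tiny function with an injectable dependency,
# plus the kind of edge-case tests an LLM can generate in minutes.
from unittest.mock import Mock

import pytest


class PaymentError(Exception):
    pass


def charge_card(gateway, amount_cents: int) -> dict:
    """Submit a charge; retry once on timeout; reject non-positive amounts."""
    if amount_cents <= 0:
        raise PaymentError("amount must be positive")
    try:
        return gateway.submit(amount_cents)
    except TimeoutError:
        return gateway.submit(amount_cents)  # single retry


def test_rejects_zero_amount():
    # Edge case: zero-value charges should fail fast.
    with pytest.raises(PaymentError):
        charge_card(Mock(), amount_cents=0)


def test_retries_once_on_timeout():
    # Edge case: one gateway timeout should be retried, not surfaced.
    gateway = Mock()
    gateway.submit.side_effect = [TimeoutError, {"status": "ok"}]
    assert charge_card(gateway, amount_cents=500) == {"status": "ok"}
    assert gateway.submit.call_count == 2
```

Swapping the Mock for a real gateway client turns the same tests into integration tests, which is the run-or-mock distinction above.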
aaronsterling.bsky.social
Creators and users of LLMs potentially could. My own work with LLMs is bound by HIPAA and a "super-HIPAA" requirement called 42 CFR Part 2. A negligent data breach could cause the negligent person to face prison time. That's part of what motivates deposition of LLMs: to reduce corporate legal risk.
aaronsterling.bsky.social
Like any powerful tool, it's easy to use incorrectly. It requires a fair amount of training and time investment to write strong prompts, to write tests that verify accuracy of those prompts, etc. Some of my prompts are multi-page standards documents.
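On "tests that verify accuracy of those prompts": one workable pattern is a prompt-regression test that pins the prompt text and asserts checkable properties of the model's reply. A minimal sketch, assuming a hypothetical ask() adapter around whatever client you use (shown with a fake so the pattern itself runs):

```python
# Minimal prompt-regression test. "ask" is a hypothetical adapter around
# a real model client; it's injected so the pattern runs with a fake.
import json

PROMPT = (
    "Extract the ICD-10 codes mentioned in the note below. "
    "Respond with a JSON list of strings and nothing else.\n\nNOTE: {note}"
)


def extract_codes(ask, note: str) -> list[str]:
    """Run the pinned prompt through the model and parse the reply."""
    reply = ask(PROMPT.format(note=note))
    return json.loads(reply)  # fails loudly if the model drifts off-format


def test_extraction_prompt_stays_on_format():
    def fake_ask(prompt: str) -> str:
        return '["E11.9", "I10"]'  # stand-in for a real model client

    codes = extract_codes(fake_ask, "Patient presents with E11.9 and I10.")
    assert set(codes) == {"E11.9", "I10"}
```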
aaronsterling.bsky.social
Most software programmers are not managers. Good prompting requires skills very similar to writing requirements for contract work. It's a learnable skill, but most people using LLMs have not yet learned it.
aaronsterling.bsky.social
I have a meeting in three minutes. But here is one white paper that appeared in that search. I attended a talk by these folks earlier this year. Teams of attorneys and data scientists depose LLMs as part of a legal red team. www.jdsupra.com/legalnews/le...
Legal Red Teaming, One Year In: Reports from the Field | JD Supra
Introduction - In our June 2024 white paper, Legal red teaming: A systematic approach to assessing legal risk of generative AI models,...
www.jdsupra.com
aaronsterling.bsky.social
But accurate, given the applications I work with.
aaronsterling.bsky.social
I imagine it depends on the field. I've published academic papers, and I'm in medical software writing clinic-ready applications. But don't take my word for it. You could look at the AI deployment department of The Mayo Clinic for a cutting edge example.
aaronsterling.bsky.social
There's a close connection to legal rules of evidence, if that reassures you. The search terms "Legal Red Team" or "Deposing LLMs to verify correctness" will give you more background.
aaronsterling.bsky.social
I'm reporting lived experience. There's nothing to argue with. But to respond: grad students, professors, doctors, all make mistakes. If the error rate is on par with human error, you just have to perform the same verifications you would anyway.
aaronsterling.bsky.social
Common sense has limits. E.g.: academic writing is specialized. A sociologist can craft a prompt and tell the LLM to do a deep dive and respond as an economist or an anthropologist, about recent papers that bear on sociology question X. Or: write a survey paper on subject Y, with cross-disciplinary references.
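A sketch of that pattern with the OpenAI Python client. The model name and prompt wording are illustrative only, and any citations the model returns still need human verification, per the rest of this thread.

```python
# Cross-disciplinary prompt pattern: a persona in the system message,
# a field-specific deep-dive request in the user message.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

resp = client.chat.completions.create(
    model="gpt-4o",  # illustrative model choice
    messages=[
        {
            "role": "system",
            "content": (
                "You are an economist reviewing recent literature for a "
                "sociologist. Flag methods and findings that bear on their "
                "question, and note where the fields disagree."
            ),
        },
        {
            "role": "user",
            "content": (
                "Deep dive: which recent economics papers bear on network "
                "effects in labor-market mobility? Name specific papers and "
                "say why each matters to a sociologist studying the topic."
            ),
        },
    ],
)
print(resp.choices[0].message.content)  # verify every citation by hand
```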