Aaron Sterling
@aaronsterling.bsky.social
440 followers 1.5K following 710 posts
CEO, Thistleseeds. Personal account. Current primary project: tech for substance use disorder programs.
aaronsterling.bsky.social
Lonely women have money. Also: sidestepping it is tricky. In the last week, I've talked to two different lawyers about what to do if we ask someone to upload a photo and they upload their dick. Ignore? Laugh along? Admonish? Is behavior different if chatbot avi is male/female? It's a new world.
aaronsterling.bsky.social
In January, I thought there was at least a 51% chance of HIPAA being de facto (and maybe de jure) repealed, driven by the interest in access to reproductive health data. I think that chance is higher now, though it's not clear whether large health data vendors will change behavior even if it happens.
Reposted by Aaron Sterling
timfduffy.com
Notes on the Haiku 4.5 system card: assets.anthropic.com/m/12f214efcc...

Anthropic is releasing it as ASL-2, unlike Sonnet 4.5/Opus 4+ which are considered ASL-3
aaronsterling.bsky.social
"Boyfriend" is important. Women spend more per OnlyFans transaction per capita than men do. More men subscribe to OF than women, of course. But perhaps the text/story primacy of ChatGPT will mean there are more female paid users than male, while OF is more visual.
aaronsterling.bsky.social
There's a months-long Trust and Safety whack-a-mole where goon communities post porn jailbreak prompts to GitHub, OpenAI renders the jailbreak ineffective, and the cycle continues. Might be easier to manage if there's an official front door.
Reposted by Aaron Sterling
werner.social
No data, no AI, no progress. My @AmazonScience article explores how multi-layered mapping + petabyte-scale cloud infrastructure helps save lives in time of crisis. Building AI without addressing the fundamental data divide means solving the wrong problems. amazon.science/blog/why-ai-...
Why AI for good depends on good data
New technologies are helping vulnerable communities produce maps that integrate topographical, infrastructural, seasonal, and real-time data — an essential tool for many humanitarian endeavors.
amazon.science
aaronsterling.bsky.social
My "original proposal" was a recommendation to an experienced medical researcher. Research maturity gives you the ability to prima facie check that a methodology appears evidence-based. Hire a subject matter expert to verify, if you want to deploy to clinic. (My last post here, best wishes.)
aaronsterling.bsky.social
What you describe is better done by humans. The advantage of LLMs is the ability to do the same work at scale. To find useful references in a field you have no experience in, to process petabytes of health data, etc.
aaronsterling.bsky.social
That is a proven technique for getting more useful responses: fewer errors, and the errors that remain are easier to spot. You still need to check the work, just as you would with a human assistant. It's much like supervising someone with a lot of intelligence and ego, but little practical experience.
aaronsterling.bsky.social
I'm not having a rhetorical discussion; I'm telling you what's happening in real life. Risk Management departments are managing LLM error similarly to how they manage human error. Legal red teaming is a tool for this. Management of risk is possible; elimination is impossible, and always has been.
aaronsterling.bsky.social
One of the core techniques of legal red teaming is to depose models. Think of it like a variation of moot court. Even when attorneys are not directly involved in such depositions (though that particular group often is), it's a useful metaphor, as discussed in the screenshot I posted earlier.
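Mechanically, a deposition can be scripted as cross-examination: ask a question, press the model on its answer, and log the exchange for attorney review. The sketch below is my own illustration of that pattern, not the method from the white paper; ask() is a hypothetical stand-in for whatever model client you use.

```python
# Illustrative deposition loop (not the white paper's method): ask,
# challenge, and keep a transcript so inconsistent answers surface.
# "ask" is a hypothetical stand-in for a real model client.
def depose(ask, question: str, challenges: list[str]) -> dict:
    transcript = [("Q", question), ("A", ask(question))]
    for challenge in challenges:
        # Re-ask under pressure, with the running transcript as context.
        context = "\n".join(f"{who}: {text}" for who, text in transcript)
        transcript.append(("Q", challenge))
        transcript.append(("A", ask(f"{context}\nQ: {challenge}")))
    return {"question": question, "transcript": transcript}


record = depose(
    ask=lambda prompt: "...model reply...",  # fake client for demonstration
    question="May a counselor confirm a patient's enrollment to a caller?",
    challenges=[
        "Are you certain? Cite the rule you are relying on.",
        "Earlier you said otherwise. Which answer is correct?",
    ],
)
```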
aaronsterling.bsky.social
A cautious use is: cover the attached file with automated tests according to (project testing standards document). You can get tests for dozens of weird edge cases in a few minutes. You can specify unit tests, integration tests, etc., depending on whether you want dependencies run or mocked.
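To make that concrete, here's a self-contained sketch of the kind of edge-case tests this workflow yields: pytest-style unit tests with the dependency injected and mocked rather than run. All names here (charge_card, gateway) are invented for illustration, not from any real project.

```python
# Self-contained sketch: a tiny function with an injectable dependency,
# plus the kind of edge-case tests an LLM can generate in minutes.
from unittest.mock import Mock

import pytest


class PaymentError(Exception):
    pass


def charge_card(gateway, amount_cents: int) -> dict:
    """Submit a charge; retry once on timeout; reject non-positive amounts."""
    if amount_cents <= 0:
        raise PaymentError("amount must be positive")
    try:
        return gateway.submit(amount_cents)
    except TimeoutError:
        return gateway.submit(amount_cents)  # single retry


def test_rejects_zero_amount():
    # Edge case: zero-value charges should fail fast.
    with pytest.raises(PaymentError):
        charge_card(Mock(), amount_cents=0)


def test_retries_once_on_timeout():
    # Edge case: one gateway timeout should be retried, not surfaced.
    gateway = Mock()
    gateway.submit.side_effect = [TimeoutError, {"status": "ok"}]
    assert charge_card(gateway, amount_cents=500) == {"status": "ok"}
    assert gateway.submit.call_count == 2
```

Swapping the Mock for a real gateway client turns the same tests into integration tests, which is the run-or-mock distinction above.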
aaronsterling.bsky.social
Creators and users of LLMs potentially could. My own work with LLMs is bound by HIPAA and a "super-HIPAA" requirement called 42 CFR Part 2. A negligent data breach could cause the negligent person to face prison time. That's part of what motivates deposition of LLMs: to reduce corporate legal risk.
aaronsterling.bsky.social
Like any powerful tool, it's easy to use incorrectly. It requires a fair amount of training and time investment to write strong prompts, to write tests that verify accuracy of those prompts, etc. Some of my prompts are multi-page standards documents.
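On "tests that verify accuracy of those prompts": one workable pattern is a prompt-regression test that pins the prompt text and asserts checkable properties of the model's reply. A minimal sketch, assuming a hypothetical ask() adapter around whatever client you use (shown with a fake so the pattern itself runs):

```python
# Minimal prompt-regression test. "ask" is a hypothetical adapter around
# a real model client; it's injected so the pattern runs with a fake.
import json

PROMPT = (
    "Extract the ICD-10 codes mentioned in the note below. "
    "Respond with a JSON list of strings and nothing else.\n\nNOTE: {note}"
)


def extract_codes(ask, note: str) -> list[str]:
    """Run the pinned prompt through the model and parse the reply."""
    reply = ask(PROMPT.format(note=note))
    return json.loads(reply)  # fails loudly if the model drifts off-format


def test_extraction_prompt_stays_on_format():
    def fake_ask(prompt: str) -> str:
        return '["E11.9", "I10"]'  # stand-in for a real model client

    codes = extract_codes(fake_ask, "Patient presents with E11.9 and I10.")
    assert set(codes) == {"E11.9", "I10"}
```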
aaronsterling.bsky.social
Most software programmers are not managers. Good prompting requires skills very similar to writing requirements for contract work. It's a learnable skill, but most people using LLMs have not yet learned it.
aaronsterling.bsky.social
I have a meeting in three minutes. But here is one white paper that appeared in that search. I attended a talk by these folks earlier this year. Teams of attorneys and data scientists depose LLMs as part of a legal red team. www.jdsupra.com/legalnews/le...
Legal Red Teaming, One Year In: Reports from the Field | JD Supra
Introduction - In our June 2024 white paper, Legal red teaming: A systematic approach to assessing legal risk of generative AI models,...
www.jdsupra.com
aaronsterling.bsky.social
But accurate, given the applications I work with.
aaronsterling.bsky.social
I imagine it depends on the field. I've published academic papers, and I'm in medical software writing clinic-ready applications. But don't take my word for it. You could look at the AI deployment department of The Mayo Clinic for a cutting edge example.
aaronsterling.bsky.social
There's a close connection to legal rules of evidence, if that reassures you. The search terms "Legal Red Team" or "Deposing LLMs to verify correctness" will give you more background.
aaronsterling.bsky.social
I'm reporting lived experience. There's nothing to argue with. But to respond: grad students, professors, doctors, all make mistakes. If the error rate is on par with human error, you just have to perform the same verifications you would anyway.
aaronsterling.bsky.social
Common sense has limits. E.g.: academic writing is specialized. A sociologist can craft a prompt and tell the LLM to do a deep dive and respond as an economist or an anthropologist, about recent papers that bear on sociology question X. Or: write a survey paper on subject Y, with cross-disciplinary references.
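A sketch of that pattern with the OpenAI Python client. The model name and prompt wording are illustrative only, and any citations the model returns still need human verification, per the rest of this thread.

```python
# Cross-disciplinary prompt pattern: a persona in the system message,
# a field-specific deep-dive request in the user message.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

resp = client.chat.completions.create(
    model="gpt-4o",  # illustrative model choice
    messages=[
        {
            "role": "system",
            "content": (
                "You are an economist reviewing recent literature for a "
                "sociologist. Flag methods and findings that bear on their "
                "question, and note where the fields disagree."
            ),
        },
        {
            "role": "user",
            "content": (
                "Deep dive: which recent economics papers bear on network "
                "effects in labor-market mobility? Name specific papers and "
                "say why each matters to a sociologist studying the topic."
            ),
        },
    ],
)
print(resp.choices[0].message.content)  # verify every citation by hand
```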