Danny Wilf-Townsend
@dannywt.bsky.social
650 followers 430 following 81 posts
Associate Professor of Law at Georgetown Law thinking, writing, and teaching about civil procedure, consumer protection, and AI. Blog: https://www.wilftownsend.net/ Academic papers: https://papers.ssrn.com/sol3/cf_dev/AbsByAuth.cfm?per_id=2491047
Reposted by Danny Wilf-Townsend
andrewkjennings.com
It's not false economy. The time it takes to generate a 50-state survey by hand is far greater than the time it takes to verify Lexis/Westlaw's AI answers and correct the errors.
dannywt.bsky.social
An update for Sonnet 4.5, released last week: it scored 60.2% on my final exam (with extended thinking on; 54.4% without it). That's a big step up (~20 percentage points) from Opus 4.1's scores, and puts Sonnet 4.5 close to, if slightly behind, other leading models. On a human curve, that's roughly an A-/B+
dannywt.bsky.social
For my latest round of informal tests of large language models, I looked at how good different models are at taking a law school exam—and also whether they are capable of grading exam answers in a consistent and reasonably accurate way. 🧵
www.wilftownsend.net/p/chatgpt-ta...
ChatGPT takes—and grades—my law school exam
The latest round of informal testing of large language models on legal questions
www.wilftownsend.net
dannywt.bsky.social
Oh, also, from the parochial law professor standpoint (i.e., the most important standpoint), it makes "looking for hallucinations" a less reliable way to monitor student AI use on exams or papers.
dannywt.bsky.social
...has gone up. Certainly not to the point where I would recommend relying on AI for legal advice (or to write your briefs), but the size of the change does seem notable for at least those (and probably other) reasons.
dannywt.bsky.social
...a few thoughts: (1) for practitioners using AI, I would think that fewer hallucinations makes it faster and cheaper to review/check/edit AI-generated outputs. And (2) for non-experts using AI, who aren't editing but just reading (or even relying on) answers, the quality of those answers...
dannywt.bsky.social
I wouldn't draw a big conclusion specifically from this exercise; but it is consistent with my experience that hallucinations in answering legal questions seem way down in general now compared to, e.g., a year ago. In terms of the implications of that broader fact (if it is a fact)...
dannywt.bsky.social
One other note: across the five exam answers and dozens of answer evaluations generated here, I did not notice a single hallucination. This test wasn't designed to measure hallucination rates, but it's consistent with the general sense that they have dropped significantly
Reposted by Danny Wilf-Townsend
ahemmer.bsky.social
Our office is again hiring one or more attorneys for a one-year fellowship to work directly with the Illinois Solicitor General and her team, beginning in August/September 2026.

www.governmentjobs.com/careers/ilag...
Job Opportunities | Office of the Illinois Attorney General
www.governmentjobs.com
dannywt.bsky.social
Overall, GPT-5-Pro was good enough to use for my (informal) approach here—it was both internally consistent and looked good in accuracy spot checks. Its grades show some models scoring in the A- to A range, consistent with what others have found, too.
dannywt.bsky.social
It turns out that some models are deeply inaccurate, and some are frequently inconsistent, but a few are reasonably consistent and accurate. And along the way, I learned that human graders are sometimes less consistent than we might hope.
[Image: text describing consistency rates in human graders]
dannywt.bsky.social
The goal here wasn't to use them to grade student work—something I would not recommend. It's instead to see if they can be used to automate the evaluation of other language models: can we use LLMs to get a sense of different models' relative capacities on legal questions?
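The setup described here is an LLM-as-judge loop: one model grades another model's exam answers, and you check whether its scores are internally consistent before trusting them. A minimal sketch of that scaffolding is below; `call_grader_model` is a hypothetical stand-in for a real API call (the actual grading prompt, rubric, and model are not from the post), stubbed with canned scores so the structure is runnable.

```python
import statistics

# Hypothetical rubric text -- the real exam rubric is not public.
RUBRIC = "Grade this exam answer from 0-100 against the model answer and rubric."

def call_grader_model(prompt: str, run: int) -> int:
    """Stand-in for a real LLM API call (e.g. to a grader model).
    Stubbed with canned scores so this sketch runs without network access."""
    canned = [91, 89, 92]  # pretend outputs from three grading runs
    return canned[run % len(canned)]

def grade_with_consistency(answer: str, runs: int = 3) -> dict:
    """Grade the same answer several times and report the spread,
    since a judge model is only useful if its scores are stable."""
    scores = [call_grader_model(f"{RUBRIC}\n\n{answer}", r) for r in range(runs)]
    return {
        "scores": scores,
        "mean": statistics.mean(scores),
        "spread": max(scores) - min(scores),  # max disagreement across runs
    }

result = grade_with_consistency("Student answer text here...")
print(result["mean"], result["spread"])
```

In a real version you would replace the stub with an actual model call, run it across every (model, answer) pair, and spot-check a sample of grades by hand, as the post describes.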
dannywt.bsky.social
The review of "The Deletion Remedy" also discusses Christina Lee's "Beyond Algorithmic Disgorgement," papers.ssrn.com/sol3/papers..... Christina is on the job market this year, and if I were on a hiring committee I would definitely be taking a look.
Beyond Algorithmic Disgorgement: Remedying Algorithmic Harms
AI regulations are popping up around the world, and they mostly involve ex-ante risk assessment and mitigating those risks. But even with careful risk assessment…
papers.ssrn.com
dannywt.bsky.social
Indicting based on sandwich type could lead to quite a pickle. Let's hope this jury's not on a roll.
dannywt.bsky.social
Me too! They must be targeting proceduralists. Probably due to our lax morals.
dannywt.bsky.social
A nice quick read from my colleague @JonahPerlin about an issue that I see a lot of people oversimplifying: whether an attorney's use of a generative AI tool waives privilege. This is an area where I'm very interested to see how the law develops. news.bloomberglaw.com/us-law-week/...
No, Generative AI Didn’t Just Kill the Attorney-Client Privilege
Opinion: Georgetown Law professor Jonah Perlin says using third-party technology doesn't categorically waive the attorney-client privilege.
news.bloomberglaw.com
Reposted by Danny Wilf-Townsend
melaniemitchell.bsky.social
In a stunning moment of self-delusion, the Wall Street Journal headline writers admitted that they don't know how LLM chatbots work.
dannywt.bsky.social
And thank you to @wertwhile.bsky.social for the shoutout and discussion of my work!
dannywt.bsky.social
And I completely agree with what @wertwhile.bsky.social and @weisenthal.bsky.social say about OpenAI's o3 being the model to focus on—lots of people are forming impressions about AI capabilities based on older or less powerful tools, and aren't seeing the current level of capabilities as a result.
dannywt.bsky.social
Finally, the work of mine that is discussed a bit is this informal testing of AI models on legal questions. The most recent post is here: www.wilftownsend.net/p/testing-ge...
Testing generative AI on legal questions—May 2025 update
The latest round of my informal testing
www.wilftownsend.net
dannywt.bsky.social
A very pleasant surprise to listen to one of my favorite podcasts and hear my own work being discussed. And it's an excellent episode and overview for anyone thinking about AI's effects on the legal profession. Some thoughts / suggestions below for anyone who wants further reading:
dannywt.bsky.social
What an interesting question — cool study.