Lily Chen
@lilywchen.bsky.social
MIT
lilywchen.bsky.social
To address these challenges, we propose a communication model that:
- clarifies intent through dialogue
- guides claims toward verifiable evidence
- explains diverse expert perspectives instead of forcing consensus

It reframes medical fact-checking as patient–expert dialogue

4/
This figure illustrates the communication model for fact-checking: the system engages the patient in dialogue, asking clarifying questions, filling contextual gaps, verifying the claim, and addressing any misconceptions.
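To make that loop concrete, here is a minimal sketch of how a patient-in-the-loop verification cycle could be structured. It is an illustration only, not the paper's implementation; the types and the helper callables (`needs_clarification`, `ask`, `retrieve`, `synthesize`) are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Claim:
    text: str
    context: str                          # surrounding Reddit post text
    clarifications: list[str] = field(default_factory=list)

@dataclass
class Verdict:
    label: str                            # e.g. "supported", "refuted", "unverifiable"
    explanations: list[str]               # one explanation per expert perspective

def fact_check_dialogue(claim: Claim, needs_clarification, ask, retrieve, synthesize) -> Verdict:
    """Hypothetical dialogue loop: clarify intent, ground in evidence, explain perspectives."""
    while needs_clarification(claim):              # 1. clarify intent through dialogue
        claim.clarifications.append(ask(claim))
    evidence = retrieve(claim)                     # 2. guide the claim toward verifiable evidence
    return synthesize(claim, evidence)             # 3. explain expert perspectives, no forced consensus
```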
lilywchen.bsky.social
Verifying medical claims wasn’t straightforward for experts. They struggled with:

1️⃣ linking claims to evidence
2️⃣ interpreting underspecified or misguided claims
3️⃣ labeling nuanced claims—often with disagreement

These challenges are inherent to end-to-end fact-checking 🚧

3/
Inter-annotator agreement across different stages of the fact-checking pipeline is shown; blue labels represent abstract-level annotations and pink labels represent synthesis-level annotations. One example claim is unverifiable: no randomized controlled trial (RCT) has examined the interaction between grapefruit, Oxcarbazepine, and epilepsy, and all experts judged the claim unverifiable from the available RCTs, attributing this to its high specificity and noting that a matching trial would be unlikely, and potentially unethical, to run. Examples of underspecified claims are also shown.
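For readers unfamiliar with how this kind of disagreement is quantified: inter-annotator agreement is commonly reported with a chance-corrected statistic such as Cohen's kappa. The sketch and labels below are purely illustrative; the paper's exact agreement measure may differ.

```python
from collections import Counter

def cohen_kappa(labels_a: list[str], labels_b: list[str]) -> float:
    """Chance-corrected agreement between two annotators over the same items."""
    n = len(labels_a)
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n        # observed agreement
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    categories = set(labels_a) | set(labels_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in categories) / (n * n)   # agreement expected by chance
    return 1.0 if p_e == 1 else (p_o - p_e) / (1 - p_e)

# Toy example: two experts assigning synthesis-level labels to five claims.
expert_1 = ["supported", "unverifiable", "refuted", "unverifiable", "supported"]
expert_2 = ["supported", "unverifiable", "unverifiable", "unverifiable", "supported"]
print(round(cohen_kappa(expert_1, expert_2), 2))   # 0.67: substantial but imperfect agreement
```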
lilywchen.bsky.social
We study real-world medical claims from Reddit, preserving post context and verifying them with RCT abstracts. 📄
Six experts annotated 20 claims, each claim paired with 10 RCT abstracts.

Annotations span:
1️⃣ abstract relevance
2️⃣ claim-level evidence quality
3️⃣ explanations citing abstracts

2/
Interface view showing a medical claim with Reddit post context and extracted PIO elements. Interface view showing ten RCT abstracts with relevance annotation fields. Interface view of claim-level evidence tiering from aggregated abstract annotations. Interface view of the final synthesis step: expert annotators assign a claim verification label and provide grounded explanations citing supporting abstracts.
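The interface stages above suggest a simple per-claim annotation record. Here is a rough sketch of what such a schema might look like; the field names are illustrative, not the paper's.

```python
from dataclasses import dataclass

@dataclass
class AbstractAnnotation:
    abstract_id: str
    relevance: str              # stage 1: is this RCT abstract relevant to the claim?

@dataclass
class ClaimAnnotation:
    claim: str
    reddit_context: str                     # preserved post context
    pio: dict                               # extracted Population / Intervention / Outcome elements
    abstracts: list[AbstractAnnotation]     # the 10 abstracts retrieved for this claim
    evidence_tier: str                      # stage 2: claim-level evidence quality
    label: str                              # final claim verification label
    explanation: str                        # stage 3: explanation citing supporting abstracts
```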
lilywchen.bsky.social
Are we fact-checking medical claims the right way? 🩺🤔

Probably not. In our study, even experts struggled to verify Reddit health claims using end-to-end systems.

We show why, and argue that fact-checking should be a dialogue with patients in the loop.

arxiv.org/abs/2506.20876

🧵1/
An overview of our AI-in-the-loop expert study pipeline: given a claim from a subreddit, we extract the PIO (Population, Intervention, Outcome) elements and retrieve the evidence automatically. The claim, its context, and the retrieved evidence are then presented to a medical expert, who provides a judgment and a rationale for the factuality of the claim.
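As a rough illustration of that flow (not the authors' code; `extract_pio`, `retrieve_rcts`, and `expert_judgment` are hypothetical stand-ins for the automatic extraction, retrieval, and expert annotation steps):

```python
def expert_study_pipeline(post: str, extract_pio, retrieve_rcts, expert_judgment):
    """AI-in-the-loop sketch: automatic claim processing, human expert verdict."""
    claim, pio = extract_pio(post)         # claim text plus PIO elements from the Reddit post
    abstracts = retrieve_rcts(pio)         # automatically retrieved RCT abstracts
    label, rationale = expert_judgment(claim, post, abstracts)   # expert judges factuality
    return label, rationale
```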