Hayoung Jung
@hayoungjung.bsky.social
PhD student at @princetoncitp.bsky.social. Previously @uwcse.bsky.social. Website: hayoungjung.me
Reposted by Hayoung Jung
frasalvi.bsky.social
🌱✨ Life update: I just started my PhD at Princeton University!

I will be supervised by @manoelhortaribeiro.bsky.social and affiliated with Princeton CITP.

It's only been a month, but the energy feels amazing. Very grateful for such a welcoming community. Excited for what’s ahead! 🚀
Reposted by Hayoung Jung
manoelhortaribeiro.bsky.social
Social media feeds today are optimized for engagement, often leading to misalignment between users' intentions and technology use.

In a new paper, we introduce Bonsai, a tool to create feeds based on stated preferences, rather than predicted engagement.

arxiv.org/abs/2509.10776
hayoungjung.bsky.social
Lastly, I would like to thank my awesome collaborators @shravika-mittal.bsky.social, Ananya Aatreya (my first mentee!), and @navreetkaur.bsky.social, as well as my faculty mentors @tanumitra.bsky.social and @munmun10.bsky.social, who taught me a lot during this project!
hayoungjung.bsky.social
🙌 We hope public health, platforms, & researchers build on MythTriage to scale OUD myth detection on video platforms.
To support this, we’re releasing everything:
🧠 Models: huggingface.co/SocialCompUW...
💻 Code: github.com/hayoungjungg...
📊 Data: github.com/hayoungjungg...
SocialCompUW/youtube-opioid-myth-detect-M1 · Hugging Face
hayoungjung.bsky.social
🤩 Lastly, we’re excited because this work shows how model cascades, a decade-old but simple idea, scale with LLM advancements to tackle real high-stakes health issues like OUD myths.

Past work tested model cascades on standard benchmarks (e.g., SQuAD). We validate them in the wild!
hayoungjung.bsky.social
Our findings offer actionable insights in the context of the ongoing opioid crisis—showing the value of MythTriage:

👩‍⚕️Public health: Inform targeted interventions & debunk myths.
🛡️Platforms: Provide a scalable auditing pipeline to flag high-risk content & improve moderation.
hayoungjung.bsky.social
📊Finding #3: YouTube’s recommendation continued surfacing myth-supporting content.

➡️12.7% of recs from myth videos led to more myths initially—rising to 22% at deeper levels.

⚠️ Moderation should target these rec pathways that reinforce harmful myths.
hayoungjung.bsky.social
📊 Finding #2: How you filter your search results matters! Switching from “Relevance” to “Upload Date” or “Rating” increases exposure to myths—echoing the same patterns seen in my COVID-19 misinformation audit: ojs.aaai.org/index.php/IC...

😬A few clicks can change your exposure to myths!
hayoungjung.bsky.social
🫶Thanks to MythTriage, we present the first large-scale study of OUD-related myths on YouTube!

📊 Finding #1: Nearly 20% of YouTube search results support OUD myths, while 30% oppose.

😰Although more results oppose the myths, myth-supporting content is widespread and risks shaping how people understand treatment.
hayoungjung.bsky.social
⚙️So how does MythTriage perform?
📊 Achieves 0.68-0.86 macro F1 and defers only 5-67% of the examples to the costly LLM.

In practice, MythTriage:
💸 Cuts financial costs by 98% vs experts and by 94% vs LLM labeling
⏱️ Cuts time costs by 96% vs experts & by 76% vs LLM labeling
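As a rough illustration of why the deferral rate drives these savings, here is a minimal sketch of cascade cost relative to sending every example to the large model. The function names and all unit costs are hypothetical and are not taken from the paper; the thread does not say how its cost figures were computed.

```python
def cascade_cost(n_items: int, small_cost: float, large_cost: float,
                 defer_rate: float) -> float:
    """Total cost when every item goes through the small model and a
    fraction `defer_rate` is additionally sent to the large model.

    All costs are hypothetical per-item units (e.g. dollars or seconds).
    """
    if not 0.0 <= defer_rate <= 1.0:
        raise ValueError("defer_rate must be between 0 and 1")
    return n_items * small_cost + n_items * defer_rate * large_cost


def savings_vs_large_only(n_items: int, small_cost: float, large_cost: float,
                          defer_rate: float) -> float:
    """Fraction of cost saved compared with labeling everything with the
    large model alone."""
    baseline = n_items * large_cost
    return 1.0 - cascade_cost(n_items, small_cost, large_cost, defer_rate) / baseline


# Hypothetical numbers for illustration only: a free small model and a
# 25% deferral rate.
example_savings = savings_vs_large_only(
    n_items=1000, small_cost=0.0, large_cost=1.0, defer_rate=0.25
)
```

Under these made-up assumptions, the lower the deferral rate, the closer the cascade's cost gets to that of the small model alone.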
hayoungjung.bsky.social
🚀 Our solution: MythTriage
👉 Uses lightweight DeBERTa for routine cases
👉 Defers harder ones to GPT-4o (high-performing but costly)

The trick? We distilled DeBERTa on GPT-4o’s synthetic labels—achieving strong performance without massive expert-labeled data.
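The cascade idea described above can be sketched in a few lines. This is a generic confidence-threshold cascade, not MythTriage's actual implementation: the thread does not state the deferral rule, so the threshold, the label names, and the toy stand-in models below are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

Label = str


@dataclass
class CascadeResult:
    label: Label
    deferred: bool  # True if the expensive model produced the label


def cascade_predict(
    text: str,
    small_model: Callable[[str], Tuple[Label, float]],
    large_model: Callable[[str], Label],
    threshold: float = 0.9,  # hypothetical confidence cutoff
) -> CascadeResult:
    """Use the small model when it is confident; otherwise defer."""
    label, confidence = small_model(text)
    if confidence >= threshold:
        return CascadeResult(label=label, deferred=False)
    return CascadeResult(label=large_model(text), deferred=True)


# Toy stand-ins so the sketch runs without any real models (assumptions).
def toy_small_model(text: str) -> Tuple[Label, float]:
    if "cure" in text.lower():
        return "supports_myth", 0.95
    return "neither", 0.55


def toy_large_model(text: str) -> Label:
    return "opposes_myth"


confident = cascade_predict("This is a miracle cure", toy_small_model, toy_large_model)
uncertain = cascade_predict("Talk to your doctor", toy_small_model, toy_large_model)
```

In a real setup, `small_model` would wrap the distilled lightweight classifier and `large_model` would wrap the LLM call; the share of inputs with `deferred=True` is the deferral rate.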
hayoungjung.bsky.social
💡Challenge: Detecting OUD myths on video platforms at *scale* is tough: clinical expertise and labeling are essential, but expert labeling is slow and costly.

🤖LLMs show promise, but high compute & API costs—especially for long-form video—limit their practicality for large-scale detection.
hayoungjung.bsky.social
🩺 To rigorously detect OUD myths in our datasets, we collaborated closely with clinical experts to:

✅Validate eight pervasive myths on OUD (see examples below!)
✅Create and refine annotation guidelines
✅Build a gold-standard dataset: 310 videos labeled across 8 myths (~2.5K expert labels).
hayoungjung.bsky.social
To measure the scale and prevalence of myths on YouTube, we curated opioid and OUD search queries based on real-world search interests. Using these queries, we built two datasets on YouTube:

1️⃣ OUD Search Dataset: 2.9K search results
2️⃣ OUD Recs Dataset: 343K video recommendations
hayoungjung.bsky.social
🛜Facing offline stigma, many turn to online platforms (YouTube) for health info & recovery.

‼️But myths fuel treatment hesitancy, distrust in healthcare, & stigma.

🤔Understanding the scale of myths is crucial for health officials & platforms to design effective interventions.
hayoungjung.bsky.social
🚨YouTube is a key source of health info, but it’s also rife with dangerous myths on opioid use disorder (OUD), a leading cause of death in the U.S.

To understand the scale of such misinformation, our #EMNLP2025 paper introduces MythTriage, a scalable system to detect OUD myths 🧵
Reposted by Hayoung Jung
dustinbwright.com
🎉 Our work on attribution in summarization is now accepted to #EMNLP2025 main! 🎉

"Unstructured Evidence Attribution for Long Context Query Focused Summarization"

w/ @zainmujahid.me , Lu Wang, @iaugenstein.bsky.social , and @davidjurgens.bsky.social
Reposted by Hayoung Jung
omelmalki.bsky.social
Want $50 to create your own feed with natural language?

We’re a group of researchers at Princeton studying how people could more easily build their own feeds.

Join our study to try it out and tell us what you think of the experience!

👉 Sign up at forms.gle/MkSGKzxDfBEc... (takes less than 1 min)
hayoungjung.bsky.social
I would also love to be added!!
hayoungjung.bsky.social
On my way to Copenhagen, where I will give an invited talk at a workshop and present this work at ICWSM! Super excited to meet everyone -- please DM me if you would like to chat!
hayoungjung.bsky.social
How does YouTube’s search algorithm handle COVID🦠 misinfo in the United States🇺🇸(US) and South Africa🇿🇦(SA)?

In our #icwsm '25 paper w/ @prerna6.bsky.social @tanumitra.bsky.social, we found bots in SA received significantly more misinfo in top-10 search results, which account for 95% of user traffic
hayoungjung.bsky.social
Thank you for the shoutout, Joey! :)
hayoungjung.bsky.social
13/ Huge thanks to Prof. Tanu Mitra (@tanumitra.bsky.social) for this incredible opportunity and to my amazing PhD mentor, Prof. Prerna Juneja (@prerna6.bsky.social), for guiding me throughout. I have learned so much from you and your support. Thank you for introducing me to the world of research!🙌