Joel Mire
@joelmire.bsky.social
84 followers 190 following 11 posts
Master’s student @ltiatcmu.bsky.social. he/him
Reposted by Joel Mire
ninabegus.bsky.social
10 years after the initial idea, Artificial Humanities is here! Thanks so much to all who have preordered it. I hope you enjoy reading it and find this research approach as generative as I do. More to come!
Reposted by Joel Mire
jordant.bsky.social
🏳️‍🌈🎨💻📢 Happy to share our workshop study on queer artists’ experiences critically engaging with GenAI

Looking forward to presenting this work at #FAccT2025 and you can read a pre-print here:
arxiv.org/abs/2503.09805
Academic paper titled "Un-Straightening Generative AI: How Queer Artists Surface and Challenge the Normativity of Generative AI Models"

The piece is written by Jordan Taylor, Joel Mire, Franchesca Spektor, Alicia DeVrio, Maarten Sap, Haiyi Zhu, and Sarah Fox.

An image titled "24 Attempts at Intimacy," showing 24 AI-generated images created from the word "intimacy," none of which appears to include same-gender couples
Reposted by Joel Mire
lindiatjuatja.bsky.social
When it comes to text prediction, where does one LM outperform another? If you've ever worked on LM evals, you know this question is a lot more complex than it seems. In our new #acl2025 paper, we developed a method to find fine-grained differences between LMs:

🧵1/9
Reposted by Joel Mire
shaily99.bsky.social
🖋️ Curious how writing differs across (research) cultures?
🚩 Tired of “cultural” evals that don't consult people?

We engaged with interdisciplinary researchers to identify & measure ✨cultural norms✨in scientific writing, and show that❗LLMs flatten them❗

📜 arxiv.org/abs/2506.00784

[1/11]
An overview of the work “Research Borderlands: Analysing Writing Across Research Cultures” by Shaily Bhatt, Tal August, and Maria Antoniak. The overview describes that we survey and interview interdisciplinary researchers (§3) to develop a framework of writing norms that vary across research cultures (§4) and operationalise them using computational metrics (§5). We then use this evaluation suite for two large-scale quantitative analyses: (a) surfacing variations in writing across 11 communities (§6); (b) evaluating the cultural competence of LLMs when adapting writing from one community to another (§7).
joelmire.bsky.social
This looks incredible! Thanks for sharing the syllabus!
Reposted by Joel Mire
saumyamalik.bsky.social
I’m thrilled to share RewardBench 2 📊— We created a new multi-domain reward model evaluation that is substantially harder than RewardBench, we trained and released 70 reward models, and we gained insights about reward modeling benchmarks and downstream performance!
Reposted by Joel Mire
lucy3.bsky.social
I'm joining Wisconsin CS as an assistant professor in fall 2026!! There, I'll continue working on language models, computational social science, & responsible AI. 🌲🧀🚣🏻‍♀️ Apply to be my PhD student!

Before then, I'll postdoc for a year in the NLP group at another UW 🏔️ in the Pacific Northwest
Wisconsin-Madison's tree-filled campus, next to a big shiny lake. A computer render of the interior of the new computer science, information science, and statistics building: a staircase crosses an open atrium with visibility across multiple floors.
Reposted by Joel Mire
nlpxuhui.bsky.social
When interacting with ChatGPT, have you wondered if it would ever "lie" to you? We found that under pressure, LLMs often choose deception. Our new #NAACL2025 paper, "AI-LIEDAR," reveals models were truthful less than 50% of the time when faced with utility-truthfulness conflicts! 🤯 1/
Reposted by Joel Mire
mariaa.bsky.social
I updated our 🔭StorySeeker demo. Aimed at beginners, it briefly walks through loading our model from Hugging Face, loading your own text dataset, predicting whether each text contains a story, and topic modeling and exploring the results. Runs in your browser, no installation needed!
A bar plot comparing the storytelling rates for different topics in the example dataset of congressional speeches. There are often large differences between storytelling and non-storytelling for individual topics. For example, the topic whose top words read "NUM, years, service, great, state" has much more storytelling than non-storytelling. The top five congressional speeches for the topic "NUM, years, service, great, state." All of the documents honor the lives of important people.
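A minimal sketch of the workflow the demo describes (load a classifier from Hugging Face, score your own texts); the model ID below is a placeholder, not the actual StorySeeker checkpoint, and the example texts are invented.

```python
# Sketch of the StorySeeker-style flow: classify whether each text contains a story.
# The model ID is a hypothetical placeholder; use the checkpoint from the demo notebook.
from transformers import pipeline

texts = [
    "Yesterday a constituent told me how she lost her family farm after the flood.",
    "I yield the remainder of my time to the gentleman from Ohio.",
]

classifier = pipeline("text-classification", model="your-org/storyseeker-placeholder")

for text, pred in zip(texts, classifier(texts, truncation=True)):
    # Each prediction is a dict like {"label": ..., "score": ...}
    print(pred["label"], round(pred["score"], 3), "|", text[:60])
```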
Reposted by Joel Mire
mariaa.bsky.social
New work on multimodal framing! 💫

Some fun results: comparisons of the same frame when expressed in images vs. texts. When the "crime" frame is expressed in the article text, there are more political words in the text, but when the frame is expressed in the article image, there are more police-related words.
Table 2 from the paper, showing results of the "Fightin' Words" algorithm to rank words by their association with image vs text frames. Results are shown for the "crime" and "quality of life" frames. Figure 13 from the paper showing scatter plots of the topic space (UMAP reduction of a 5k sample of the generated topic descriptions) with points highlighted if they were assigned the "political frame." The two plots display quite different distributions.
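For context, a minimal sketch of the word-ranking statistic named in the figure, the "Fightin' Words" weighted log-odds ratio with an informative Dirichlet prior; it assumes simple whitespace tokenization and a symmetric prior, which may differ from the paper's exact setup.

```python
# Weighted log-odds ratio with informative Dirichlet prior (Monroe et al.'s
# "Fightin' Words" statistic), comparing word associations between two corpora.
import math
from collections import Counter

def fightin_words(tokens_a, tokens_b, alpha=0.01):
    counts_a, counts_b = Counter(tokens_a), Counter(tokens_b)
    vocab = set(counts_a) | set(counts_b)
    n_a, n_b = sum(counts_a.values()), sum(counts_b.values())
    a0 = alpha * len(vocab)  # total pseudo-count of the symmetric prior
    z_scores = {}
    for w in vocab:
        ya, yb = counts_a[w], counts_b[w]
        delta = (math.log((ya + alpha) / (n_a + a0 - ya - alpha))
                 - math.log((yb + alpha) / (n_b + a0 - yb - alpha)))
        variance = 1.0 / (ya + alpha) + 1.0 / (yb + alpha)
        z_scores[w] = delta / math.sqrt(variance)
    return z_scores  # positive -> associated with corpus A, negative -> corpus B

scores = fightin_words("police officer arrested suspect".split(),
                       "policy vote senate committee".split())
print(sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:5])
```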
joelmire.bsky.social
Our work builds on sociolinguistic and NLP research on AAL and recent translation methods. Check out the paper for details! We hope others extend this work, e.g., to investigate or mitigate reward model biases against more dialects. (9/10)
joelmire.bsky.social
These results point to representational and quality-of-service harms for AAL speakers. ⚠️They also highlight complex ethical questions about the desired behavior of LLMs concerning AAL. (8/10)
joelmire.bsky.social
Finally, we show that the reward models strongly incentivize steering conversations toward WME, even when prompted with AAL. 🗣️🔄 (7/10)
Bar chart showing results of t-tests comparing rewards assigned in dialect-mirroring conditions (completion dialect matches prompt dialect) vs. non-mirroring conditions (completion dialect differs from prompt dialect). The results show statistically significant preferences for responding in WME, regardless of whether the prompt was WME or AAL, for all models.
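A hedged sketch of the mirroring comparison described above, using toy reward scores; whether the test is paired by prompt, as assumed here, is an assumption rather than a detail stated in the thread.

```python
# Toy comparison: for AAL prompts, do reward models score WME (non-mirroring)
# completions higher than AAL (mirroring) completions?
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 200  # toy number of AAL prompts
rewards_mirroring = rng.normal(loc=-0.2, scale=1.0, size=n)          # AAL prompt -> AAL completion
rewards_nonmirroring = rewards_mirroring + rng.normal(0.5, 0.5, n)   # AAL prompt -> WME completion

t_stat, p_value = stats.ttest_rel(rewards_nonmirroring, rewards_mirroring)
print(f"t = {t_stat:.2f}, p = {p_value:.3g}")
# A significantly positive t here would indicate the model rewards steering
# AAL prompts toward WME responses.
```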
joelmire.bsky.social
Also, for most models, rewards are negatively correlated with the predicted AAL-ness of a text (based on a pre-existing dialect detection tool). (6/10)
Bar chart showing Pearson correlation coefficients between reward model score and AAL-ness score from a pre-existing dialect detection tool. The chart shows a statistically significant negative correlation between these variables for most models.
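A minimal sketch of the correlation analysis in this step: Pearson's r between a reward model's scores and a dialect detector's AAL-ness scores. The numbers are simulated and the dialect detection tool itself is not shown.

```python
# Correlate reward scores with predicted AAL-ness (toy data).
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(1)
aal_scores = rng.uniform(0, 1, size=300)                  # predicted AAL-ness per text
rewards = -0.8 * aal_scores + rng.normal(0, 0.3, 300)     # toy rewards, negatively related

r, p = pearsonr(rewards, aal_scores)
print(f"Pearson r = {r:.2f} (p = {p:.3g})")  # a negative r mirrors the reported finding
```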
joelmire.bsky.social
Next, we show that most reward models predict lower rewards for AAL texts ⬇️ (5/10)
Bar chart showing the Cohen's d effect sizes from t-tests comparing raw reward scores assigned to WME vs. AAL texts. All results show a significant dispreference for AAL texts.
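A hedged sketch of the effect-size computation referenced in the chart: Cohen's d for WME vs. AAL reward scores, using the pooled-standard-deviation formulation (the paper's exact variant is not specified in the thread).

```python
# Cohen's d with pooled standard deviation, on toy reward scores.
import numpy as np

def cohens_d(x, y):
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * np.var(x, ddof=1) + (ny - 1) * np.var(y, ddof=1)) / (nx + ny - 2)
    return (np.mean(x) - np.mean(y)) / np.sqrt(pooled_var)

rng = np.random.default_rng(2)
rewards_wme = rng.normal(0.4, 1.0, 500)
rewards_aal = rng.normal(0.0, 1.0, 500)
print(f"Cohen's d (WME - AAL) = {cohens_d(rewards_wme, rewards_aal):.2f}")
```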
joelmire.bsky.social
First, we see a significant drop in performance (-4% accuracy on average) in assigning higher rewards to human-preferred completions when processing AAL texts vs. WME texts. 📉 (4/10)
Line chart showing that reward models are less accurate at assigning higher rewards to human-preferred completions when processing paired WME vs. AAL texts.
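A minimal sketch of the accuracy metric behind this result: the fraction of preference pairs where the reward model scores the human-preferred (chosen) completion above the rejected one, computed separately for WME and AAL renderings of the same pairs. The scores below are toy values.

```python
# Pairwise preference accuracy: does the chosen completion get a higher reward?
import numpy as np

def pairwise_accuracy(chosen_rewards, rejected_rewards):
    chosen_rewards = np.asarray(chosen_rewards)
    rejected_rewards = np.asarray(rejected_rewards)
    return float(np.mean(chosen_rewards > rejected_rewards))

# Toy scores illustrating an accuracy drop when the same pairs are rendered in AAL.
acc_wme = pairwise_accuracy([2.1, 0.8, 1.5, -0.2], [1.0, 0.9, 0.3, -1.1])
acc_aal = pairwise_accuracy([1.4, 0.2, 1.1, -0.6], [1.0, 0.9, 0.3, -1.1])
print(f"WME accuracy: {acc_wme:.2f}, AAL accuracy: {acc_aal:.2f}")
```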
joelmire.bsky.social
We introduce morphosyntactic & phonological features of AAL into WME texts from the RewardBench dataset using validated automatic translation methods. Then, we test 17 reward models for implicit anti-AAL dialect biases. 📊 (3/10)
Diagram depicting several ways we combine prompts and completions in White Mainstream English (WME) and African American Language (AAL) to evaluate dialect biases in reward models. Also, the image contains text summaries of our main findings: accuracy drop for AAL, moderate dispreference for AAL-aligned texts, and WME responses for AAL prompts.
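A hedged sketch of how a single (prompt, completion) pair might be scored by a reward model framed as sequence classification; the checkpoint name is a hypothetical placeholder, and real reward models may instead require a chat template or a custom scoring head.

```python
# Score one prompt/completion pair with a sequence-classification reward model.
# Assumes a single-logit reward head; the model ID is a placeholder.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "your-org/reward-model-placeholder"  # hypothetical checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

def reward_score(prompt: str, completion: str) -> float:
    inputs = tokenizer(prompt, completion, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    return logits.squeeze().item()  # scalar reward

# Compare the same completion content rendered in WME vs. an AAL translation.
print(reward_score("How do I fix this bug?", "You should check the stack trace first."))
```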
joelmire.bsky.social
We develop a framework for evaluating dialect biases in reward models and conduct a case study on biases against African American Language (AAL) relative to White Mainstream English (WME). 🔍 (2/10)
joelmire.bsky.social
Reward models for LMs are meant to align outputs with human preferences—but do they accidentally encode dialect biases? 🤔

Excited to share our paper on biases against African American Language in reward models, accepted to #NAACL2025 Findings! 🎉

Paper: arxiv.org/abs/2502.12858 (1/10)
Screenshot of Arxiv paper title, "Rejected Dialects: Biases Against African American Language in Reward Models," and author list: Joel Mire, Zubin Trivadi Aysola, Daniel Chechelnitsky, Nicholas Deas, Chrysoula Zerva, and Maarten Sap.