Hope Schroeder
@hopeschroeder.bsky.social
330 followers 280 following 54 posts
Studying NLP, CSS, and Human-AI interaction. PhD student @MIT. Previously at Microsoft FATE + CSS, Oxford Internet Institute, Stanford Symbolic Systems hopeschroeder.com
Reposted by Hope Schroeder
nancybaym.bsky.social
We may have the chance to hire an outstanding researcher 3+ years post PhD to join Tarleton Gillespie, Mary Gray, and me in Cambridge, MA, bringing critical sociotechnical perspectives to bear on new technologies.

jobs.careers.microsoft.com/global/en/jo...
https://jobs.careers.microsoft.com/global/en/job/1849026/Principal-Researcher-–-Sociotechnical-Systems-–-Microsoft-Research
hopeschroeder.bsky.social
Thanks for sharing - not just our paper, but I also learned a lot from this list! :)
hopeschroeder.bsky.social
Awesome work and great presentation! Congrats!! ⚡️
hopeschroeder.bsky.social
Talking about this work tomorrow (Wed, July 23rd) at #IC2S2 in Norrköping during the 11 am session on LLMs, Annotation, and Synthetic Data! Come hear about this and more!
hopeschroeder.bsky.social
Implications vary by task and domain. Researchers should clearly define their annotation constructs before reviewing LLM annotations. We are subject to anchoring bias that can affect our evaluations, or even our research findings!
Read more: arxiv.org/abs/2507.15821
hopeschroeder.bsky.social
Using LLM-influenced labels, even when a crowd of humans reviews them and their responses are aggregated into a set of crowd labels, can lead to 1) different findings when used in data analysis and 2) different results when used as a basis for evaluating LLM performance on the task.
hopeschroeder.bsky.social
What happens if we use LLM-influenced labels as ground truth when evaluating LLM performance on these tasks? We can seriously overestimate it: F1 scores for some tasks were .5 higher when evaluated using LLM-influenced labels as ground truth!
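A minimal sketch of the kind of evaluation gap described above, with hypothetical labels (not the paper's data), using scikit-learn's f1_score: the same LLM predictions are scored against independent human labels and against labels from annotators who reviewed LLM suggestions.

```python
# Hypothetical illustration: the same LLM predictions scored against two "ground truths".
from sklearn.metrics import f1_score

llm_predictions = [1, 1, 0, 1, 0, 1, 1, 0]

# Labels from annotators who worked without seeing LLM suggestions (hypothetical).
independent_labels = [0, 1, 0, 0, 1, 0, 1, 0]

# Labels from annotators who reviewed, and often anchored on, LLM suggestions (hypothetical).
llm_influenced_labels = [1, 1, 0, 1, 0, 1, 1, 0]

print("F1 vs. independent labels:   ", f1_score(independent_labels, llm_predictions))
print("F1 vs. LLM-influenced labels:", f1_score(llm_influenced_labels, llm_predictions))
# The second score is inflated because the "ground truth" already reflects the LLM's choices.
```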
hopeschroeder.bsky.social
However… annotators STRONGLY anchored on LLM suggestions: just 40% of human crowd labels overlap with LLM baselines without assistance, but overlap jumps to over 80% when LLM suggestions are given (varied crowd thresholds and conditions shown in the graph). Beware: humans are subject to anchoring bias!
hopeschroeder.bsky.social
Some findings: ⚠️ reviewing LLM suggestions did not make annotators go faster, and often slowed them down! OTOH, having LLM assistance made annotators ❗more self-confident❗in their task and content understanding at no identified cost to their tested task understanding.
hopeschroeder.bsky.social
We conducted experiments in which over 410 unique annotators produced over 7,000 annotations across three LLM assistance conditions of varying strength plus a control, using two different models and two different complex, subjective annotation tasks.
hopeschroeder.bsky.social
LLMs can be fast and promising annotators, so letting human annotators "review" first-pass LLM annotations on interpretive tasks is tempting. How does this impact productivity, annotators themselves, evaluation of LLM performance on subjective tasks, and downstream data analysis?
hopeschroeder.bsky.social
Thanks for attending and for your comments!!
hopeschroeder.bsky.social
*What should FAccT do?* We discuss a need for the conference to clarify its policies next year, engage scholars from different disciplines when considering policy on this delicate subject, and engage authors in reflexive practice upstream of paper-writing, potentially through CRAFT.
hopeschroeder.bsky.social
*Are disclosed features connected to described impacts?* Disclosed features are much less commonly described in terms of the impacts they had on the research, which may leave room for readers to jump to their own conclusions about how a disclosed feature shaped the work.
hopeschroeder.bsky.social
*What do authors disclose in positionality statements?* We conducted fine-grained annotation of the statements. We find academic background and training are disclosed most often, but identity features like race and gender are also common.
hopeschroeder.bsky.social
We reviewed papers from the entire history of FAccT for the presence of positionality statements. We find 2024 marked a significant proportional increase in papers that included positionality statements, likely as a result of PC recommendations:
hopeschroeder.bsky.social
With ongoing reflection on the impact of computing on society, and on the role researchers play in shaping those impacts, positionality statements have become more common in computing venues, but little is known about their contents or about the impact of conference policy on their presence.
hopeschroeder.bsky.social
1) Thrilled to be at #FAccT for the first time this week, representing a meta-research paper on positionality statements at FAccT from 2018-2024, in collaboration with @s010n.bsky.social and Akshansh Pareek, "Disclosure without Engagement: An Empirical Review of Positionality Statements at FAccT"
hopeschroeder.bsky.social
Democracy needs you! Super excited to be co-organizing this NLP for Democracy workshop at #COLM2025 with a wonderful group. Abstracts due June 19th!
jmendelsohn2.bsky.social
📣 Super excited to organize the first workshop on ✨NLP for Democracy✨ at COLM @colmweb.org!!

Check out our website: sites.google.com/andrew.cmu.e...

Call for submissions (extended abstracts) due June 19, 11:59pm AoE

#COLM2025 #LLMs #NLP #NLProc #ComputationalSocialScience
Reposted by Hope Schroeder
artjomshl.bsky.social
A whole cluster of postdoc and PhD positions in Tartu in Digital Humanities / Computational Social Science / AI under the umbrella of big European projects.

consider sharing please!
andreskarjus.bsky.social
While the NSF freezes funding and leading UK unis cut departments, the EU still supports important research. 2 Horizon-funded centres in #Tartu #Estonia are hiring PhDs & postdocs right now in CSS, AI, DH, and text corpora. See
- post below
- DigiTS ut.ee/en/job-offer... (our good colleagues in humanities).