Agathe Balayn
@amabalayn.bsky.social
140 followers 230 following 5 posts
Postdoc researcher @MicrosoftResearch, previously @TUDelft. Interested in the intricacies of AI production and their social and political-economic impacts, and in the gap between policies and practices (AI fairness, explainability, transparency, assessments).
Reposted by Agathe Balayn
aolteanu.bsky.social
We have to talk about rigor in AI work and what it should entail. The reality is that impoverished notions of rigor do not only lead to some one-off undesirable outcomes but can have a deeply formative impact on the scientific integrity and quality of both AI research and practice 1/
Print screen of the first page of a paper pre-print titled "Rigor in AI: Doing Rigorous AI Work Requires a Broader, Responsible AI-Informed Conception of Rigor" by Olteanu et al.  Paper abstract: "In AI research and practice, rigor remains largely understood in terms of methodological rigor -- such as whether mathematical, statistical, or computational methods are correctly applied. We argue that this narrow conception of rigor has contributed to the concerns raised by the responsible AI community, including overblown claims about AI capabilities. Our position is that a broader conception of what rigorous AI research and practice should entail is needed. We believe such a conception -- in addition to a more expansive understanding of (1) methodological rigor -- should include aspects related to (2) what background knowledge informs what to work on (epistemic rigor); (3) how disciplinary, community, or personal norms, standards, or beliefs influence the work (normative rigor); (4) how clearly articulated the theoretical constructs under use are (conceptual rigor); (5) what is reported and how (reporting rigor); and (6) how well-supported the inferences from existing evidence are (interpretative rigor). In doing so, we also aim to provide useful language and a framework for much-needed dialogue about the AI community's work by researchers, policymakers, journalists, and other stakeholders."
amabalayn.bsky.social
At the #HEAL workshop, I'll present "Systematizing During Measurement Enables Broader Stakeholder Participation" on how we can further structure LLM evaluations and open them up to deliberation. A project led by @hannawallach.bsky.social
amabalayn.bsky.social
These results can serve to refine current AI regulations that touch upon "trust" **within the AI supply chain** and the "trustworthiness" of the resulting AI systems.
agathe-balayn.github.io/assets/pdf/b...
amabalayn.bsky.social
At the main conference, I'll present our work "Unpacking Trust Dynamics in the LLM Supply Chain: An Empirical Exploration to Foster Trustworthy LLM Production And Use" (honorable mention) on how trust relations in the LLM supply chain affect the resulting AI system.
agathe-balayn.github.io
amabalayn.bsky.social
At the #STAIG workshop, I'll discuss our empirical study of *pig farming* supply chains. 🐷
We show how inconspicuous software engineering practices might transform farming environments negatively, and how the harm-based approach to AI regulation might not make it possible to attend to these transformations.
amabalayn.bsky.social
I will be at #CHI25 in person this week 🇯🇵
I'm looking forward to chatting about **AI supply chains** through socio-technical & organizational / regulatory & governance / political-economic lenses.
I'll present my work at the main conference (honorable mention), and attend the #HEAL and #STAIG workshops.