Lightnews — Scholar-powered news

prakharg.bsky.social @prakharg.bsky.social · Jun 25

Four case studies on the gap between how models are actually used and how they are evaluated in sandboxed audits... Definitely need to take a deeper dive, great presentation by Emily Black!

prakharg.bsky.social @prakharg.bsky.social · Jun 25

Evaluations in the way the model would be deployed vs evaluations in only controlled unrealistic settings!

prakharg.bsky.social @prakharg.bsky.social · Jun 25

Allowing companies to do isolated audits can lead to D-Hacking!! More robust testing is needed...

prakharg.bsky.social @prakharg.bsky.social · Jun 25

Legal frameworks tend to have control over allocative decisions (Yes/No outcomes), which fit well with traditional ML systems... But not with GenAI systems

prakharg.bsky.social @prakharg.bsky.social · Jun 25

Zollo et al: Towards Effective Discrimination Testing for Generative AI
#FAccT2025

prakharg.bsky.social @prakharg.bsky.social · Jun 25

Nuance of stereotype errors is so important to understand their true harms... Insightful presentation by @angelinawang.bsky.social

prakharg.bsky.social @prakharg.bsky.social · Jun 25

Women tend to report stereotype-reinforcing errors as more harmful while men tend to report stereotype-violating errors as more harmful...

prakharg.bsky.social @prakharg.bsky.social · Jun 25

Some items are more associated with men vs women (not surprising), but not all of them are equally harmful!!

prakharg.bsky.social @prakharg.bsky.social · Jun 25

Cognitive beliefs, attitudes and behaviours... Three ways to measure harms ('pragmatic harms')

prakharg.bsky.social @prakharg.bsky.social · Jun 25

Are all errors equally harmful? No! Stereotype-reinforcing errors vs stereotype-violating errors

prakharg.bsky.social @prakharg.bsky.social · Jun 25

Our understanding of stereotypes sometimes isn't indicative of reality... They can appear in both directions, or might simply exist without harm

prakharg.bsky.social @prakharg.bsky.social · Jun 25

Wang et al: Measuring Machine Learning Harms from Stereotypes Requires Understanding Who Is Harmed by Which Errors in What Ways
#FAccT2025

prakharg.bsky.social @prakharg.bsky.social · Jun 25

Clear narrative and a great presentation by Cecilia Panigutti

prakharg.bsky.social @prakharg.bsky.social · Jun 25

Risk-measuring studies - Bringing it back to risk measurement, but this time with a clearly defined objective instead of risk-uncovering as before... Not just whether a risk exists, but 'how severe' is it?

prakharg.bsky.social @prakharg.bsky.social · Jun 25

Interface-design studies - Focus on UI design elements which impact user interaction

prakharg.bsky.social @prakharg.bsky.social · Jun 25

Reverse-engineering studies - Narrower scope and in-depth studies of how algorithms work... Methodological precision is the key!

prakharg.bsky.social @prakharg.bsky.social · Jun 25

Risk-uncovering studies - Typically start from anecdotal evidence and help surface new risks

prakharg.bsky.social @prakharg.bsky.social · Jun 25

A review organized not by data collection technique, but by DSA risk management framework categories

prakharg.bsky.social @prakharg.bsky.social · Jun 25

Narrative review of algorithmic auditing studies, practical recommendations for best practices, and mapping to DSA obligations...

prakharg.bsky.social @prakharg.bsky.social · Jun 25

Panigutti et al: How to investigate algorithmic-driven risks in online platforms and search engines? A narrative review through the lens of the EU Digital Services Act
#FAccT2025

prakharg.bsky.social @prakharg.bsky.social · Jun 25

Such a broad topic... Excellent presentation by @feliciajing.bsky.social

prakharg.bsky.social @prakharg.bsky.social · Jun 25

Historical methods working alongside many other ways of auditing these models can help us take advantage of the broader scope of historical evaluations....

prakharg.bsky.social @prakharg.bsky.social · Jun 25

AI audits have moved from bottom-up external evaluations to new-age 'auditing companies'. While this has increased speed and scale, it has significantly narrowed the scope of auditing.

prakharg.bsky.social @prakharg.bsky.social · Jun 25

Why the history of AI assessments? A study through the lens of historical methods can help us understand neglected areas of auditing.

prakharg.bsky.social @prakharg.bsky.social · Jun 25

Sandoval and Jing: Historical Methods for AI Evaluations, Assessments, and Audits
#FAccT2025
