Lightnews — Scholar-powered news

Nick Vincent

@nickmvincent.bsky.social

Studying people and computers (https://www.nickmvincent.com/). Blogging about data and steering AI (https://dataleverage.substack.com/).

Pinned

Nick Vincent @nickmvincent.bsky.social · Nov 19

Hi Bluesky (+ many friendly familiar faces). I'm a researcher in HCI + ML, assistant prof at Simon Fraser University up in BC, and working on "healthy data flow". Doing a quick thread recapping some recent writing (blogs, pre-prints, etc.) that captures the things I work on and talk about!

Nick Vincent @nickmvincent.bsky.social · 20d

Anyone compiling discussions/thoughts on emerging licensing schemes and preference signals, e.g., rslstandard.org and github.com/creativecomm...? I'm externalizing some notes at datalicenses.org, but want to find where these discussions are happening!

RSL: Really Simple Licensing

The open content licensing standard for the AI-first Internet

rslstandard.org

Nick Vincent @nickmvincent.bsky.social · Aug 14

Excited to be giving a talk on data leverage to the Singapore AI Safety Hub. Trying to capture updated thoughts from recent years, and have long wanted to better connect leverage/collective bargaining to the safety context.

Nick Vincent @nickmvincent.bsky.social · Aug 14

About a week away from the deadline to submit to the ✨ Workshop on Algorithmic Collective Action (ACA) ✨ (acaworkshop.github.io) at NeurIPS 2025!

About the workshop – ACA@NeurIPS

acaworkshop.github.io

Nick Vincent @nickmvincent.bsky.social · Aug 8

Follow up, tying together "AI as ranking chunks of human records" with "eval leverage" and "dataset details as quality signals": dataleverage.substack.com/p/how-do-we-...

And related, "eval leverage": dataleverage.substack.com/p/evaluation...

How do we know our AI output is good? Double checks, bar charts, vibes, and training data.

Connecting evaluation and dataset documentation via the lens of "AI as ranking".

dataleverage.substack.com

Nick Vincent @nickmvincent.bsky.social · Aug 8

(1) ongoing challenges in benchmarking, (2) challenges in communicating benchmarks to the public, (3) dataset documentation, and (4) post-hoc dataset "reverse engineering"

The original post: dataleverage.substack.com/p/selling-ag...

Nick Vincent @nickmvincent.bsky.social · Aug 8

who paid that Dr for a verified attestation with provenance can use this attestation as a quality signal; a promise to consumers about the exact nature of the evaluation. A "9/10 dentists recommend" for a chatbot.

More generally, I think there are interesting connections between current discourse &

Nick Vincent @nickmvincent.bsky.social · Aug 8

For some types of info, we can maybe treat it as open and focus on selling convenient/"nice" packages (à la Wikimedia Enterprise).

But attestations provide another object to transact over. Valuable info (a Dr giving thumbs up/down on medical responses) may leak, but the AI developer

Nick Vincent @nickmvincent.bsky.social · Aug 8

So in a post-AI world, to help people transact over work that produces information, we likely need:
- individual property-ish rights over info (not a great way to go, IMO)
- rights that enable collective bargaining (good!)
- or...

Nick Vincent @nickmvincent.bsky.social · Aug 8

The core challenge: many inputs into AI are information, and thus hard to design efficient markets for. Info is hard to exclude (pre-training data remains very hard to exclude, but even post-training data may be hard without sufficient effort)

Nick Vincent @nickmvincent.bsky.social · Aug 8

It looks like some skepticism was warranted (not much progress towards this vision yet). I do think "dataset details as quality signals" is still possible though, and could play a key role in addressing looming information economics challenges.

Nick Vincent @nickmvincent.bsky.social · Aug 8

🧵 In several recent posts, I speculated that eventually, dataset details may become an important quality signal for consumers choosing AI products.

"This model is good for asking health questions, because 10,000 doctors attested to supporting training and/or eval". Etc.

Nick Vincent @nickmvincent.bsky.social · Jul 16

Around ICML with loose evening plans and an interest in "public AI", Canadian sovereign AI, or anything related? Swing by the Internet Archive Canada between 5pm and 7pm: lu.ma/7rjoaxts

Oh Canada! An AI Happy Hour @ ICML 2025 · Luma

Whether you're Canadian or one of our friends from around the world, please join us for some drinks and conversation to chat about life, papers, AI, and...

lu.ma

Nick Vincent @nickmvincent.bsky.social · Jun 24

Finally, I recently shared a preprint that relates deeply to the above ideas, on Collective Bargaining for Information: arxiv.org/abs/2506.10272, and have a blog post on this as well: dataleverage.substack.com/p/on-ai-driv...

On AI-driven Job Apocalypses and Collective Bargaining for Information

Reacting to a fresh wave of discussion about AI's impact on the economy and power concentration, and reiterating the potential role of collective bargaining.

dataleverage.substack.com

Nick Vincent @nickmvincent.bsky.social · Jun 24

And we have a blog post on algorithmic collective action with multiple collectives! dataleverage.substack.com/p/algorithmi...

Algorithmic Collective Action With Two Collectives [crosspost]

This post was written by Aditya Karan, with support from Nick Vincent and Karrie Karahalios to accompany a FAccT 2025 paper. It was originally published on Jun 19, 2025 via the Crowd Dynamics Lab blog...

dataleverage.substack.com

Nick Vincent @nickmvincent.bsky.social · Jun 24

These blog posts expand on attentional agency:
- genAI as ranking chunks of info: dataleverage.substack.com/p/google-and...
- utility of AI stems from people: dataleverage.substack.com/p/each-insta...
- connection to evals: dataleverage.substack.com/p/how-do-we-...

Each Instance of "AI Utility" Stems from Some Human Act(s) of Information Recording and Ranking

It's ranking information all the way down.

dataleverage.substack.com

Nick Vincent @nickmvincent.bsky.social · Jun 24

[FAccT-related link round-up]: It was great to present on measuring Attentional Agency with Zachary Wojtowicz at FAccT. Here's our paper on ACM DL: dl.acm.org/doi/10.1145/...

On Thurs Aditya Karan will present on collective action dl.acm.org/doi/10.1145/... at 10:57 (New Stage A)

Algorithmic Collective Action with Two Collectives | Proceedings of the 2025 ACM Conference on Fairness, Accountability, and Transparency

dl.acm.org

Nick Vincent @nickmvincent.bsky.social · Jun 24

“Attentional agency” talk in New Stage B at FAccT, in the session happening right now!

Nick Vincent @nickmvincent.bsky.social · Jun 21

Off to FAccT; Excited to see faces old and new!

Nick Vincent @nickmvincent.bsky.social · Jun 5

Another blog post: a link roundup on AI's impact on jobs and power concentration, another proposal for Collective Bargaining for Information, and some additional thoughts on the topic:

dataleverage.substack.com/p/on-ai-driv...

On AI-driven Job Apocalypses and Collective Bargaining for Information

Reacting to a fresh wave of discussion about AI's impact on the economy and power concentration, and reiterating the potential role of collective bargaining.

dataleverage.substack.com

Nick Vincent @nickmvincent.bsky.social · May 28

Post 2: dataleverage.substack.com/p/each-insta...

Each Instance of "AI Utility" Stems from Some Human Act(s) of Information Recording and Ranking

It's ranking information all the way down.

dataleverage.substack.com

Nick Vincent @nickmvincent.bsky.social · May 27

Do some aspects seem wrong (in the next 2 posts, I get into how these ideas interact w/ reinforcement learning)?

Nick Vincent @nickmvincent.bsky.social · May 27

arxiv.org/abs/2405.14614

Follow ups coming very soon (already drafted): would love to discuss these ideas with folks. Is this all repetitive with past data labor/leverage work? Are some aspects obvious to you?

Push and Pull: A Framework for Measuring Attentional Agency on Digital Platforms

We propose a framework for measuring attentional agency, which we define as a user's ability to allocate attention according to their own desires, goals, and intentions on digital platforms that use s...

arxiv.org

Nick Vincent @nickmvincent.bsky.social · May 27

This has implications for Internet policy, for understanding where the value in AI comes from, and for thinking about why we might even consider a certain model to be "good"!

This first post leans heavily on recent work with Zachary Wojtowicz and Shrey Jain, to appear at this upcoming FAccT

Nick Vincent @nickmvincent.bsky.social · May 27

New data leverage post: "Google and TikTok rank bundles of information; ChatGPT ranks grains."

dataleverage.substack.com/p/google-and...

This will be post 1/3 in a series about viewing many AI products as all competing around the same task: ranking bundles or grains of records made by people.

Google and TikTok rank bundles of information; ChatGPT ranks grains.

Google and others solve our attentional problem by ranking discrete bundles of information, whereas ChatGPT ranks more granular chunks. This lens can help us reason about AI policy.

dataleverage.substack.com

Nick Vincent @nickmvincent.bsky.social · May 2

Pre-print now on arxiv and to appear at FAccT 2025:

arxiv.org/abs/2505.00195

"Algorithmic Collective Action with Two Collectives" by Aditya Karan, Nicholas Vincent, Karrie Karahalios, Hari Sundaram

Algorithmic Collective Action with Two Collectives

Given that data-dependent algorithmic systems have become impactful in more domains of life, the need for individuals to promote their own interests and hold algorithms accountable has grown. To have ...

arxiv.org
