Lightnews — Scholar-powered news

Manoel Horta Ribeiro

@manoelhortaribeiro.bsky.social

1.2K followers 380 following 160 posts

Assistant Professor @ Princeton. Previously: EPFL 🇨🇭, UFMG 🇧🇷. Interests: Computational Social Science, Platforms, GenAI, Moderation

Pinned

Manoel Horta Ribeiro @manoelhortaribeiro.bsky.social · Nov 14

I am recruiting 1-2 PhD students for Fall 2025 at Princeton to work on Comp Social Science/Societal Impact of GenAI/GenAI for SocSci

I wrote a bit on research flavor & interests here: manoelhortaribeiro.github.io/advising

Deadline: December 15th www.cs.princeton.edu/grad#prospec...

Please boost!

Manoel Horta Ribeiro @manoelhortaribeiro.bsky.social · 2d

None of this is 'hard': great material already exists (Brady Neal on causality, Moritz Hardt on benchmarks, etc.). What's missing is mindset: causality, regression, and experimental design must become core to how we train computer scientists, not optional extras.

Manoel Horta Ribeiro @manoelhortaribeiro.bsky.social · 2d

I elaborate on what I think should be taught. It boils down to (at least) four things:
1 causality: how to pose and identify effects
2 regression: as a tool for inference, not prediction
3 benchmarks: as measurements, not trophies
4 experiments: with rigor, power, and ethics

Manoel Horta Ribeiro @manoelhortaribeiro.bsky.social · 2d

Success is measured by benchmarks, not by robustness or causal clarity. Yet more and more papers now make causal claims, from HCI to NLP, ML to Security and Privacy.

Manoel Horta Ribeiro @manoelhortaribeiro.bsky.social · 2d

Why the contrast? Because the two fields treat empiricism in opposite ways.

Econometrics was forged in the crucible of skepticism. Every paper is a defensive war against omitted variables, selection bias, etc. CS (and ML), by contrast, was built on demonstration, not falsification ...

Manoel Horta Ribeiro @manoelhortaribeiro.bsky.social · 2d

I'd posit a similar, flipped version of the law for ML:

> When an economist reads (and understands) an empirical machine learning study done after 2022, the probability that they will think of an objection that the researcher has failed to take into account is close to one.

Manoel Horta Ribeiro @manoelhortaribeiro.bsky.social · 2d

Henderson’s first law of econometrics reads:

> When you read an econometric study done after 2005, the probability that the researcher has failed to take into account an objection that a non-economist will think of is close to zero.

Manoel Horta Ribeiro @manoelhortaribeiro.bsky.social · 2d

Computer Science is no longer just about building systems or proving theorems; it's about observation and experiments.

In my latest blog post, I argue it’s time we had our own "Econometrics," a discipline devoted to empirical rigor.

doomscrollingbabel.manoel.xyz/p/the-missin...

Reposted by Manoel Horta Ribeiro

Francesco Salvi @frasalvi.bsky.social · 4d

🌱✨ Life update: I just started my PhD at Princeton University!

I will be supervised by @manoelhortaribeiro.bsky.social and affiliated with Princeton CITP.

It's only been a month, but the energy feels amazing, and I'm very grateful for such a welcoming community. Excited for what’s ahead! 🚀

Manoel Horta Ribeiro @manoelhortaribeiro.bsky.social · 21d

(As in, a reasonable cost for us, we'd be happy to host it for research purposes)

Manoel Horta Ribeiro @manoelhortaribeiro.bsky.social · 21d

We are planning to, although we need to improve the current system to make it scalable (at a reasonable cost) 😅

Manoel Horta Ribeiro @manoelhortaribeiro.bsky.social · 21d

Learned a lot in this project working with @omelmalki.bsky.social, @andresmh.com, and @mariannealq.bsky.social!

This work was inspired by a swathe of excellent work reimagining social media by @jonathanstray.bsky.social, @mbernst.bsky.social, @tiziano.bsky.social, @micahcarroll.bsky.social, and others.

Manoel Horta Ribeiro @manoelhortaribeiro.bsky.social · 21d

Bonsai is modular and platform-agnostic, opening paths for integration beyond Bluesky. The paper details the backend design, study, and implications: arxiv.org/abs/2509.10776

Manoel Horta Ribeiro @manoelhortaribeiro.bsky.social · 21d

So what? Designing around intent gives users greater agency and alignment, but also increases curation effort. Future systems should pair transparent pipelines with lightweight interfaces to make intentional feedbuilding practical and sustainable.

Manoel Horta Ribeiro @manoelhortaribeiro.bsky.social · 21d

Participants highlighted the tradeoff between agency and convenience, suggesting that greater control often comes with higher cognitive/interactional costs. Some participants described Bonsai as “a feed that finally matched what I came here for,” highlighting the promise of intentional feedbuilding!

Manoel Horta Ribeiro @manoelhortaribeiro.bsky.social · 21d

Participants used Bonsai to find content aligned with their goals, filter out noise, and separate engagement from intent—transforming feeds into tools for research, connection, or focus rather than distraction. At the same time, intentional curation demanded more effort than passive scrolling.

Manoel Horta Ribeiro @manoelhortaribeiro.bsky.social · 21d

We implemented Bonsai on Bluesky and conducted a two-phase, multi-week study with 15 participants. This deployment allowed us to observe how people used intentional feedbuilding in practice, and how it compared to their experiences with engagement-driven defaults.

Manoel Horta Ribeiro @manoelhortaribeiro.bsky.social · 21d

In the ranking stage, Bonsai orders the curated content using criteria derived from the user’s stated intent—rather than predicted engagement—making the logic behind feed prioritization transparent and directly aligned with user goals.
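
The four stages this thread describes (planning, sourcing, curating, ranking) can be sketched roughly as follows. This is a minimal illustration, not Bonsai's actual implementation: the names `Post`, `Plan`, `source`, `curate`, `rank`, and `build_feed` are all hypothetical, the plan is written by hand rather than derived from a natural-language goal, the score threshold is an assumption, and `keyword_score` is a crude word-overlap stand-in for the LLM that the posts say scores content against each prompt.

```python
import re
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Post:
    author: str
    text: str
    hashtags: frozenset = frozenset()


@dataclass
class Plan:
    """Structured form of a stated goal (hand-written here, not LLM-derived)."""
    goal: str
    accounts: set
    hashtags: set
    prompts: list


# A scorer maps (instruction, post) to a relevance score in [0, 1].
Scorer = Callable[[str, Post], float]


def keyword_score(instruction: str, post: Post) -> float:
    """Stand-in for an LLM judge: share of instruction words found in the post."""
    wanted = set(re.findall(r"[a-z]+", instruction.lower()))
    found = set(re.findall(r"[a-z]+", post.text.lower()))
    return len(wanted & found) / len(wanted) if wanted else 0.0


def source(plan: Plan, pool: list) -> list:
    """Sourcing: keep posts from chosen accounts or carrying chosen hashtags."""
    return [p for p in pool
            if p.author in plan.accounts or p.hashtags & plan.hashtags]


def curate(plan: Plan, candidates: list, score: Scorer,
           threshold: float = 0.5) -> list:
    """Curating: keep posts that every prompt scores at or above the threshold."""
    return [p for p in candidates
            if all(score(q, p) >= threshold for q in plan.prompts)]


def rank(plan: Plan, curated: list, score: Scorer) -> list:
    """Ranking: order by fit with the stated goal, not predicted engagement."""
    return sorted(curated, key=lambda p: score(plan.goal, p), reverse=True)


def build_feed(plan: Plan, pool: list, score: Scorer = keyword_score) -> list:
    return rank(plan, curate(plan, source(plan, pool), score), score)


pool = [
    Post("alice", "New AI policy draft released today", frozenset({"aipolicy"})),
    Post("bob", "Buy my course on AI", frozenset({"aipolicy"})),
    Post("carol", "AI policy hearing recap and policy updates"),
    Post("dave", "Cat pictures", frozenset({"cats"})),
]
plan = Plan(goal="updates on AI policy", accounts={"carol"},
            hashtags={"aipolicy"}, prompts=["policy"])
feed = build_feed(plan, pool)
print([p.author for p in feed])
```

Because the scorer is passed in as a plain function, the keyword stand-in could be swapped for a model-backed judge without touching the stage functions, which loosely mirrors the modularity the thread mentions.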

Manoel Horta Ribeiro @manoelhortaribeiro.bsky.social · 21d

In the curating stage, users can apply natural language prompts (e.g., “focus on recent policy updates” or “exclude promotional posts”) to filter and organize the sourced content, ensuring the feed reflects users' goals and preferences. Each prompt is fed into an LLM that individually ranks content.

Manoel Horta Ribeiro @manoelhortaribeiro.bsky.social · 21d

In the sourcing stage, Bonsai gathers a wide pool of candidate posts aligned with user goals. Users can refine this stage by editing sources (adding or removing accounts, hashtags, or feeds) to shape where their feed draws content from.

Manoel Horta Ribeiro @manoelhortaribeiro.bsky.social · 21d

In the planning stage, users express their goals in natural language (e.g., “updates on AI policy” or “posts from close colleagues”). Bonsai translates these goals into structured representations that guide the subsequent sourcing, curating, and ranking of content by providing initial suggestions.

Manoel Horta Ribeiro @manoelhortaribeiro.bsky.social · 21d

With Bonsai, users can articulate what they want from their feeds (e.g., tracking research, staying informed on a policy area, or connecting with a community) and the system procedurally builds a feed that reflects those intentions in four steps, which we discuss below.

Manoel Horta Ribeiro @manoelhortaribeiro.bsky.social · 21d

Bonsai sits within a broader debate on recommender systems. While TikTok or Meta optimize (mostly) for attention capture, Bonsai explores what feeds look like when personalized for user intent. Under our taxonomy, it explores the design space of "intentional" and "personalized" feeds!

Manoel Horta Ribeiro @manoelhortaribeiro.bsky.social · 21d

Social media feeds today are optimized for engagement, often leading to misalignment between users' intentions and technology use.

In a new paper, we introduce Bonsai, a tool to create feeds based on stated preferences, rather than predicted engagement.

arxiv.org/abs/2509.10776

Manoel Horta Ribeiro @manoelhortaribeiro.bsky.social · Aug 25

This is so cool

Manoel Horta Ribeiro @manoelhortaribeiro.bsky.social · Aug 25

It suggests understanding isn’t all-or-nothing. It can emerge in layers, e.g., structural vs embodied. So the key question isn’t “Do LMs understand?” but what kind of understanding do they have, and is that enough for what we need them to do?