Eytan Adar
@eytan.adar.prof
1.6K followers 2.3K following 350 posts
Michigan faculty, http://www.cond.org
eytan.adar.prof
That's an interesting point, but that's not really due to recorded music, is it? We sit around a TV, not a record player or stereo (though maybe those technologies were the bridge).
eytan.adar.prof
I should also add that you don't need the LLM/VLM to do this... you can have humans do some (or all) of the algorithm. It's just that the LLM/VLM solution makes it scale and can surface images that editors might not have considered. (6/5)
eytan.adar.prof
Images selected in this way are better representatives of the main properties of the concept, while also highlighting what makes it different from other, related concepts. The paper with more details is here: arxiv.org/abs/2509.15059 (5/5)
QuizRank: Picking Images by Quizzing VLMs
Images play a vital role in improving the readability and comprehension of Wikipedia articles by serving as "illustrative aids." However, not all images are equally effective and not all Wikipedia edi...
arxiv.org
eytan.adar.prof
We also extended this idea to build a Contrastive QuizRank. Instead of just considering what is interesting about the concept (e.g., the Western Bluebird), we let the LLM figure out what makes it different from a "distractor" (e.g., the Mountain Bluebird). (4/5)
Image of a mountain bluebird which is entirely blue
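For intuition, here's a rough sketch of what that contrastive question step could look like; the `llm` callable and the prompt wording are hypothetical stand-ins, not the paper's actual interface:

```python
# Hypothetical sketch (not the paper's code): ask an LLM for questions whose
# answers differ between the target concept and a distractor concept.
def contrastive_questions(llm, target, distractor, article):
    prompt = (
        f"Here is an article about the {target}:\n{article}\n\n"
        f"Write visual questions whose correct answer for the {target} "
        f"differs from the answer for the {distractor}, one per line."
    )
    # Keep non-empty lines; each line is one question.
    return [q for q in llm(prompt).splitlines() if q.strip()]
```

So contrastive_questions(llm, "Western Bluebird", "Mountain Bluebird", article) should yield questions like "What color is the chest?" that separate the two species.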
eytan.adar.prof
So if I ask, "What is the chest color of the Western Bluebird?", a good image helps you answer correctly (orange). QuizRank uses an LLM to generate questions (based on the article) and a VLM to take the test. Images that help the VLM do well on the test are better and are ranked more highly. (3/5)
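A minimal sketch of that ranking loop under stated assumptions: `questions` are (question, answer) pairs an LLM generated from the article, and `vlm_answer` is a hypothetical call that has a VLM answer a question about an image (not the paper's implementation):

```python
# QuizRank-style sketch: each image "takes the quiz," and images are ranked
# by the VLM's accuracy when shown that image.
def quizrank(images, questions, vlm_answer):
    def quiz_score(image):
        correct = sum(vlm_answer(image, q) == a for q, a in questions)
        return correct / len(questions)
    return sorted(images, key=quiz_score, reverse=True)
```

For the Western Bluebird, `questions` might look like [("What is the chest color?", "orange"), ...].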
eytan.adar.prof
For example, given the 234 different images of the Western Bluebird, which should I pick? Our intuition is that with a "good" instructional image, someone should be able to answer questions about important visual properties of the concept better than with a "bad" image. (2/5)
eytan.adar.prof
Fun paper from a recent project with UMich alum Tenghao Ji. Can we pick better images for Wikipedia articles? Given all the choices in Wikimedia Commons, which image is best as an "instructional aid"? (1/5)
An image of a male western bluebird. The chest is a brown color and the back and head are blue.
eytan.adar.prof
And by *cool, I mean the research. Not what it implies
eytan.adar.prof
Very cool... We found something related in privacy: affordances/design → shifts in perceived norms around sharing (which deviate from actual norms) → increased sharing by the individual → increased sharing by the community
eytan.adar.prof
I'm not that mean... I only hate your single-author papers... :) (but seriously, good luck. We're too old to stay up this late)
eytan.adar.prof
to arXiv? yes... probably
eytan.adar.prof
We need something more technical sounding than "co-evolution" to describe this... RHLAF: reinforcing human learning through AI feedback? HAAF: human adaptation through AI feedback? :)
timkellogg.me
As LLMs Improve, People Adapt Their Prompts

a study shows that a lot of the real-world performance gains that people see are actually because people learn how to use the model better

arxiv.org/abs/2407.14333
The chart presents the decomposition of the Average Treatment Effect (ATE) on cosine similarity into two components: a Model Effect (red) and a Prompting Effect (blue).
• Y-axis: Δ cosine similarity (change in similarity).
• X-axis: the source of the prompts (top labels) and the replay model used (bottom labels).
• Points and error bars: mean effects with 95% confidence intervals, bootstrapped and clustered by participant.

Breakdown:
1. DALL-E 2 prompts on DALL-E 2 (baseline): Δ cosine similarity is ~0, establishing the reference point.
2. DALL-E 2 prompts replayed on DALL-E 3: shows the Model Effect (an increase of ~0.007–0.008). This isolates the improvement attributable to the newer model given the same prompts.
3. DALL-E 3 prompts on DALL-E 3 vs. DALL-E 2 prompts on DALL-E 3: the additional boost is the Prompting Effect (~0.006–0.007).
4. Total ATE (black bracket): when prompts written for DALL-E 3 are used on DALL-E 3, the improvement in cosine similarity reaches ~0.016–0.018.
5. DALL-E 3 prompts replayed on DALL-E 2: the effect is small, close to baseline, showing the limited benefit of improved prompts without the newer model.

Summary (from the caption):
• ATE (black) = Model Effect (red) + Prompting Effect (blue).
• Both model upgrades (DALL-E 3 vs. DALL-E 2) and better prompt design contribute to improved performance.
• Prompting alone offers some gains, but most of the improvement comes from model advancements.
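Spelling out the additive decomposition from the caption, with the approximate ranges read off the chart description above (not exact values from the paper):

```latex
\[
\underbrace{\mathrm{ATE}}_{\approx\, 0.016\text{--}0.018}
  \;=\; \underbrace{\Delta_{\mathrm{model}}}_{\approx\, 0.007\text{--}0.008}
  \;+\; \underbrace{\Delta_{\mathrm{prompt}}}_{\approx\, 0.006\text{--}0.007}
\]
\]
```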
eytan.adar.prof
I'm very curious... Are cheese curlers often used in the bathtub? Was the art director like, "How do I show that our cheese curler is rust-proof?"
An Amazon ad for a cheese curler being used in a bathtub
eytan.adar.prof
It's just a mess of a reviewing process. Forced reviewing leads to non-expert, low-quality, short reviews that arrive late (well into the rebuttal period), etc. They make a slightly random process very random.
eytan.adar.prof
If you depend on EMNLP/ARR for your publishing or for hiring/promotion, I feel bad for you.
eytan.adar.prof
I'm not sure who recommended it to me, but let me pass it on since it was a fun read: "Get the Picture" by Bianca Bosker
Cover of the book Get the Picture
eytan.adar.prof
One more for your collection: Leçons de Statique Graphique, Favaro, 1885
eytan.adar.prof
Lotka, 1926... so "contemporary" :)
3D plot of a model of the age distribution of the US population
eytan.adar.prof
I still like it... ;) I'm just ACing some conferences, so the struggle is on my mind
eytan.adar.prof
I like it, but I've been noticing an increasing number of conflicts with some communities doing lots of co-authoring and big vision papers that include everyone.