Ben Lipkin
@benlipkin.bsky.social
1K followers 280 following 25 posts
phd @ mit, research @ genlm, intern @ apple https://benlipkin.github.io/
Reposted by Ben Lipkin
gretatuckute.bsky.social
Humans largely learn language through speech. In contrast, most LLMs learn from pre-tokenized text.

In our #Interspeech2025 paper, we introduce AuriStream: a simple, causal model that learns phoneme, word & semantic information from speech.

Poster P6, tomorrow (Aug 19) at 1:30 pm, Foyer 2.2!
Reposted by Ben Lipkin
thomashikaru.bsky.social
1/7 If you're at CogSci 2025, I'd love to see you at my talk on Friday 1pm PDT in Nob Hill A! I'll be talking about our work towards an implemented computational model of noisy-channel comprehension (with @postylem.bsky.social, Ted Gibson, and @rplevy.bsky.social).
benlipkin.bsky.social
Great question! We have not directly compared. SMC offers a test-time approach to steer off-the-shelf models without additional training, whereas diffusion forcing trains autoregressive models to sample more effectively from the global target. In principle, the two strategies could be combined.
benlipkin.bsky.social
And check out the following papers, which laid the technical groundwork that this work builds on:

Lew et al (2023): arxiv.org/abs/2306.03081
Loula et al (2025): arxiv.org/abs/2504.13139
benlipkin.bsky.social
Want to use AWRS SMC?

Check out the GenLM control library: github.com/genlm/genlm-...

GenLM supports not only grammars, but arbitrary programmable constraints from type systems to simulators.

If you can write a Python function, you can control your language model!
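
To make "write a Python function" concrete, here is a plain-Python sketch of two such Boolean constraints (written against no particular interface; the genlm-control API may differ, and the `add` test below is a hypothetical target):

```python
import ast

def is_valid_python(candidate: str) -> bool:
    """Boolean constraint: does the generated text parse as Python?"""
    try:
        ast.parse(candidate)
        return True
    except SyntaxError:
        return False

def passes_tests(candidate: str) -> bool:
    """Boolean constraint: does the generated code define a function `add` that passes a tiny test?"""
    namespace = {}
    try:
        exec(candidate, namespace)          # run the candidate definition (sketch only; sandbox in practice)
        return namespace["add"](2, 3) == 5  # hypothetical target function and test
    except Exception:
        return False
```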
benlipkin.bsky.social
Why does AWRS work?

Formal and empirical runtime analyses tell a fascinating story.

AWRS scales adaptively with the KL divergence between the conditional and base token-level models.

As your LM better understands the constraint, AWRS gets faster.

As the LM struggles, AWRS closes the gap.
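
For intuition on that scaling, a quick numerical check for the simplest case, where the token-level conditional is just the base distribution restricted to the allowed tokens (illustrative, made-up numbers):

```python
import math

q = {"a": 0.70, "b": 0.25, "c": 0.05}   # base LM's next-token distribution (made-up numbers)
allowed = {"a", "b"}                     # tokens the constraint permits here

alpha = sum(p for t, p in q.items() if t in allowed)      # mass the LM already puts on allowed tokens
cond = {t: q[t] / alpha for t in allowed}                  # conditional: restrict and renormalize

kl = sum(p * math.log(p / q[t]) for t, p in cond.items())  # KL(conditional || base)
print(kl, math.log(1 / alpha))                             # identical in this restricted-support case

# Plain rejection sampling needs 1/alpha = exp(KL) draws in expectation:
# the better the LM already "understands" the constraint (alpha -> 1), the cheaper sampling gets.
print(1 / alpha, math.exp(kl))
```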
benlipkin.bsky.social
We tested AWRS SMC on several controlled generation tasks, from text-to-SQL to PDDL goal inference to molecular synthesis.

AWRS SMC outperforms baselines by large margins, e.g., see the jump from 3% -> 53% in the goal inference domain with only ~2.5x wall-clock overhead.
benlipkin.bsky.social
Next, SMC uses the proposed extensions and corresponding weights from AWRS to update importance weights associated with partial sequences (particles).

These particles are resampled proportional to their weights, re-allocating computation towards the most promising sequences.
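
A minimal sketch of that resampling step, using plain multinomial resampling for simplicity (practical SMC implementations typically resample only when the effective sample size drops, and use lower-variance schemes):

```python
import random

def resample_sketch(particles, weights, rng=random):
    """Resample a particle population proportional to its importance weights."""
    total = sum(weights)
    probs = [w / total for w in weights]
    # Promising partial sequences get duplicated; low-weight ones tend to be dropped.
    new_particles = rng.choices(particles, weights=probs, k=len(particles))
    # Reset each weight to the mean so the overall weight scale is preserved.
    new_weights = [total / len(particles)] * len(particles)
    return new_particles, new_weights
```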
benlipkin.bsky.social
First, AWRS reformulates the token-level inference problem from exact enumeration to adaptive rejection sampling.

This process yields equivalently distributed samples at a fraction of the cost.

AWRS then estimates and propagates an importance weight alongside these samples.
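
A minimal sketch of that kind of adaptive, without-replacement rejection loop at a single token position (the weight returned below is a crude stand-in, not the paper's actual estimator):

```python
import random

def awrs_propose_sketch(q, allowed, rng=random):
    """q: dict token -> base LM probability; allowed: callable token -> bool.
    Returns (token, weight_proxy)."""
    remaining = dict(q)
    rejected_mass = 0.0
    while remaining:
        toks, probs = zip(*remaining.items())
        tok = rng.choices(toks, weights=probs)[0]    # sample from the not-yet-rejected tokens
        if allowed(tok):
            # The accepted token is distributed exactly like the locally constrained distribution,
            # without checking the constraint on every vocabulary item.
            return tok, 1.0 - rejected_mass          # crude weight proxy (NOT the paper's estimator)
        # Reject: remove this token so it is never proposed again (the "adaptive" part).
        rejected_mass += remaining.pop(tok)
    return None, 0.0                                 # no satisfying token: dead end
```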
benlipkin.bsky.social
So, what can we do?

AWRS SMC is a hierarchical inference framework based on sequential Monte Carlo using a novel stochastic proposal algorithm.

By jointly considering local and global signals, AWRS SMC is both probabilistically sound and sample efficient.

How does it work?
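
Putting the two sketches above together, a rough skeleton of the full loop might look like this (a paraphrase, not the GenLM implementation; `step_dist`, `prefix_ok`, and `is_complete` are hypothetical helpers):

```python
def awrs_smc_sketch(step_dist, prefix_ok, is_complete, n_particles=8, max_len=32):
    """step_dist(prefix) -> dict token->prob from the base LM;
    prefix_ok(prefix) -> can this prefix still be completed into a valid string?
    is_complete(prefix) -> is this a finished string?"""
    particles = [[] for _ in range(n_particles)]
    weights = [1.0] * n_particles
    for _ in range(max_len):
        for i, prefix in enumerate(particles):
            if is_complete(prefix) or weights[i] == 0.0:
                continue
            q = step_dist(prefix)                                     # local signal: base LM
            tok, w = awrs_propose_sketch(q, lambda t: prefix_ok(prefix + [t]))
            if tok is None:
                weights[i] = 0.0                                      # dead end
                continue
            prefix.append(tok)
            weights[i] *= w                                           # propagate the local weight estimate
        if sum(weights) == 0.0:
            break                                                     # every particle hit a dead end
        particles, weights = resample_sketch(particles, weights)      # global signal: reallocate compute
        particles = [list(p) for p in particles]                      # copy so duplicates evolve independently
        if all(is_complete(p) for p in particles):
            break
    return particles, weights
```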
benlipkin.bsky.social
Problem B: LCD distorts the distribution.

Consider this simple LM over the tokens `a` and `b` with the constraint that “strings must end with `a`”.

While the true conditional distribution over complete strings favors `ba`, locally constrained decoding favors `aa`: the base LM prefers the prefix `a` (on its way to the likely string `ab`), so LCD commits to `a` and is then forced to finish with `a`.

We don’t want this.
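
With some made-up numbers for that example (illustrative only, not from the original figure), the distortion is easy to see:

```python
# Two-token strings over {a, b}; the constraint keeps only strings ending in `a`.
q_first = {"a": 0.8, "b": 0.2}                          # base LM strongly prefers starting with `a`...
q_second = {"a": {"a": 0.05, "b": 0.95},                # ...but after `a` it wants to finish with `b`
            "b": {"a": 0.95, "b": 0.05}}

# True conditional over complete strings that end in `a`:
unnorm = {x1 + "a": q_first[x1] * q_second[x1]["a"] for x1 in "ab"}
Z = sum(unnorm.values())
print({x: p / Z for x, p in unnorm.items()})            # {'aa': ~0.17, 'ba': ~0.83} -> favors `ba`

# LCD: the first token is sampled from the unmasked base (both prefixes are still completable),
# then the second token is forced to `a`:
print({x1 + "a": q_first[x1] for x1 in "ab"})           # {'aa': 0.8, 'ba': 0.2} -> favors `aa`
```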
benlipkin.bsky.social
Problem A: Token masking is often slow.

Enforcing the constraint means classifying all 100,000+ tokens in the vocabulary at every step.

While regular and context-free grammars support low-overhead solutions using tools like Outlines (dottxtai.bsky.social), open-ended constraint enforcement has been harder.
benlipkin.bsky.social
Approach 2: Locally constrained decoding (LCD).

At each step, mask the next-token distribution to prevent violations.

Pros: All samples are constraint-satisfying.
Cons: A) Masking a large vocabulary is slow. B) LCD distorts the sampled distribution.

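
A minimal sketch of one LCD step (not any particular library's API), which also makes Problem A visible: the constraint gets queried once per vocabulary item at every step:

```python
import random

def lcd_step_sketch(q, prefix, prefix_ok, rng=random):
    """q: dict token -> base LM probability at this position;
    prefix_ok(prefix + [token]) -> can the extended prefix still satisfy the constraint?"""
    masked = {t: p for t, p in q.items() if prefix_ok(prefix + [t])}   # 100k+ constraint calls per step
    if not masked:
        raise RuntimeError("dead end: no next token keeps the constraint satisfiable")
    toks, probs = zip(*masked.items())
    return rng.choices(toks, weights=probs)[0]                         # renormalized by rng.choices
```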
benlipkin.bsky.social
Approach 1: Sample-verify/Best-of-N.

Draw 𝑁 strings from the LM and use the constraint to rank/filter.

Pros: As 𝑁 grows, the retained samples converge to draws from 𝑃.
Cons: The 𝑁 required to obtain a constraint-satisfying sample scales as exp(KL[𝑃||𝑄]). For difficult constraints, this becomes infeasible.

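
A minimal sketch of this baseline, with hypothetical `sample_string` and `constraint` helpers:

```python
def best_of_n_sketch(sample_string, constraint, n):
    """sample_string() draws one complete string x ~ Q; constraint(x) -> bool.
    Returns the constraint-satisfying draws; in expectation you need ~exp(KL[P||Q])
    attempts per accepted sample, which blows up for hard constraints."""
    draws = [sample_string() for _ in range(n)]
    return [x for x in draws if constraint(x)]
```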
benlipkin.bsky.social
Consider a prompted language model 𝑄 (a prior) and a constraint function 𝐶 (a likelihood).

Our goal is to sample a string 𝑥 from the conditional distribution 𝑃 = 𝑄(·|𝐶(𝑥)=1) (the target posterior).

How do people do this now, and why do current approaches fall short?
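
Spelled out in code (informal notation), the target is just the prior times the Boolean likelihood, up to an intractable normalizer:

```python
def target_unnorm(x, q_prob, constraint):
    """q_prob(x) = Q(x), the prompted LM's probability of the complete string x (hypothetical helper);
    constraint(x) -> bool. P(x) is this value divided by the sum over all constraint-satisfying strings."""
    return q_prob(x) * (1.0 if constraint(x) else 0.0)
```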
benlipkin.bsky.social
Many LM applications may be formulated as text generation conditional on some (Boolean) constraint.

Generate a…
- Python program that passes a test suite.
- PDDL plan that satisfies a goal.
- CoT trajectory that yields a positive reward.
The list goes on…

How can we efficiently satisfy these? 🧵👇
Reposted by Ben Lipkin
joaoloula.bsky.social
#ICLR2025 Oral

How can we control LMs using diverse signals such as static analyses, test cases, and simulations?

In our paper “Syntactic and Semantic Control of Large Language Models via Sequential Monte Carlo” (w/ @benlipkin.bsky.social,
@alexlew.bsky.social, @xtimv.bsky.social) we:
Reposted by Ben Lipkin
kmahowald.bsky.social
I might be able to hire a postdoc for this fall in computational linguistics at UT Austin. Topics in the general LLM + cognitive space (particularly reasoning, chain of thought, LLMs + code) and LLM + linguistic space. If this could be of interest, feel free to get in touch!
benlipkin.bsky.social
Big thanks to this awesome team: Ben LeBrun, @postylem.bsky.social, @joaoloula.bsky.social, @drmaciver.bsky.social, Li Du, Jason Eisner, Ryan Cotterell, Vikash Mansinghka, Tim O'Donnell, @alexlew.bsky.social, @xtimv.bsky.social
benlipkin.bsky.social
New preprint on controlled generation from LMs!

I'll be presenting at NENLP tomorrow 12:50-2:00pm

Longer thread coming soon :)