Thiago Serra
@thserra.bsky.social
Assistant professor at University of Iowa, formerly at Bucknell University, mathematical optimizer with an #orms PhD from Carnegie Mellon University, curious about scaling up constraint learning, proud father of two
This mirrors how our implementation of Tsiourvas and Perakis's MIP Walk (called SM in the plot) actually performs better than SimplexWalk (called RW in the plot) for smaller inputs: it is a tradeoff between problem size and algorithm complexity.

10/N
February 17, 2026 at 4:38 PM
In fact, if you look closely at the cases with smaller depth and width, SimplexWalk actually performs better.

9/N
February 17, 2026 at 4:38 PM
Our algorithms produce better solutions within the same time limit, with a larger advantage under shorter runtimes and over neural networks with larger input size (a reliable proxy for the number of linear regions).

7/N
February 17, 2026 at 4:36 PM
Now we continue that trend of reducing per-iteration cost to obtain better solutions within a time limit on much larger networks.

We do that with "Gradient Walk": we try naive and tailored gradient-based algorithms. They all take more steps to converge, but each step is much cheaper.
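A minimal sketch of that idea (not our actual code: a toy one-hidden-layer ReLU network with made-up weights, step size, and box bounds; numpy assumed):

import numpy as np

# Toy one-hidden-layer ReLU network with a scalar output.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(8, 2)), rng.normal(size=8)
w2 = rng.normal(size=8)

def value_and_grad(x):
    pre = W1 @ x + b1
    active = pre > 0                     # current activation pattern
    value = w2 @ np.maximum(pre, 0)
    grad = (w2 * active) @ W1            # (sub)gradient of the output w.r.t. x
    return value, grad

# Many cheap first-order steps, clipped back into the box [-1, 1]^2.
x = np.zeros(2)
for _ in range(100):
    _, g = value_and_grad(x)
    x = np.clip(x + 0.05 * g, -1.0, 1.0)

print(x, value_and_grad(x)[0])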

6/N
February 17, 2026 at 4:35 PM
We can frame that earlier work as "MIP Walk": solve a smaller MILP around the current solution to find a better one.

Then we can frame our follow-up work as "LP Walk": solve an LP over the inputs associated with the current set of active neurons, then continue moving along that same direction.
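For intuition, here is a rough sketch of one LP-Walk-style step on a toy one-hidden-layer ReLU network (not our implementation, and it omits the "keep moving along that direction" part; weights and box bounds are made up, and scipy is assumed to be available):

import numpy as np
from scipy.optimize import linprog

# Toy one-hidden-layer ReLU network with a scalar output, maximized over [-1, 1]^2.
rng = np.random.default_rng(1)
W1, b1 = rng.normal(size=(4, 2)), rng.normal(size=4)
w2 = rng.normal(size=4)

def lp_walk_step(x):
    # Fix the activation pattern at x; the output is affine on that piece.
    pattern = (W1 @ x + b1) > 0
    c = (w2 * pattern) @ W1              # gradient of the output on this piece
    # Stay inside the piece: active rows keep W1 x + b1 >= 0, inactive ones <= 0.
    signs = np.where(pattern, -1.0, 1.0)
    A_ub = signs[:, None] * W1
    b_ub = -signs * b1
    res = linprog(-c, A_ub=A_ub, b_ub=b_ub, bounds=[(-1, 1)] * 2)  # maximize c @ x
    return res.x

print(lp_walk_step(np.zeros(2)))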

4/N
February 17, 2026 at 4:33 PM
Essentially, we are looking for better solutions over piecewise-linear functions modeled by neural networks. Each linear piece is defined by which neurons are active in a given part of the input space. While we can model this as a MILP and try to solve it exactly, that approach does not scale very well.
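To make that concrete, here is a tiny numpy sketch (toy weights, not our code) showing that once the set of active neurons is fixed, the network is affine on that piece of the input space:

import numpy as np

# Toy one-hidden-layer ReLU network with a scalar output.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 2)), rng.normal(size=4)
w2, b2 = rng.normal(size=4), 0.5

x = np.array([0.3, -0.7])
pre = W1 @ x + b1
pattern = pre > 0                        # which neurons are active at x
y = w2 @ np.maximum(pre, 0) + b2

# With the pattern fixed, the network reduces to y = c @ x + d on that piece.
c = (w2 * pattern) @ W1
d = (w2 * pattern) @ b1 + b2
print(y, c @ x + d)                      # the two values coincide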

2/N
February 17, 2026 at 4:30 PM
I am happy to share a new paper with Jiatai Tong, Yilin Zhu, and Sam Burer, which also just got accepted at CPAIOR 2026:

Optimization over Trained Neural Networks: Going Large with Gradient-Based Algorithms

1/N
February 17, 2026 at 4:29 PM
Those irrelevant perturbations are smaller but still present in tabular foundation models, as observed with in-context learning over TabPFN.

Yang suggests that this is a natural limitation on using this type of model for this type of task, which is a reasonable perspective.

4/4
February 15, 2026 at 2:02 PM
To investigate whether such variations could be due to randomness instead of a lack of robustness, at least in the simpler but less competitive case of in-context learning, Yang evaluated attention scores in the open model Llama-3 8B. The scores varied with row order, variable names, and variable values.

3/4
February 15, 2026 at 1:55 PM
In the simple setting of a linear function as ground truth, Yang found that supervised fine-tuning of gpt-4o is competitive with commonly used regression techniques, although it is absolute overkill. However, it is sensitive to task-irrelevant variations, such as a change in the number of digits.

2/4
February 15, 2026 at 1:44 PM
What can’t LLMs do? If you get caught using them for data fitting, you deserve a warning.

Mochen Yang talked at Iowa’s Tippie College of Business about the caveats of using LLMs for that purpose. After all, getting gpt-4o to predict 0.1875 entails predicting tokens “0”, “.”, “187”, and “5”!
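You can check that split yourself with OpenAI's tiktoken library (assuming it is installed and maps gpt-4o to its encoding):

import tiktoken

enc = tiktoken.encoding_for_model("gpt-4o")
tokens = enc.encode("0.1875")
print([enc.decode([t]) for t in tokens])   # the pieces the model actually predicts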

1/4
February 15, 2026 at 1:30 PM
Not that successful in tweaking this with AI…
December 27, 2025 at 2:36 AM
Make a Bond movie academic:

Facets are Forever
December 27, 2025 at 2:13 AM
Make a Bond movie academic:

(Sean Connery is unfunded)

You Only Travel Twice
December 27, 2025 at 1:54 AM
Make a Bond movie academic:

License to Queue

#orms
December 27, 2025 at 1:45 AM
Looking for an auspicious PhD? Try it the Iowa way!

We have a top-notch department with active research at the intersection of many important fundamental and emerging subjects. There is still time—the application deadline is three weeks away! More information here: tippie.uiowa.edu/phd/phd-busi...
December 23, 2025 at 2:14 PM
Happy to have contributed to Anna Mitchell’s cover story at The Daily Iowan about data centers in Iowa.

You can read it here: dailyiowan.com/2025/12/09/c...
December 12, 2025 at 1:40 PM
Today I am celebrating the symbolic thousandth citation on Google Scholar.
December 6, 2025 at 9:10 PM
Next week, I will be participating in a panel hosted by @gurobioptimization.bsky.social about training the next generation of mathematical optimizers.

You can join the conversation here: www.brighttalk.com/webcast/1918...
December 4, 2025 at 8:27 PM
And that’s a wrap for #informs2025

See you next year in San Francisco!
October 29, 2025 at 1:08 AM
In his presentation on optimization benchmark results at #informs2025, Hans Mittelmann raised the point that benchmarking is entering a new era, with AI being used under the hood by solvers in one way or another.
October 28, 2025 at 1:14 PM
Thinking of my PhD days, my recollection of Shabbir Ahmed was as one of those senior professors who would come, chat, and give advice to other people’s PhD students in workshops and conferences. He was a nice guy who left this world too early.
October 28, 2025 at 1:03 PM
The University of Iowa is launching a new ranking of business analytics departments (including equivalent departments in other business schools) at this year’s @informs.bsky.social Annual Meeting.

1/3
October 26, 2025 at 4:07 PM
Wrapping up, I caught up with former colleagues and students from Bucknell University who are here for the conference.

#informs2025
October 26, 2025 at 3:50 PM
Later, I sat again with Kayse Lee Maass for the panel Research Pathways at the MIF Undergraduate Workshop, hosted by Trilce Encarnacion and Ruben Proano Morales (thank you Austin Saragih for the pic!)

2/3
October 26, 2025 at 3:49 PM