Siddarth Venkatraman @ NeurIPS 2024
@hyperpotatoneo.bsky.social
330 followers 450 following 31 posts
PhD student at Mila | Diffusion models and reinforcement learning 🧐 | hyperpotatoneo.github.io
hyperpotatoneo.bsky.social
Honestly, it feels like, as an AI researcher, it might actually be worth it to throw your dignity aside and pay Elon for Twitter Blue to advertise your papers. Getting papers famous is literally just a social media clout game now.
hyperpotatoneo.bsky.social
See the second part of my post: yes, they are likely using explicit search to improve performance at test time. But the focus should be on the search through reasoning chains itself, which the model has been trained to do with RL. Even the explicit search requires the RL-trained value functions.
Reposted by Siddarth Venkatraman @ NeurIPS 2024
bennokrojer.bsky.social
Few fields reward quick pivoting as much as AI, or, conversely, punish the very thing a PhD is usually meant to be: sticking with one research direction for 5 years no matter what, going really deep, becoming a niche expert

for your research to be relevant in AI, you might wanna pivot every 1-2 years
hyperpotatoneo.bsky.social
I think the overlap between builders and researchers is larger in machine learning than in other disciplines.
hyperpotatoneo.bsky.social
You could still wrap this with explicit search techniques like MCTS if you have value functions for partial sequences (which would also be a product of the RL training). This could further improve performance, similar to fast vs slow policy in AlphaZero.
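To make that concrete, here's a minimal sketch of value-guided search over partial reasoning chains. It uses a simpler best-first search rather than full MCTS, and `policy` and `value_fn` are hypothetical interfaces standing in for the trained model and its RL value function, not anyone's actual API:

```python
import heapq

def value_guided_search(policy, value_fn, prompt, max_steps=64, n_expand=8):
    """Best-first search over partial reasoning chains.

    policy(seq, n) proposes n candidate continuations of a partial sequence;
    value_fn(seq) is an RL-trained value estimate for a partial sequence.
    Both interfaces are hypothetical, for illustration only.
    """
    # heapq is a min-heap, so store negated values to pop the best node first.
    frontier = [(-value_fn(prompt), prompt)]
    for _ in range(max_steps):
        neg_value, seq = heapq.heappop(frontier)
        if seq.endswith("<eos>"):  # a complete chain surfaced as the best node
            return seq
        for continuation in policy(seq, n=n_expand):
            child = seq + continuation
            heapq.heappush(frontier, (-value_fn(child), child))
    # Budget exhausted: return the highest-value partial chain.
    return heapq.heappop(frontier)[1]
```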
hyperpotatoneo.bsky.social
Saying o3 is just a “more principled search technique” is quite reductive. The o-series models don’t require “explicit search” strategies in the form of tree search wrapped in loops, etc. Instead, RL is used to train the model to “learn to search” using long CoT chains.
markriedl.bsky.social
Six months ago someone put a for-loop around GPT-4o and got 50% on the ARC-AGI test set and 72% on a held-out training set: redwoodresearch.substack.com/p/getting-50... Just sample 8000 times with beam search.

o3 is probably a more principled search technique...
Getting 50% (SoTA) on ARC-AGI with GPT-4o
You can just draw more samples
redwoodresearch.substack.com
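The “for-loop” here is essentially sample-then-select: draw many candidate programs, keep the ones that pass the task’s training examples. A minimal sketch, where `sample_solution` and `verify` are hypothetical stand-ins for the model call and the checker, not the blog post’s actual code:

```python
def sample_and_select(sample_solution, verify, task, n_samples=8000):
    """Sample many candidate programs, keep those that pass verification.

    sample_solution(task) queries the model for one candidate program;
    verify(program, examples) runs it against the task's training pairs.
    Both helpers are hypothetical, for illustration only.
    """
    survivors = []
    for _ in range(n_samples):  # the titular for-loop
        program = sample_solution(task)
        if verify(program, task["train"]):
            survivors.append(program)
    return survivors  # e.g. majority-vote their test outputs afterwards
```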
hyperpotatoneo.bsky.social
You’re correct, there are plenty of simulated environments we can’t solve yet. But do you consider it a desirable solution to spin up 1 million parallel instances of an environment, sped up 100x, and solve it with PPO in low wall-clock time?
hyperpotatoneo.bsky.social
This isn’t a general solution to RL. The point is to make learning algorithms sample-efficient. If the environment you are doing RL on is the real world, you can’t make the “environment go fast”.

With “infinite samples”, you can randomly sample policies until you stumble on one with high reward.
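As a toy illustration of that brute-force regime, here's a random-search “baseline”, assuming a hypothetical Gymnasium-style continuous-control `env`. It only works when samples are nearly free, which is exactly the assumption that breaks in the real world:

```python
import numpy as np

def random_policy_search(env, n_policies=100_000, horizon=200, seed=0):
    """Brute force: sample random linear policies, keep the best one.

    Assumes a Gymnasium-style continuous-control env (hypothetical here).
    Only viable when environment samples are nearly free.
    """
    rng = np.random.default_rng(seed)
    obs_dim = env.observation_space.shape[0]
    act_dim = env.action_space.shape[0]
    best_w, best_return = None, -np.inf
    for _ in range(n_policies):
        w = rng.normal(size=(act_dim, obs_dim))  # a random linear policy
        obs, _ = env.reset(seed=seed)            # same start state for a fair comparison
        total = 0.0
        for _ in range(horizon):
            action = np.clip(w @ obs, env.action_space.low, env.action_space.high)
            obs, reward, terminated, truncated, _ = env.step(action)
            total += reward
            if terminated or truncated:
                break
        if total > best_return:
            best_w, best_return = w, total
    return best_w, best_return
```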
Reposted by Siddarth Venkatraman @ NeurIPS 2024
rl-conference.bsky.social
If you're at NeurIPS, RLC is hosting an RL event from 8 till late at The Pearl on Dec. 11th. Join us, meet all the RL researchers, and spread the word!
hyperpotatoneo.bsky.social
Even his current claim that o1 is “better than most humans in most tasks” is pretty wild imo. What even counts as “most tasks” here? Obviously not any physical tasks, because there is no embodiment. Can o1 actually completely replace a human in any job? Can it manage a project from start to finish?
hyperpotatoneo.bsky.social
It also doesn’t help when OpenAI staff post about how o1 is already AGI (yes this happened today).

Unfortunately the dialogue is directed by those on either end of the spectrum (AI is useless vs AGI is already here) without much room for nuance.
hyperpotatoneo.bsky.social
www.newsweek.com/united-healt...

I have anecdotal evidence from a friend who works at a client company of a popular insurance firm. They are using shitty “AI models”, which are basically just CatBoost, to mass-process claims. They know the models are shit, but that’s also the point. Truly sickening.
A year before CEO shooting, lawsuit alleged UHC used AI to deny coverage
The lawsuit accuses UnitedHealthcare of using artificial intelligence to deny coverage to elderly patients.
www.newsweek.com
hyperpotatoneo.bsky.social
It is reductive to blame it all on a single CEO, but I find it hard to believe that you are “shocked” by this public reaction. UHC has the highest claim-denial rate among insurance providers, resulting in untold medical bankruptcies and preventable deaths. I’m shocked this doesn’t happen more often.
hyperpotatoneo.bsky.social
Subtlety and nuance go out the window when strong political feelings are thrown into the mix. I understand why AI researchers can get defensive or angry over toxic comments, but we should still try to understand the origin of people’s anger. Imo, right-wing Silicon Valley AI billionaires are the root.
hyperpotatoneo.bsky.social
I think the recent conflict between AI researchers and the anti-AI clique hints at the latter. This broad left leaning user base could fracture again as differences in opinions between the farther left and moderate factions get amplified.
hyperpotatoneo.bsky.social
This app is an interesting social experiment. Assuming Bluesky doesn’t just fizzle out, will hostile social relations as in Twitter resurface here too? If hostilities do return, will it be because conservatives come to this app, or will it be new political tensions within left leaning communities?
hyperpotatoneo.bsky.social
Another thing: let’s reflect on whether they actually have a point. When I think about it deeply, I am not even personally convinced that, in the grand scheme of things, AI is going to be a net good for humanity. So maybe the distaste is warranted and we’re the ones in the bubble?
hyperpotatoneo.bsky.social
As AI researchers, we shouldn’t demonize people outside our space who have a passionate distaste for AI. You have to understand that most of the pro-AI sentiment people see online comes from absolutely vile “AI-bros”, especially on twitter. We just need to distinguish ourselves as academics.
hyperpotatoneo.bsky.social
Yeah, it will definitely not be “true OT” in the end, but it works to get surprisingly smooth ODE paths which can be numerically integrated easily. You can train a CIFAR-10 flow model that generates high-quality images with 5-10 Euler steps.
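For reference, sampling such a flow is just a few explicit Euler steps on the learned velocity field. A minimal sketch, assuming a trained flow-matching network `velocity(x, t)` (a hypothetical interface):

```python
import torch

@torch.no_grad()
def euler_sample(velocity, shape, n_steps=8, device="cpu"):
    """Integrate dx/dt = v(x, t) from t=0 (noise) to t=1 (data) with Euler.

    velocity(x, t) is a trained flow-matching network (hypothetical here).
    With near-straight paths, even 5-10 steps can give good samples.
    """
    x = torch.randn(shape, device=device)  # x_0 ~ N(0, I)
    dt = 1.0 / n_steps
    for i in range(n_steps):
        t = torch.full((shape[0],), i * dt, device=device)
        x = x + dt * velocity(x, t)  # one explicit Euler step
    return x
```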
hyperpotatoneo.bsky.social
Sure, that argument works from a utilitarian perspective.

From a monkey-brain casual-user point of view, it looks ugly and outdated. And I think that is where the focus should be.
hyperpotatoneo.bsky.social
Does anyone have thoughts on which generative models also learn the best representation features for downstream tasks?

My guess is GANs are a dark horse whose latents carry important abstract features. But we haven’t explored this much since they are hard to train.
hyperpotatoneo.bsky.social
You could just have a verification system like the one on pre-Elon Twitter, where blue checkmarks denote verified accounts.