Lightnews — Scholar-powered news

Ramesh Manuvinakurike @rameshddrr.bsky.social · Dec 31

Finished reading this really nice survey paper. Love reading survey papers as a method to revise and fill in gaps in my understanding of a topic ..

arxiv.org/abs/2406.16838

From Decoding to Meta-Generation: Inference-time Algorithms for Large Language Models

One of the most striking findings in modern research on large language models (LLMs) is that scaling up compute during training leads to better results. However, less attention has been given to the b...

Ramesh Manuvinakurike @rameshddrr.bsky.social · Dec 14

Couldn't present my own poster at a NeurIPS workshop because I couldn't go to the poster hall with my baby ... apparently some workshops are 14 years + .. why 14+ ? No such issue with the main conference .. @neuripsconf.bsky.social

Ramesh Manuvinakurike @rameshddrr.bsky.social · Dec 13

@neuripsconf.bsky.social kudos ... a conference at this scale and such high quality .. truly mindblowing ...

Reposted by Ramesh Manuvinakurike

Nathan Lambert @natolambert.bsky.social · Dec 9

First slide deck for NeurIPS is done -- an overview of how I view post-training for applications.
A higher level summary on the key decisions along the way of scoping a problem, choosing a base model, optimization algorithm, etc. (+some thoughts on OpenAI's RL Finetuning).

https://buff.ly/3ZpY5IR

Reposted by Ramesh Manuvinakurike

Stanford NLP Group @stanfordnlp.bsky.social · Dec 9

The extraordinary recent takeover of ML/AI by #NLP is well-known but insufficiently reflected on.

Look at the @neuripsconf.bsky.social tutorials in 2024!

neurips.cc/virtual/2024...

14 tutorials; 6 have "LLM" in the title; 4 more cover foundation models, with large NLP coverage. That's > 70% 😲

NeurIPS 2024 Tutorials

Ramesh Manuvinakurike @rameshddrr.bsky.social · Dec 9

@neuripsconf.bsky.social has an awesome list of tutorials ... If only a Time-Turner were available ... Looking forward to these !

Reposted by Ramesh Manuvinakurike

Ben Burtenshaw @benburtenshaw.bsky.social · Dec 3

For anyone interested in fine-tuning or aligning LLMs, I’m running this free and open course called smol course. It’s not a big deal, it’s just smol.

🧵>>

Ramesh Manuvinakurike @rameshddrr.bsky.social · Dec 5

When I speak to non-tech people, I get equally scared and encouraged. On one hand they truly underestimate the power of AI, and on the other they don't care much about it ...

Ramesh Manuvinakurike @rameshddrr.bsky.social · Dec 4

In-context learning can be very difficult to understand. This thread from a long time back has some interesting points ...

www.reddit.com/r/MachineLea...

Ramesh Manuvinakurike @rameshddrr.bsky.social · Dec 4

Totally ... In our reading group we were doubting the presenter when they presented it !! It was an intern who presented, and it definitely annoyed them that we weren't able to understand "such a simple" concept ...

Knowledge of T5 helped some of us though ...

Ramesh Manuvinakurike @rameshddrr.bsky.social · Dec 4

Totally stealing this for my next presentation next week ... 🤣

Ramesh Manuvinakurike @rameshddrr.bsky.social · Dec 4

DSPy + Gradio + Hugging Face = Magic !!

Ramesh Manuvinakurike @rameshddrr.bsky.social · Dec 4

Listening to this awesome talk from @cgpotts.bsky.social .. so in love with the message here ..

As I'm building systems, the most common questions (and review comments) I get are about the LL(M)M I'm using and not the systems and the problems they're solving ..

youtu.be/vRTcE19M-KE?...

Ramesh Manuvinakurike @rameshddrr.bsky.social · Dec 3

I wonder if we can create "crying mindlessly" mode in the LLMs. You know, a mode where a baby cries a lot and stops immediately when shown something completely random and uninteresting ...

Ramesh Manuvinakurike @rameshddrr.bsky.social · Dec 1

One ability we humans have is to select role models according to our liking. I aspire to be as nice and humble as my primary school friend from 7th grade, and not as rude and arrogant as a certain billionaire. Our ability to select the samples we train our policy on astonishes me !!!

Ramesh Manuvinakurike @rameshddrr.bsky.social · Nov 30

One of the first papers I read during my PhD was "Referring as a Collaborative Process". I tried to test out ChatGPT and Claude on some of the problems mentioned.

For instance, something as simple as self-repairs in the same utterance confuses the model .. hmm ..
