vik / λh.(h h)
vikhyat.net
@vikhyat.net
teaching computers how to see
you sound like a markov chain. just parroting random things without any underlying comprehension
October 5, 2025 at 5:33 PM
wrong. that's exactly how human cognition works
October 5, 2025 at 4:56 PM
wrong yet highly confident is not a good look
October 5, 2025 at 8:52 AM
😭
September 19, 2025 at 2:21 PM
yeah… in latest torch on h100s it’s basically the same speed
November 28, 2024 at 7:47 AM
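a minimal sketch of the torch-native route being discussed here, assuming a recent torch build (the sdpa_kernel context manager needs roughly torch 2.3+): F.scaled_dot_product_attention can dispatch to a fused flash-attention kernel on hopper gpus, so there is no separate flash-attn wheel to hunt down.

```python
import torch
import torch.nn.functional as F
from torch.nn.attention import SDPBackend, sdpa_kernel

# toy shapes: (batch, heads, seq_len, head_dim); needs a cuda gpu
q = torch.randn(2, 8, 1024, 64, device="cuda", dtype=torch.bfloat16)
k = torch.randn_like(q)
v = torch.randn_like(q)

# torch's built-in sdpa picks a fused kernel when shapes/dtypes allow
out = F.scaled_dot_product_attention(q, k, v, is_causal=True)

# pin the backend to confirm the flash kernel is actually being used;
# this raises if the flash path is unavailable for these inputs
with sdpa_kernel(SDPBackend.FLASH_ATTENTION):
    out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
```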
We love links because we love the open web
It is so incredibly nice to have a social platform that values links again - a breath of fresh air in the poisoned ecosystem of the creator economy
November 28, 2024 at 5:52 AM
i heard they have prebuilt wheels but you have to go to the github repo to find them... just stopped using flash attention instead
November 28, 2024 at 5:51 AM
I noticed that my posts are not in this dataset. May I ask why? Have I offended you somehow?
November 27, 2024 at 8:45 PM
💀
November 22, 2024 at 10:55 AM
while we're ruining your weekend, here's another one i think you'll like 😂 arxiv.org/pdf/2410.06205
arxiv.org
November 22, 2024 at 10:45 AM
my post was prompted by this paper btw. i just realized i didn't share context arxiv.org/abs/2411.13476
When Precision Meets Position: BFloat16 Breaks Down RoPE in Long-Context Training
Extending context window sizes allows large language models (LLMs) to process longer sequences and handle more complex tasks. Rotary Positional Embedding (RoPE) has become the de facto standard due to...
arxiv.org
November 22, 2024 at 10:42 AM
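the tl;dr of the failure mode, as i understand it (my own toy repro, not code from the paper): bf16 keeps only 8 significand bits, so integer position ids above 256 are no longer exactly representable, and neighboring positions start rounding to the same value before the rotary angles are ever computed.

```python
import torch

# position ids cast to bfloat16: above 256, adjacent integers
# start rounding to the same value
pos = torch.arange(4096)
pos_bf16 = pos.to(torch.bfloat16)
collisions = (pos_bf16[1:] == pos_bf16[:-1]).sum().item()
print(f"{collisions} adjacent position pairs collide in bf16")

# the rotary angle is theta * pos, so position rounding distorts it;
# toy inverse frequencies for a 64-dim rope head
inv_freq = 1.0 / (10000.0 ** (torch.arange(0, 64, 2).float() / 64))
angles_fp32 = pos.float()[:, None] * inv_freq[None, :]
angles_bf16 = pos_bf16.float()[:, None] * inv_freq[None, :]
err = (angles_fp32 - angles_bf16).abs().max().item()
print(f"max angle error from bf16 positions: {err:.2f} rad")
```

positions collide in pairs past 256 and in groups of four past 512, so at long context the relative distances rope is supposed to encode get quantized away.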
is the amazing length extrapolation it enables really worth all of the suffering entailed in debugging when it breaks? 🥲
November 22, 2024 at 10:30 AM
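for anyone missing context on the rope part: a minimal sketch of the standard rotary formulation (my own toy version, not any particular library's). each pair of channels gets rotated by an angle proportional to the token's position, which is why relative offsets fall out of the q·k dot product, and what the length extrapolation tricks build on.

```python
import torch

def apply_rope(x: torch.Tensor, base: float = 10000.0) -> torch.Tensor:
    """Rotate channel pairs of x, shaped (seq_len, dim), by position-dependent angles."""
    seq_len, dim = x.shape
    pos = torch.arange(seq_len, dtype=torch.float32)
    inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2).float() / dim))
    angles = pos[:, None] * inv_freq[None, :]   # (seq_len, dim // 2)
    cos, sin = angles.cos(), angles.sin()
    x1, x2 = x[:, 0::2], x[:, 1::2]             # interleaved channel pairs
    # standard 2d rotation applied to every (x1, x2) pair
    rotated = torch.stack([x1 * cos - x2 * sin,
                           x1 * sin + x2 * cos], dim=-1)
    return rotated.reshape(seq_len, dim)

q = apply_rope(torch.randn(128, 64))
k = apply_rope(torch.randn(128, 64))
```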