vik / λh.(h h)
vikhyat.net
@vikhyat.net
teaching computers how to see
you sound like a markov chain. just parroting random things without any underlying comprehension
October 5, 2025 at 5:33 PM
wrong. that's exactly how human cognition works
October 5, 2025 at 4:56 PM
wrong yet highly confident is not a good look
October 5, 2025 at 8:52 AM
😭
September 19, 2025 at 2:21 PM
yeah… in latest torch on h100s it’s basically the same speed
November 28, 2024 at 7:47 AM
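a minimal sketch of the torch-native route being discussed here, assuming a recent torch build (the sdpa_kernel context manager needs roughly torch 2.3+): F.scaled_dot_product_attention can dispatch to a fused flash-attention kernel on hopper gpus, so there is no separate flash-attn wheel to hunt down.

```python
import torch
import torch.nn.functional as F
from torch.nn.attention import SDPBackend, sdpa_kernel

# toy shapes: (batch, heads, seq_len, head_dim); needs a cuda gpu
q = torch.randn(2, 8, 1024, 64, device="cuda", dtype=torch.bfloat16)
k = torch.randn_like(q)
v = torch.randn_like(q)

# torch's built-in sdpa picks a fused kernel when shapes/dtypes allow
out = F.scaled_dot_product_attention(q, k, v, is_causal=True)

# pin the backend to confirm the flash kernel is actually being used;
# this raises if the flash path is unavailable for these inputs
with sdpa_kernel(SDPBackend.FLASH_ATTENTION):
    out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
```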
We love links because we love the open web
It is so incredibly nice to have a social platform that values links again - a breath of fresh air in the poisoned ecosystem of the creator economy
November 28, 2024 at 5:52 AM
i heard they have prebuilt wheels but you have to go to the github repo to find them... just stopped using flash attention instead
November 28, 2024 at 5:51 AM
I noticed that my posts are not in this dataset. May I ask why? Have I offended you somehow?
November 27, 2024 at 8:45 PM
💀
November 22, 2024 at 10:55 AM
while we're ruining your weekend, here's another one i think you'll like 😂 arxiv.org/pdf/2410.06205
arxiv.org
November 22, 2024 at 10:45 AM
my post was prompted by this paper btw. i just realized i didn't share context arxiv.org/abs/2411.13476
When Precision Meets Position: BFloat16 Breaks Down RoPE in Long-Context Training
Extending context window sizes allows large language models (LLMs) to process longer sequences and handle more complex tasks. Rotary Positional Embedding (RoPE) has become the de facto standard due to...
arxiv.org
November 22, 2024 at 10:42 AM
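the tl;dr of the failure mode, as i understand it (my own toy repro, not code from the paper): bf16 keeps only 8 significand bits, so integer position ids above 256 are no longer exactly representable, and neighboring positions start rounding to the same value before the rotary angles are ever computed.

```python
import torch

# position ids cast to bfloat16: above 256, adjacent integers
# start rounding to the same value
pos = torch.arange(4096)
pos_bf16 = pos.to(torch.bfloat16)
collisions = (pos_bf16[1:] == pos_bf16[:-1]).sum().item()
print(f"{collisions} adjacent position pairs collide in bf16")

# the rotary angle is theta * pos, so position rounding distorts it;
# toy inverse frequencies for a 64-dim rope head
inv_freq = 1.0 / (10000.0 ** (torch.arange(0, 64, 2).float() / 64))
angles_fp32 = pos.float()[:, None] * inv_freq[None, :]
angles_bf16 = pos_bf16.float()[:, None] * inv_freq[None, :]
err = (angles_fp32 - angles_bf16).abs().max().item()
print(f"max angle error from bf16 positions: {err:.2f} rad")
```

positions collide in pairs past 256 and in groups of four past 512, so at long context the relative distances rope is supposed to encode get quantized away.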
is the amazing length extrapolation it enables really worth all of the suffering entailed in debugging when it breaks? 🥲
November 22, 2024 at 10:30 AM
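for anyone missing context on the rope part: a minimal sketch of the standard rotary formulation (my own toy version, not any particular library's). each pair of channels gets rotated by an angle proportional to the token's position, which is why relative offsets fall out of the q·k dot product, and what the length extrapolation tricks build on.

```python
import torch

def apply_rope(x: torch.Tensor, base: float = 10000.0) -> torch.Tensor:
    """Rotate channel pairs of x, shaped (seq_len, dim), by position-dependent angles."""
    seq_len, dim = x.shape
    pos = torch.arange(seq_len, dtype=torch.float32)
    inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2).float() / dim))
    angles = pos[:, None] * inv_freq[None, :]   # (seq_len, dim // 2)
    cos, sin = angles.cos(), angles.sin()
    x1, x2 = x[:, 0::2], x[:, 1::2]             # interleaved channel pairs
    # standard 2d rotation applied to every (x1, x2) pair
    rotated = torch.stack([x1 * cos - x2 * sin,
                           x1 * sin + x2 * cos], dim=-1)
    return rotated.reshape(seq_len, dim)

q = apply_rope(torch.randn(128, 64))
k = apply_rope(torch.randn(128, 64))
```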