aria 🍊
@aurelium.me
450 followers 340 following 550 posts
she/her software infra @ arcee, opinions my own
if you lean too heavily on ICL, you're going to eat into the benefits of DSA-style top-k sparse attention. you'll need to keep increasing your top-k value just to pull in enough context to answer a question
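a minimal sketch of the trade-off, assuming a generic single-head top-k attention (not DeepSeek's actual DSA indexer or kernels):

```python
import torch
import torch.nn.functional as F

def topk_sparse_attention(q, k, v, top_k):
    """Each query attends only to its top_k highest-scoring keys,
    so per-query cost scales with top_k rather than sequence length."""
    scores = q @ k.T / k.shape[-1] ** 0.5        # (T, T) raw scores
    idx = scores.topk(top_k, dim=-1).indices     # keep top_k per query
    mask = torch.full_like(scores, float("-inf"))
    mask.scatter_(-1, idx, 0.0)                  # unmask selected keys only
    return F.softmax(scores + mask, dim=-1) @ v

# the failure mode in the post: if answering needs more than top_k tokens
# of in-context material, relevant keys get masked out, and the only fix
# is raising top_k, which erodes the sparsity speedup.
```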
probably, but I am inclined to believe that an 80B-A1B model with a standard amount of stuff memorized will be a much better experience than even an ideal 1B "cognitive core" model with web search, at the cost of a bit of SSD space
imo, the key to the "cognitive core" is less fitting more and more capabilities into fewer parameters and more cranking sparsity up to a degree that makes total parameter count irrelevant
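rough arithmetic, with illustrative numbers rather than any real config, for why active parameters dominate per-token cost in a sparse MoE:

```python
# hypothetical "80B-A1B"-style MoE: 80B total params, ~1B active per token
total_params  = 80e9   # what you pay for in SSD / RAM
active_params = 1e9    # dense layers plus the few experts routed per token

# forward-pass compute tracks active params (~2 FLOPs per active param),
# so total param count barely touches decode latency
flops_per_token = 2 * active_params
print(f"active fraction: {active_params / total_params:.1%}")    # 1.2%
print(f"~{flops_per_token / 1e9:.0f} GFLOPs per decoded token")  # ~2
```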
i don't entirely like that this is the case, tbh. Qwen's inflation of benchmark scores by training on billions of tokens of "technically not the test set" is, imo, dishonest and does bad things to their models' downstream performance while also disadvantaging anyone who doesn't do the same
at Arcee our post-training library has custom model implementations built in for two architectures: Qwen and our own

they have fully taken the mantle of the Standard Open Model from llama at this point
i work a schedule I call "???10". i work an unspecified amount of time a day at random times, 90% of which is waiting idly for things to finish
can confirm. right now I am "working on machine learning" by reading posts and eating potato chips while I wait 30 minutes to see if a bug manifests on an earlier checkpoint
they're the kind of lab that drops a ~solution to the problem of escalating compute costs for long-context inference in a release called "V3.2"
Reposted by aria 🍊
claude sonnet 3.6's yellowstone vacation
the base text-completion LLM definitely *can* be genuinely bigoted in a convincing way. it's not outside of its capabilities

I just think that if you optimize in post-training for "be a useful chatbot", the model will naturally kick itself out of basins associated with misanthropy and hatred
like them or not, the AI-safetyists and AGI true believers are the ones that stop OpenAI (and to a lesser extent Anthropic) from becoming a slightly-less-partisan xAI
that and synthetic porn for the most pathetic men alive
it's not as if he revealed any new information. anyone paying attention to anything besides idle speculation about secret proprietary model performance would not find this especially novel
I disagree vehemently with the vision of the world that most of these frontier labs have, but this is a far cry from that
the only big AI lab whose employees you regularly see publicly arguing to repeal women's suffrage is xAI. at this point anyone who survived the initial exodus(es) is really suspect to me
holy shit that is a lot of people. and I was one of them, probably!
Seattle No Kings from the monorail
finding myself in this picture was like a game of Where's Waldo
Reposted by aria 🍊
Taking off my balaclava to reveal a second, identical balaclava underneath. Everyone at the fusion intelligence center groans at my shit
the RevComs in cap hill succeeded at confusing a bunch of people about where the protest was

that's an achievement I guess
Seattle No Kings was amazing, and now I will collapse unconscious at home
finally invented The VRAM Torturer from the hit novel "The VRAM Torturer is a pretty good quality-of-life feature for your training library"
besides the public safety benefits, there would be a lot of social benefits to putting drunk drivers in prison the first time around. think of all the domestic abuse that could be stopped!
this video is how I find out a DUI is only a misdemeanor in FL (and many other states)... how the fuck? explains why so many people die from drunk drivers I guess
i do think the officer was named "Lane" or something, bc he says it a lot and it sounds more like "Lane, don't do this to me"