tkukurin.github.io
neuro has nice terms "working memory gating", "attentional selection"
but I think the point of keeping "learning" in there is exactly that the _outcome_ is better performance on the task at hand.
on the former - it's costly if one assumes a grok invocation per page load. but IMO scaling a grok-twitter-recsys is amenable to good system dimensioning (e.g. a model cascade + periodic batching)
from that perspective, doesn't seem preposterous 🤷
what you'd normally do for a closed platform is (1) scrape posts, (2) write your own ranking/clf. algo to surface better content (tags are useful but easily co-opted; the platform's recsys is necessarily deficient due to partial observability).
_native_ support for scaling (2) I find exciting
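A toy sketch of step (2), assuming you've labeled some posts yourself — word-count features and a linear preference score, stdlib only (any real version would use embeddings):

```python
from collections import Counter

def train_weights(labeled: list[tuple[str, int]]) -> Counter:
    # labeled: (post_text, +1 liked / -1 disliked); each token accumulates
    # the labels of posts it appeared in
    w = Counter()
    for text, y in labeled:
        for tok in set(text.lower().split()):
            w[tok] += y
    return w

def score(post: str, w: Counter) -> int:
    return sum(w[t] for t in set(post.lower().split()))

def rerank(posts: list[str], w: Counter) -> list[str]:
    # override the platform's ordering with your own preference signal
    return sorted(posts, key=lambda p: score(p, w), reverse=True)
```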
IMO the domain model is intelligence as part user (akin to "phds doing lin reg" in finance), part interface (RLHF), part model
becomes relevant in some intricate global-recommender-system situation, but google already has years of experience dealing with this.
(eg via diff-in-diffs perf on coding tasks nodejs vs swift)
(I assume gem was predominantly google3-trained based on its coding quirks)
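The diff-in-diffs probe in the parenthetical reduces to a difference of score gaps (numbers below are made up for illustration): compare the model's Node.js-vs-Swift gap against a baseline model's gap, so general coding ability cancels and what remains is differential training-data exposure.

```python
def diff_in_diffs(model_node: float, model_swift: float,
                  base_node: float, base_swift: float) -> float:
    # (model's ecosystem gap) - (baseline's ecosystem gap):
    # a positive value suggests the model saw disproportionately more
    # Node.js-like code than the baseline did
    return (model_node - model_swift) - (base_node - base_swift)
```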
there's a nice exposition of this topic in @rockt.ai and others' position paper arxiv.org/abs/2406.042...
which, to prove a meta-point, delivers a rather "obvious" message with exposition clarity deemed novel enough for ICML poster acceptance :)
RAG implies external memory; reasoning post-trained model generates more artefacts (/tokens/"writes") as a result of computation (which is also where "dynamic" makes a difference)
huggingface.co/spaces/Huggi...
it would be intriguing to see the progress if you're willing to share at some point!
> Every app has a different design [optimized based on your activity ...] each app trying to make you do different things in uniquely annoying ways [...] low-quality clickbait
anyhow we all know how it goes
www.wheresyoured.at/never-forgiv...
all of art ultimately exists on a particular grounded low entropy axis (nice example by Kurt Vonnegut youtu.be/4_RUgnC1lm8?...)
(in which sense, I guess a similar analogy exists: internal monologue vs. writing)
> Control energy for state transitions decreases over the course of repeated task trials