They propose PoPE (Polar Coordinate Position Embeddings), which eliminates the what-where.
Connectionism will cover topics as varied as their research is: from kernel numerics to prompt engineering. Here, they share what they are working on.
- research the best way to do x
- decide your system is "special" and reject it
- fight through days of debugging and refactoring
- "independently" arrive at research conclusion
BlueSky: *crickets*
r/MachineLearning: mostly constructive feedback
HackerNews: this guy sucks
TechRxiv: we’re backed by IEEE, and ghost people who ask for status updates after we miss our self-imposed deadlines.
Glad I burned billable hours to work on this
I fixed neural arithmetic. Division works. Extrapolation works. 10^-16 error.
Turns out: training distribution matters, complex numbers > log space, and those NALU weights were calculable all along.
Proof: hillspace.justindujardin.com
#MachineLearning
Introducing OLMoTrace, a new feature in the Ai2 Playground that begins to shed some light. 🔦
Excited to share my Google DeepMind internship results, which reveal the fascinating dynamics behind factual knowledge acquisition in LLMs!