Daniel Khashabi
@danielkhashabi.bsky.social
I play with intuitions and data.

Now: @jhuclsp @jhucompsci
Past: @allen_ai @uwnlp @Penn @cogcomp @Illinois_Alma @MSFTResearch
We show that Gold-Panning (GP) provably identifies a target among N documents in O(log N) rounds, ensuring scalability to many-document settings.

More in the paper: arxiv.org/pdf/2510.09770
February 11, 2026 at 10:48 PM
Specifically, it searches over long contexts by (i) reordering documents to concentrate high-belief items in highly “diagnostic” positions, and (ii) updating beliefs about document relevance from model outputs.
February 11, 2026 at 10:48 PM
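To make the loop above concrete, here is a minimal sketch of one belief-update-and-reshuffle iteration. The helper names (query_model, the diagnostic-position list) and the simple hit/false-positive likelihoods are illustrative assumptions, not the paper's exact formulation.

```python
# Minimal sketch of the shuffle / belief-update loop described above.
# `query_model` and the likelihood values are illustrative assumptions.
import numpy as np

def diagnostic_order(beliefs, positions_by_diagnosticity):
    """Place high-belief documents into the positions where the model
    attends most reliably (e.g., the ends of the context)."""
    doc_ranks = np.argsort(-beliefs)              # most-believed docs first
    order = np.empty_like(doc_ranks)
    order[positions_by_diagnosticity] = doc_ranks
    return order                                  # order[pos] = doc index

def gold_panning(docs, query, query_model, positions_by_diagnosticity,
                 rounds=10, p_hit=0.8, p_false=0.1):
    n = len(docs)
    beliefs = np.full(n, 1.0 / n)                 # uniform prior over relevance
    for _ in range(rounds):
        order = diagnostic_order(beliefs, positions_by_diagnosticity)
        # query_model is assumed to return the set of (original) doc indices
        # that the model's answer appears to rely on.
        supported = query_model([docs[i] for i in order], query)
        likelihood = np.where(np.isin(np.arange(n), list(supported)), p_hit, p_false)
        beliefs = beliefs * likelihood            # Bayesian update
        beliefs /= beliefs.sum()
        if beliefs.max() > 0.95:                  # confident enough to stop
            break
    return int(beliefs.argmax())
```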
We introduce ⭐𝐆𝐨𝐥𝐝-𝐏𝐚𝐧𝐧𝐢𝐧𝐠⭐, a black-box Bayesian framework that, at inference time, strategically and iteratively shuffles documents to overcome positional bias.
February 11, 2026 at 10:48 PM
LLMs continue to struggle with long-context tasks—such as needle-in-a-haystack problems—because of “positional bias.” What can we do if we only have 𝘣𝘭𝘢𝘤𝘬-𝘣𝘰𝘹 access to the model? (i.e., we can’t modify the model weights or attention patterns, as is often the case with API models.)
February 11, 2026 at 10:48 PM
* Outperforms top frontier models on CARDBiomedBench
* Lives inside a fully open platform, ready for experimentation, benchmarking, and real-world science

🧪 Read the full blog: lnkd.in/emJjTAue
🔍 Try it today on: biomedarena.ai
BiomedArena.AI - Transparent AI Model Evaluation Platform
Compare and evaluate leading AI models side-by-side through community voting.
biomedarena.ai
January 22, 2026 at 5:00 AM
We have been busy building our science co-pilot, a Genomics AI Agent at @DataTecnica, specialized in Alzheimer’s and neurodegenerative disease research.

This system:
* Synthesizes complex biomedical data across literature and genomics databases
January 22, 2026 at 5:00 AM
Postdoc positions:
ai.jhu.edu/careers/pos...

Applications are due January 23, 2026.

Positions are for 2 years with the possibility of an extension.
Postdoctoral Fellowship Program - Johns Hopkins Data Science and AI Institute
The Johns Hopkins Data Science and AI (DSAI) Institute welcomes applications for its postdoctoral fellowship program, seeking disciplinarily diverse scholars to advance foundational methods of data science and artificial intelligence,…
ai.jhu.edu
January 21, 2026 at 5:00 AM
Overdue update — CARDBiomedBench will be featured in @LancetDigitalH! 🎉

If you're looking for a high-quality and challenging science benchmark for your AI model, this could be it!

🤗 Dataset: huggingface.co/datasets/NI...
📄 Paper: biorxiv.org/content/10....
x.com/DanielKhash...
CARDBiomedBench: A Benchmark for Evaluating Large Language Model Performance in Biomedical Research
Backgrounds Biomedical research requires sophisticated understanding and reasoning across multiple specializations. While large language models (LLMs) show promise in scientific applications, their capability to safely and accurately support complex biomedical research remains uncertain. Methods We present CARDBiomedBench , a novel question-and-answer benchmark for evaluating LLMs in biomedical research. For our pilot implementation, we focus on neurodegenerative diseases (NDDs), a domain requiring integration of genetic, molecular, and clinical knowledge. The benchmark combines expert-annotated question-answer (Q/A) pairs with semi-automated data augmentation, drawing from authoritative public resources including drug development data, genome-wide association studies (GWAS), and Summary-data based Mendelian Randomization (SMR) analyses. We evaluated seven private and open-source LLMs across ten biological categories and nine reasoning skills, using novel metrics to assess both respon
www.biorxiv.org
January 20, 2026 at 9:45 PM
We're extremely thankful to the Evo2 team (@BrianHie @pdhsu @garykbrixi @mgdurrant @MichaelPoli6 etc.). Not only do these models help advance biomedical research; we now see that they can also help the AI community better understand the fundamentals of pre-training.
November 18, 2025 at 5:27 PM
Draft: huggingface.co/papers/2511...

Huge thanks to @N8Programs for leading the work, and to collaborators @anqi_liu33 @aamixsh @mrevsine @mike_schatz.
Paper page - Genomic Next-Token Predictors are In-Context Learners
huggingface.co
November 18, 2025 at 5:27 PM
𝗗𝗼𝗲𝘀 𝘁𝗵𝗶𝘀 𝗺𝗲𝗮𝗻 𝗵𝘂𝗺𝗮𝗻 𝗹𝗮𝗻𝗴𝘂𝗮𝗴𝗲 𝘀𝘁𝗿𝘂𝗰𝘁𝘂𝗿𝗲 𝗶𝘀 𝗶𝗿𝗿𝗲𝗹𝗲𝘃𝗮𝗻𝘁? No! But it suggests there may be universal distributional properties across different languages (human, DNA, etc.) that yield ICL. It remains an open question what these properties are.
November 18, 2025 at 5:27 PM
𝗗𝗼𝗲𝘀 𝗜𝗖𝗟 𝗶𝗻 𝗴𝗲𝗻𝗼𝗺𝗶𝗰 𝘃𝘀 𝗹𝗮𝗻𝗴𝘂𝗮𝗴𝗲 𝗺𝗼𝗱𝗲𝗹𝘀 𝗮𝗰𝘁 𝗶𝗱𝗲𝗻𝘁𝗶𝗰𝗮𝗹𝗹𝘆? No! While they share macro-level ICL trends, each shows domain-specific inductive biases traceable to properties of DNA vs human language.
November 18, 2025 at 5:27 PM
𝗪𝗵𝘆 𝗶𝘁 𝗺𝗮𝘁𝘁𝗲𝗿𝘀: To our knowledge, this is the first evidence of emergent ICL in non-[human]language symbolic sequences. It suggests that ICL is modality-agnostic, and a general consequence of large-scale autoregressive training on rich data distributions.
November 18, 2025 at 5:27 PM
This lets us compare Evo2 (genomic) vs Qwen3 (language) under matched few-shot prompts.
November 18, 2025 at 5:27 PM
𝗛𝗼𝘄 𝗱𝗶𝗱 𝘄𝗲 𝗰𝗼𝗺𝗽𝗮𝗿𝗲 𝗴𝗲𝗻𝗼𝗺𝗶𝗰 𝘃𝘀 𝗹𝗮𝗻𝗴𝘂𝗮𝗴𝗲 𝗺𝗼𝗱𝗲𝗹𝘀? We built a suite of symbolic bitstring-reasoning tasks and encoded them two ways: (1) genomic alphabet (A/T/C/G) and (2) linguistic alphabet (digits).
November 18, 2025 at 5:27 PM
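A toy illustration of the two encodings, using parity over bitstrings as the stand-in task; the actual task suite and prompt format in the paper are richer, and the bit-to-nucleotide mapping here is just an assumption for illustration.

```python
# Toy illustration of the two matched encodings described above,
# using parity as a simple bitstring task.
import random

GENOMIC = {"0": "A", "1": "T"}     # bits mapped onto nucleotides (assumed mapping)
LINGUISTIC = {"0": "0", "1": "1"}  # bits mapped onto digit characters

def encode(bits, alphabet):
    return "".join(alphabet[b] for b in bits)

def parity(bits):
    return str(bits.count("1") % 2)

def few_shot_prompt(alphabet, n_shots=4, length=6):
    """Build a matched few-shot prompt: one 'input label' demonstration per line,
    ending with an unlabeled query the model must complete."""
    lines = []
    for _ in range(n_shots):
        bits = "".join(random.choice("01") for _ in range(length))
        lines.append(f"{encode(bits, alphabet)} {encode(parity(bits), alphabet)}")
    query = "".join(random.choice("01") for _ in range(length))
    lines.append(encode(query, alphabet))
    return "\n".join(lines), encode(parity(query), alphabet)

genomic_prompt, genomic_label = few_shot_prompt(GENOMIC)        # fed to Evo2
linguistic_prompt, linguistic_label = few_shot_prompt(LINGUISTIC)  # fed to Qwen3
```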
→ similar log-linear gains with more shots
→ similar improvement with model scale
... all learned purely from DNA (nucleotide) sequences.
November 18, 2025 at 5:27 PM
Thrilled to share our latest result: 𝗚𝗲𝗻𝗼𝗺𝗶𝗰🧬 𝗺𝗼𝗱𝗲𝗹𝘀 𝘁𝗿𝗮𝗶𝗻𝗲𝗱 𝙤𝙣𝙡𝙮 𝗼𝗻 '𝗻𝗲𝘅𝘁-𝗻𝘂𝗰𝗹𝗲𝗼𝘁𝗶𝗱𝗲 𝗽𝗿𝗲𝗱𝗶𝗰𝘁𝗶𝗼𝗻' 𝗲𝘅𝗵𝗶𝗯𝗶𝘁 𝗜𝗖𝗟!

What's remarkable is that their overall pattern closely mirrors that of LLMs:
→ similar few-shot pattern induction
November 18, 2025 at 5:27 PM
For years since the GPT-2 paper, emergent in-context learning (ICL) from 'next-token' training has been treated as something deeply tied to 𝐡𝐮𝐦𝐚𝐧 𝐥𝐚𝐧𝐠𝐮𝐚𝐠𝐞. But … is it?
November 18, 2025 at 5:27 PM
Big congrats to @jackjingyuzhang for being named an Amazon AI PhD Fellow! 🎉 Grateful for @AmazonScience @RohitPrasadAI’s support as we work together to advance AI research at JHU.
x.com/jackjingyuz...
October 24, 2025 at 4:08 PM
𝗦𝗲𝗲 𝘁𝗵𝗲 𝗱𝗲𝘁𝗮𝗶𝗹𝘀 𝗼𝗳 𝘁𝗵𝗲 𝗳𝗶𝗻𝗱𝗶𝗻𝗴𝘀: huggingface.co/papers/2509...

Work led by @aamixsh, in collaboration with @anqi_liu33.
@HopkinsEngineer @JHUCompSci

x.com/aamixsh/sta...
Paper page - IA2: Alignment with ICL Activations Improves Supervised Fine-Tuning
huggingface.co
October 3, 2025 at 2:23 PM
For 2️⃣, we introduce 𝑨𝒄𝒕𝒊𝒗𝒂𝒕𝒊𝒐𝒏 𝑨𝒍𝒊𝒈𝒏𝒎𝒆𝒏𝒕 (𝑰𝑨𝟐) -- a method that 𝘥𝘪𝘴𝘵𝘪𝘭𝘭𝘴 𝘐𝘊𝘓 𝘢𝘤𝘵𝘪𝘷𝘢𝘵𝘪𝘰𝘯𝘴 𝘪𝘯𝘵𝘰 𝘵𝘩𝘦 𝘱𝘢𝘳𝘢𝘮𝘦𝘵𝘦𝘳𝘴 𝘰𝘧 𝘢 𝘱𝘳𝘦-𝘵𝘳𝘢𝘪𝘯𝘦𝘥 𝘮𝘰𝘥𝘦𝘭. Then, running SFT on top of this "primed" model leads to consistent gains over vanilla SFT.
October 3, 2025 at 2:23 PM
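For intuition, here is a minimal sketch of what "distilling ICL activations into parameters" could look like, assuming a HuggingFace-style causal LM and an MSE objective on final-layer hidden states over the answer span. The layer choice, loss, and data handling are assumptions, not the exact IA2 recipe.

```python
# Minimal sketch of distilling ICL activations into a model's parameters.
# Assumes both prompts end with the same query + answer span; layer and loss
# choices are assumptions, not the paper's exact recipe.
import torch
import torch.nn.functional as F

def ia2_priming_step(student, frozen_teacher, icl_inputs, zero_shot_inputs,
                     answer_len, optimizer):
    """One priming step: the student (run zero-shot) is trained to reproduce
    the teacher's activations obtained with in-context demonstrations."""
    with torch.no_grad():
        teacher_h = frozen_teacher(**icl_inputs,
                                   output_hidden_states=True).hidden_states[-1]
        target = teacher_h[:, -answer_len:, :]     # answer-span activations

    student_h = student(**zero_shot_inputs,
                        output_hidden_states=True).hidden_states[-1]
    pred = student_h[:, -answer_len:, :]           # same span, no demonstrations

    loss = F.mse_loss(pred, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# After priming, ordinary SFT is run on top of the primed student,
# which is where the reported gains over vanilla SFT come from.
```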
On 1️⃣, building on prior findings, we find that ICL and SFT trigger distinct ⚡activation⚡ patterns -- an additional signal that ICL and SFT operate differently. We also find that ICL is generally more calibrated than SFT, though sometimes at the cost of accuracy.
October 3, 2025 at 2:23 PM
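To make "more calibrated" concrete, one standard way to quantify it is expected calibration error (ECE) over answer confidences; this generic sketch is not necessarily the exact metric used in the paper.

```python
# Generic expected-calibration-error (ECE) computation: a standard way to
# compare the calibration of ICL vs SFT predictions.
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Average |accuracy - confidence| over equal-width confidence bins,
    weighted by the fraction of predictions falling in each bin."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            ece += mask.mean() * abs(correct[mask].mean() - confidences[mask].mean())
    return ece

# e.g., compare expected_calibration_error(icl_confs, icl_correct)
# against expected_calibration_error(sft_confs, sft_correct).
```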