Joe Janizek
@joejanizek.bsky.social
physician-scientist, interested in AI safety/interpretability in biology/medicine. jjanizek.github.io
yes, but only because of a final jeopardy back in 2020 that was actually trying to clue a different poem
February 12, 2025 at 3:51 AM
the different midjourney variations are so interesting. like, the whole row of guys w/ really crazy eyes, not sure where it got that from the prompt, but has a real Ilya Repin -- Ivan the Terrible / Goya -- Saturn vibe
February 12, 2025 at 12:19 AM
Think how much performance we might be leaving on the table by not training classifiers on increasingly invasive biometrics. Pictured: medium-term radiologist-AI centaur-configuration possibility
February 11, 2025 at 11:55 PM
If this was an AI paper, you’d brand it as an interpretability technique that discovers a latent “node detection” circuit in the neural network
pubs.rsna.org/doi/10.1148/...
February 11, 2025 at 11:40 PM
One of the most remarkable parts of the s1 paper, IMO, is how much AI progress drives further AI progress. For s1, authors needed reasoning traces for SFT — these were generated with Gemini. The questions those reasoning traces were generated for needed to be difficult — so they measured ..
February 3, 2025 at 4:12 PM
Posting your GitHub contributions is passé — in 2025 I want to see your Anki review calendar
February 1, 2025 at 5:14 PM
Reading about "majority label bias" in prompting/ICL (see arxiv.org/pdf/2102.09690, arxiv.org/pdf/2312.16549). It seems like an interesting behavior, and not clearly "faulty" -- i.e. calibrating the output of the model to the base rate frequency of the label in your prompt?
January 29, 2025 at 7:49 PM
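A minimal sketch of the "base rate" reading of majority label bias, not taken from either linked paper. It measures the label frequencies among the demonstrations in a prompt and then divides them out of a model's predicted label probabilities. The function names and the example numbers are hypothetical, and the sketch assumes every predicted label appears at least once among the demonstrations.

```python
from collections import Counter
from typing import Dict, List


def prompt_label_base_rate(demo_labels: List[str]) -> Dict[str, float]:
    """Frequency of each label among the in-context demonstrations."""
    counts = Counter(demo_labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


def divide_out_prior(probs: Dict[str, float], prior: Dict[str, float]) -> Dict[str, float]:
    """Divide predicted probabilities by the prompt's label prior and renormalize.

    Assumes every key in `probs` also appears in `prior` (hypothetical setup).
    """
    adjusted = {label: probs[label] / prior[label] for label in probs}
    norm = sum(adjusted.values())
    return {label: value / norm for label, value in adjusted.items()}


# Hypothetical example: 3 of 4 demonstrations are labeled "pos".
prior = prompt_label_base_rate(["pos", "pos", "pos", "neg"])
# A model that simply echoes the prompt's base rate carries no extra evidence:
evidence = divide_out_prior({"pos": 0.75, "neg": 0.25}, prior)
```

If the adjusted distribution comes out uniform, the prediction matched the prompt's label frequencies exactly, which is the "calibrated to the base rate" behavior the post describes.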
the best part of a dedicated research month is getting to spend time reading and running experiments again, but a close runner up is getting to see the light of day in the winter
January 25, 2025 at 11:23 PM
I have to get in the best shape of my life in the next 6 months
December 21, 2024 at 2:35 AM
Every ICU textbook on mechanical ventilation likes to start with a Vesalius quote, but I have not yet seen one that’s included any of the drawings of the experiments. Pretty gruesome!
December 19, 2024 at 4:15 AM
Used llama 3.1 8b to embed my tweets, then d3.js to make an interactive plot that shows tweets when i scroll over points. It's fun to find "regions" of my tweet-space: (1) the zone of AI interpretability, (2) COVID-posting in 2020, and most importantly (3) "LFGGGGGG"-space
December 13, 2024 at 12:36 PM
This paper is so cool — my first reaction is that it makes me wish I was better read in stats. Authors use Candes et al’s Model-X knockoffs framework to control Type I error rate of feature interaction detection pipelines arxiv.org/abs/2408.17016
December 11, 2024 at 9:39 PM
An interesting trade-off: trying to constrain info to pass through bottleneck w/o skip connections while capturing info about fine resolution. I wonder about multi-resolution autoencoders, maybe separate models for different resolutions
www.medrxiv.org/content/10.1...
December 10, 2024 at 9:09 PM
One of my favorite “why didn’t I think of that” results in recent months — if your API requests are expensive due to long prompts with many demonstrations, you can just batch a bunch of questions that share the same demos into a single prompt without losing much accuracy arxiv.org/html/2405.09...
December 3, 2024 at 3:45 AM
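A rough sketch of the batching idea from the post above: the demonstrations are written once, and several questions are appended to the same prompt. The prompt layout, the `Q[i]`/`A[i]` tags, and both function names are assumptions for illustration, not the format used in the linked paper, and no real API is called.

```python
from typing import List, Tuple


def build_batched_prompt(demos: List[Tuple[str, str]], questions: List[str]) -> str:
    """Build one prompt that shares a single demo block across all questions."""
    demo_block = "\n\n".join(f"Q: {q}\nA: {a}" for q, a in demos)
    instruction = "Answer each question below on its own line as 'A[i]: <answer>'."
    numbered = "\n".join(f"Q[{i}]: {q}" for i, q in enumerate(questions, start=1))
    return f"{demo_block}\n\n{instruction}\n{numbered}"


def parse_batched_answers(response: str, n: int) -> List[str]:
    """Pull 'A[i]: ...' lines out of a response; missing answers stay empty."""
    answers = [""] * n
    for line in response.splitlines():
        line = line.strip()
        if line.startswith("A[") and "]:" in line:
            index_text, answer = line[2:].split("]:", 1)
            if index_text.isdigit() and 1 <= int(index_text) <= n:
                answers[int(index_text) - 1] = answer.strip()
    return answers


# Hypothetical usage: one demo, two questions, one request instead of two.
prompt = build_batched_prompt([("2+2?", "4")], ["3+3?", "Capital of France?"])
parsed = parse_batched_answers("A[1]: 6\nA[2]: Paris", n=2)
```

The saving comes from paying for the demonstration tokens once per batch rather than once per question; the parser tolerates missing answers so a dropped question can be retried on its own.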
The moral of the post is to not crush a second energy drink while trying to review a paper that you’re really not enjoying, lest you get distracted and write something else instead
December 1, 2024 at 9:02 PM
Overcaffeinated this morning while trying to peer review a medical AI journal paper, and accidentally wrote a blog post on bad baselines, frictionless reproducibility, and accelerating medical AI research with scientific LM programs github.com/jjanizek/blo...
December 1, 2024 at 9:02 PM
Preparing to blog “rads style”*
*with a dictaphone
November 29, 2024 at 10:19 PM
Attaching a selfie of me smiling agreeably to every reply
November 27, 2024 at 5:08 AM
Very cool paper — A stylistic thing incidental to the main results of your work, but I like this convention of putting the assumption’s Roman numeral above the equality. Very compact way to get everything in a single line.
November 25, 2024 at 8:20 PM
Variations on a theme
November 24, 2024 at 2:06 AM
Haven’t dug into this in depth, but cool to see AISI using LAB-bench for their pre-deployment testing of Sonnet cdn.prod.website-files.com/663bd486c5e4...
November 23, 2024 at 2:00 AM
A patient this morning asked me if I could get him all of the electrolyte lab value data from all of the other patients on the floor,
“anonymized, of course, if necessary,” so that he could build a database to study the “natural variation in ratios of electrolytes.”
November 22, 2024 at 2:15 PM
I’ve just seen some shocking data
November 22, 2024 at 2:13 AM
After a mildly frightening-looking spring-loaded applicator for the needle, second pic is basically how I look now
November 19, 2024 at 8:44 PM