Pedro Beltrao
@pedrobeltrao.bsky.social
Associate professor at ETH Zurich, studying the cellular consequences of genetic variation. Affiliated with the Swiss Institute of Bioinformatics and a part of the LOOP Zurich.
True :). More importantly, the models are much better at changing existing images than creating things from scratch. I tried an example where I gave it an actual experimental microscopy image and asked it to add 3 cells with chromosome segregation defects, and those are even more believable.
November 24, 2025 at 12:48 PM
Generating datasets with some appropriate variation should be harder to spot than crappy microscopy images. One big difference is how easy it is to just ask Gemini to make up some fake images. I didn't try to generate fake omics datasets but that is an interesting thought.
November 22, 2025 at 8:09 PM
I am glad some of these don't look reasonable to experts. It might be possible to use the models to adjust existing images, which might be more realistic, but hopefully expert reviewers will still spot most fakes.
November 22, 2025 at 8:01 PM
One example I tried illustrates the tension. The model can annotate the cell cycle stage in a given microscopy image (positive value) but it can also add fake cells to the same image showing specific cellular defects (very negative outcome).
November 22, 2025 at 4:48 PM
This is the key point for me as well. The concern is not the typical scientist, it is the paper mills. It seems almost trivial now to generate plausible papers at scale, and even to automate the generation of plausible hypotheses that would make the fake papers more believable.
November 22, 2025 at 4:46 PM
This is true of all training data and it is clear these companies are not respecting copyright. I haven't followed this but I think the companies are trying to argue that this is "fair use" since it is not a direct reproduction of the data. This is even worse for writers and artists.
November 22, 2025 at 12:29 PM
I haven't looked too much into it but yes, these companies are claiming to be adding watermarks that can be detected. I didn't look into ways of avoiding detection. Here is the link to the Google version of this technology: deepmind.google/models/synth...
SynthID
SynthID is a tool to watermark and identify AI-generated content, helping to foster transparency and trust in generative AI.
deepmind.google
November 21, 2025 at 8:08 PM
It is impressive. On the positive side, I tried to annotate cell cycle stage directly from images and it seems to do a good job. So the capacity to analyse and "reason" over images is clearly amazing with the obvious downside that the generations are also super realistic.
November 21, 2025 at 6:22 PM
The ease of use now is just incredible. No need to train a specific generator for a specific application, just ask directly by text exactly what you want to see. On the positive side, it also means that the power to analyse the images is there.
November 21, 2025 at 6:20 PM
Here is the actual prompt. I was trying to see how good the visual reasoning is. I presume it will actually be good at quantifying directly from images as well.
November 21, 2025 at 5:32 PM
These image generators add invisible marks to AI-generated content. Publishers need to quickly integrate the detectors into their publishing platforms to flag AI-generated images.
November 21, 2025 at 5:02 PM
Finally, the scary example: an image of a western blot with a time course experiment, staining a protein of interest, showing the increase of protein over time, and a second staining with a control antibody that does not change over time.
November 21, 2025 at 1:14 PM
The second was about drawing the EGF pathway. I also asked to create an SVG of the EGF pathway drawing which worked but with much lower quality.
November 21, 2025 at 1:14 PM
Reposted by Pedro Beltrao
The update includes metabolomic, genetic, and imaging data, plus much more, all alongside our existing comprehensive dataset

Explore what's new: www.ukbiobank.ac.uk/news/metabol...
Metabolomic, genetic, imaging data and more: new and updated UK Biobank data now available
The latest update to UK Biobank’s comprehensive dataset is available to approved researchers around the world via the Research Analysis Platform (UKB-RAP).
www.ukbiobank.ac.uk
November 20, 2025 at 9:09 AM