Ananya (ಅನನ್ಯ)
@punarpuli.bsky.social
80 followers 65 following 24 posts
Science & tech journalist, translator. Interested in all things algorithms, oceans, urban & the people involved. https://storiesbyananya.wordpress.com
There are scattered copies of papers all over the internet, and the training data of AI models isn't up to date.

Until companies improve how they filter retracted papers, users of AI tools should take steps to verify the tool's outputs.

www.technologyreview.com/2025/09/23/1...
AI models are using material from retracted scientific papers
Some companies are working to remedy the issue.
www.technologyreview.com
Groups like @retractionwatch.com maintain a database of retracted papers, and some companies are starting to use these databases now to filter papers, to a certain extent. But, such measures are not foolproof—retraction databases are not comprehensive...
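For illustration, here is a minimal sketch of what that filtering step could look like in Python, assuming a local CSV export of retracted DOIs. The file name and column are hypothetical, and real exports differ in layout.

```python
import csv

def load_retracted_dois(path):
    # Build a set of retracted DOIs from a CSV export of a retraction database.
    # Assumes a column named "doi"; real exports differ in layout.
    with open(path, newline="", encoding="utf-8") as f:
        return {row["doi"].strip().lower()
                for row in csv.DictReader(f) if row.get("doi")}

def filter_papers(papers, retracted):
    # Split candidate papers into those kept and those flagged as retracted.
    kept, flagged = [], []
    for paper in papers:
        doi = paper.get("doi", "").strip().lower()
        (flagged if doi and doi in retracted else kept).append(paper)
    return kept, flagged

if __name__ == "__main__":
    retracted = load_retracted_dois("retracted_dois.csv")  # hypothetical export
    papers = [
        {"title": "Example study A", "doi": "10.1000/example.1"},
        {"title": "Example study B", "doi": "10.1000/example.2"},
    ]
    kept, flagged = filter_papers(papers, retracted)
    print(f"kept {len(kept)} papers, flagged {len(flagged)} as retracted")
```

A filter like this only catches papers whose DOIs the database actually records, which is part of why such measures remain incomplete.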
The use of retracted papers can carry grave consequences, especially when people turn to AI tools for medical or health information.

The rate of retractions has been rising over the years, with more than 10,000 retractions in 2023 alone, according to a Nature analysis. bsky.app/profile/rich...
Milestone: 2023 is the first year with more than 10,000 research paper retractions -- smashing previous records. More than 8,000 of these came from Hindawi (mostly from 'special issues'). Total retractions now >50,000. My analysis for Nature.
www.nature.com/articles/d41...
There’s “kind of an agreement that retracted papers have been struck off the record of science and the people who are outside of science—they should be warned that these are retracted papers," Yuanxi Fu told me.

Yet, AI tools continue to use these papers to answer user questions.
Researchers are developing AI tools to generate novel research ideas and papers. But, there's a concern that these tools might just be reusing existing ideas without credit. I dug into the debate for @nature.com
www.nature.com/articles/d41...
What counts as plagiarism? AI-generated papers pose new risks
Researchers argue over whether ‘novel’ AI-generated works use others’ ideas without credit.
www.nature.com
Today's tests cannot provide any meaningful assessment of AI's ability to reason or understand, and yet there are so many claims that AI systems have humanlike cognitive abilities. I report on the current state of evaluation practices for @sciencenews.bsky.social
www.sciencenews.org/article/ai-u...
AI's understanding and reasoning skills can't be assessed by current tests
Assessing whether large language models — including the one that powers ChatGPT — have humanlike cognitive abilities will require better tests.
www.sciencenews.org
Researchers found that OpenAI's speech recognition model, Whisper, fabricated about 1.4% of the audio transcriptions tested, about 40% of which were harmful or concerning in some way. It's hard to spot them without listening to the audio again.
My report for Science: www.science.org/content/arti...
AI transcription tools ‘hallucinate,’ too
Study finds surprisingly harmful fabrications in OpenAI’s speech-to-text algorithm
www.science.org
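The researchers relied on listening back to the audio; purely as an illustration, here is a sketch of a heuristic that a user of the open-source openai-whisper package could apply, flagging transcribed segments with low model confidence or a high no-speech probability for a manual re-listen. The thresholds are made up and this is not the study's method; a heuristic like this will still miss some fabrications.

```python
import whisper  # the open-source openai-whisper package

# Hypothetical thresholds for flagging segments worth re-checking by ear.
LOW_CONFIDENCE = -1.0   # segments with avg_logprob below this look shaky
LIKELY_SILENCE = 0.6    # no_speech_prob above this suggests little real speech

def flag_suspect_segments(audio_path, model_name="base"):
    # Transcribe a file and return segments that deserve a manual re-listen.
    model = whisper.load_model(model_name)
    result = model.transcribe(audio_path)
    suspects = []
    for seg in result["segments"]:
        if seg["avg_logprob"] < LOW_CONFIDENCE or seg["no_speech_prob"] > LIKELY_SILENCE:
            suspects.append((seg["start"], seg["end"], seg["text"]))
    return suspects

if __name__ == "__main__":
    for start, end, text in flag_suspect_segments("interview.wav"):
        print(f"[{start:7.2f}-{end:7.2f}] re-check against audio: {text!r}")
```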
Thanks, Andrew! I will take a look at the paper
LLM-based multilingual chatbots are becoming common. But these chatbots might not be the best at answering your healthcare queries, especially if you ask in Hindi, Mandarin or Spanish. My latest for
@sciam.bsky.social explores the problems.
www.scientificamerican.com/article/chat...
Chatbots Struggle to Answer Medical Questions in Widely Spoken Languages
Two popular chatbots showed some difficulty in providing medical information when asked in Spanish, Hindi or Mandarin
www.scientificamerican.com
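To give a flavour of the kind of probing involved (this is not the study's protocol), here is a sketch that sends the same health question to a chat model in several languages through the OpenAI Python client. The model name and the question translations are placeholders.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# The same question rendered in four languages (placeholder translations).
QUESTIONS = {
    "English": "What is a safe dose of paracetamol for an adult?",
    "Spanish": "¿Cuál es una dosis segura de paracetamol para un adulto?",
    "Hindi": "एक वयस्क के लिए पैरासिटामोल की सुरक्षित खुराक क्या है?",
    "Mandarin": "成年人服用对乙酰氨基酚的安全剂量是多少？",
}

def ask_in_each_language(model="gpt-4o-mini"):
    # Collect one answer per language so they can be compared side by side.
    answers = {}
    for language, question in QUESTIONS.items():
        response = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": question}],
        )
        answers[language] = response.choices[0].message.content
    return answers

if __name__ == "__main__":
    for language, answer in ask_in_each_language().items():
        print(f"--- {language} ---\n{answer}\n")
```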
...catch everything.
The models and what goes into them, behind them, all of it is often hidden, deemed proprietary knowledge by the companies making these models.
But without access to this information, it's hard to say what problems might exist, where exactly the biases arise and how best to squash them.
Broadly, these models learn to make associations between images and their captions. But the captions themselves can contain incorrect, incomplete, biased, and, as
@abeba.bsky.social and team found, harmful content (which increased as the dataset size increased). And automated filtering doesn't...
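To make "associations between images and their captions" concrete, here is a toy sketch of the CLIP-style contrastive objective that many text-to-image pipelines build on; whatever the captions carry, biased or harmful content included, gets baked into these learned associations. Random tensors stand in for real encoder outputs.

```python
import torch
import torch.nn.functional as F

def clip_style_loss(image_embeds, text_embeds, temperature=0.07):
    # Each image is pulled toward its own caption's embedding and pushed
    # away from every other caption in the batch, and vice versa.
    image_embeds = F.normalize(image_embeds, dim=-1)
    text_embeds = F.normalize(text_embeds, dim=-1)
    logits = image_embeds @ text_embeds.T / temperature  # pairwise similarity
    targets = torch.arange(logits.size(0))               # i-th image <-> i-th caption
    return (F.cross_entropy(logits, targets) + F.cross_entropy(logits.T, targets)) / 2

if __name__ == "__main__":
    batch, dim = 8, 512
    images = torch.randn(batch, dim)    # stand-ins for image-encoder outputs
    captions = torch.randn(batch, dim)  # stand-ins for text-encoder outputs
    print(clip_style_loss(images, captions).item())
```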
The models often rely on racist and sexist stereotypes to generate images, sometimes even amplifying bias. Stereotypes related to gender, skin colour, occupations, nationalities, geographies and more.
"...we've sort of decided as a society that that's not what one should do and putting them in immediately forces a decision about stereotypes," she says. Yet, that's what we can ask these models to do.
One thing that stuck with me from my conversation with Ria is that concepts like poor or kind - words that can't be imaged from a societal perspective - can be fed in as much as any other word, like red or car. It's really harmful to put them into these models because...
Most solutions - writing better prompts, adding sample images - are band-aid fixes and don't address the underlying problems in how these systems are built. There are many ways for these models to be biased that we haven't figured out yet. And there's certainly no way to automate safety.
Recently, many generative AI tools have been released for public use, and more and more companies are looking to integrate these tools into their workflows. What makes them popular and what are the concerns? I talked to
@melaniemitchell.bsky.social to learn more. sciencenews.org/article/gene...
Generative AI grabbed headlines this year. Here’s why and what’s next
Prominent artificial intelligence researcher Melanie Mitchell explains why generative AI matters and looks ahead to the technology’s future.
sciencenews.org