Prolific
@joinprolific.bsky.social
270 followers 1.4K following 160 posts
The ultimate human data platform to power world-changing AI and research. 🔗 www.prolific.com
Pinned
joinprolific.bsky.social
🔍 Audience Finder is here.

Instantly check if the participants you need for your research are on Prolific. No set-up required.

Just type who you’re looking for (“Bilingual psychologists who speak Mandarin”) and see live results in seconds.

➡️ www.prolific.com/audience-fin... | #AcademicSky
joinprolific.bsky.social
AI benchmarks show models like GPT-5 might ace scientific reasoning, or Gemini and Claude adapt to new concepts. But which LLMs offer the best user experience?

@tomsguide.com explains how Prolific's AI leaderboard HUMAINE is a big step forward.

#ArtificialIntelligence #MachineLearning #LLM
27 AI models were ranked by the public and ChatGPT came 8th — these are the models that beat it
A surprising set of results for the big AI chatbots
www.tomsguide.com
Reposted by Prolific
andrewgordon.bsky.social
Super excited to share this work with the world!

We wanted to put human preference at the heart of our AI model leaderboard, and to do so in a principled and representative manner.

The result: HUMAINE

Check out the data here and get in touch with any questions: huggingface.co/spaces/Proli...
joinprolific.bsky.social
The Probability of Practical Superiority (PPS) Comparison Matrix is available under "Model Comparison" for estimating how likely one model is to outperform another in real-world usage.

Values >55% suggest one model meaningfully outperforms the other, beyond random chance.
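
A rough illustration of how a value like this could be read. The post does not say how PPS is computed, so this minimal sketch assumes it is the share of paired score draws in which one model beats the other. The function names, the simulated scores, and the 0.55 default (taken from the >55% guideline above) are all hypothetical, not Prolific's implementation.

```python
import random


def probability_of_superiority(draws_a, draws_b):
    """Share of paired draws in which model A's score exceeds model B's.

    Assumption: each list holds one score draw per simulated comparison.
    """
    if len(draws_a) != len(draws_b) or not draws_a:
        raise ValueError("need two equal-length, non-empty lists of draws")
    wins = sum(1 for a, b in zip(draws_a, draws_b) if a > b)
    return wins / len(draws_a)


def meaningfully_better(pps, threshold=0.55):
    """Apply the >55% reading from the post to a PPS value."""
    return pps > threshold


# Hypothetical score draws for two models (not real HUMAINE data).
rng = random.Random(0)
draws_a = [rng.gauss(0.3, 1.0) for _ in range(10_000)]
draws_b = [rng.gauss(0.0, 1.0) for _ in range(10_000)]

pps = probability_of_superiority(draws_a, draws_b)
print(f"PPS(A over B) = {pps:.1%}, meaningful: {meaningfully_better(pps)}")
```

A value near 50% would mean the two models are hard to tell apart, which is why the threshold sits somewhat above it.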
joinprolific.bsky.social
HUMAINE uses rigorous methodology to ensure representative and reliable results.

Our methodology includes:

- Comparative assessment
- Statistically rigorous hierarchical modeling
- Four core dimensions of user experience and overall winner
- Commitment to demographic representation
joinprolific.bsky.social
🏆 Gemini-2.5-Pro takes the #1 spot decisively, with DeepSeek-V3 emerging as the surprise runner-up.

⚖️ Age groups show highest disagreement on model rankings.

📊 27 models tested across 100K+ human comparisons - one of the largest public AI evaluations to date.
joinprolific.bsky.social
Our human-centered evaluation and leaderboard show how people actually experience AI models in the real world.

Rich, multi-dimensional feedback from a diverse, representative sample of real users from our pool - revealing not just which model they prefer, but why.
joinprolific.bsky.social
Introducing HUMAINE: the LLM benchmark that puts real human experience first 🎯

21,352 human evaluators. 27 models. 22 demographic groups. 5 evaluation dimensions.

In partnership with @hf.co. See insights: huggingface.co/spaces/Proli...

More in 🧵 #ArtificialIntelligence #MLSky #LLM #Developer
Visual of Prolific's human-centered leaderboard for Large Language Models (LLMs). There are four visible bar graphs with the labels gemini-2.5-pro, deepseek-chat-v3-0324, magistral-medium-2506, and grok-4. The bars appear on the branded Prolific blue, pink, and white background. The title reads "The benchmark with human experience at the center."
joinprolific.bsky.social
We've also launched new features like regionally-stratified representative samples for the US and UK, upgraded AI-powered audience search, and more.

See everything that's new here: www.prolific.com/resources/wh...
What's New at Prolific: regional rep samples and more | Prolific
Discover Prolific's September 2025 updates: regional rep samples, new skilled participants, and more
www.prolific.com
joinprolific.bsky.social
New: Periodic identity confirmation checks with video recording to ensure authentic, human feedback in your academic and industry research.

Along with bank-grade ID verification, this ensures participants are who they say they are—not bots, duplicates, or fraud.

#AcademicSky #ResearchIntegrity
joinprolific.bsky.social
AI/ML researchers in San Francisco, we have a few spaces left at our community event 🇺🇸

Join us Sep 11 for this month’s theme: Researchers building real-world AI tooling. With speakers Prolific CEO Phelim Bradley and Jiaxin Pei, Postdoc at @stanfordhai.bsky.social. 👇

#MLSky #ArtificialIntelligence
Prolific Meetup #1 – Researchers Building Real-World AI Tooling · Luma
About the Event Prolific is bringing the AI community together in San Francisco to explore the evolving role of humans in post-training, alignment, evaluation,…
luma.com
joinprolific.bsky.social
👉 Model-based screening during participant onboarding through Protocol, our proprietary data protection system, where we run internal models to flag LLM-generated responses with high confidence.

More information and tips in this transparent session: youtu.be/MBo50M6etCk
Keeping Research Real in the Age of AI: LLM Detection, Data Quality, and Fraud Prevention | Prolific
YouTube video by Prolific
youtu.be
joinprolific.bsky.social
👉 Bi-weekly data quality audits where a specialized team of human reviewers benchmarks data quality across honesty, transparency, verbosity, and attention.

👉 Authenticity checks, our own tool that detects LLM-generated responses using advanced behavioral analysis—with 98.7% accuracy.
joinprolific.bsky.social
In the age of AI, authenticity matters more than ever in academic research. LLM-generated responses, false demographics, and more pose a threat.

VP of Product Sara details tools and internal measures we’ve implemented to help maintain research integrity 🧵

#AcademicSky #ResearchIntegrity
joinprolific.bsky.social
Absolutely. We continuously adapt our systems to ensure high-quality data. Our LLM detection tool uses advanced behavioral analysis (98.7% accuracy), our data protection system also catches LLM use at the onboarding stage, and more. Please do report low-quality IDs in-app and get in touch with feedback.
joinprolific.bsky.social
Interesting findings around how agents compete with humans for partnerships by @yaominj.bsky.social, @levinbrinkmann.bsky.social, Anne-Marie Nussberger, Ivan Soraperra, @jfbonnefon.bsky.social, @iyadrahwan.bsky.social, @mpib-berlin.bsky.social.

Taskers sourced via Prolific.

#AI #MLSky #AcademicSky
joinprolific.bsky.social
"Through three experiments (N = 975), we found that bots, though more prosocial than humans and linguistically distinguishable, were not selected preferentially when their identity was hidden. Instead, humans misattributed bots’ behavior to humans and vice versa."

arxiv.org/abs/2507.13524
Reposted by Prolific
New preprint -- we (me, @klempert.bsky.social, Dave Wolk, and Joe Kable) examined whether characteristic aging-related changes in cognition and personality show up on six online platforms (3 crowdsourcing sites -- MTurk, CloudResearch Toolkit, & Prolific, and 3 panels). osf.io/preprints/ps... 1/
OSF
osf.io
Reposted by Prolific
andrewgordon.bsky.social
Sharing a piece I wrote for The AI Journal on who should govern AI: aijourn.com/a-fine-balan...

Our recent @joinprolific.bsky.social polling shows 69.7% of people think AI investment will primarily benefit corporations not the public. They're probably right. [1/3]

#AI #AIGovernance #TechPolicy
joinprolific.bsky.social
Great analysis and always a pleasure to support.
joinprolific.bsky.social
Original thread: bsky.app/profile/kobi...

Congratulations Kobi and team. Pleased to have supported this project.
kobihackenburg.bsky.social
Today (w/ @ox.ac.uk @stanford @MIT @LSE) we’re sharing the results of the largest AI persuasion experiments to date: 76k participants, 19 LLMs, 707 political issues.

We examine “levers” of AI persuasion: model scale, post-training, prompting, personalization, & more!

🧵:
joinprolific.bsky.social
The largest investigation of AI persuasion with 76,977 participants across 3 large-scale experiments. Excellent work by @ox.ac.uk PhD candidate @kobihackenburg.bsky.social.

19 LLMs. 707 political issues. 466,769 fact-checkable claims evaluated.

arxiv.org/abs/2507.13919

#AcademicSky #MLSky #PhDSky
A still image of the PhD candidate Kobi Hackenburg's AI persuasion research paper titled "The Levers of Political Persuasion with Conversational AI."
joinprolific.bsky.social
Sorry to hear this, Jim. We're keen to investigate this jump. You can report it quickly within the platform ("Action" tab), but alternatively, could you submit a data quality report ASAP if you haven't already, so we can investigate the IDs more urgently? Thank you! forms.prolific.com/to/Zv6w7ZWt
joinprolific.bsky.social
4) Report data quality concerns in app

If you do experience data quality issues, we want to investigate quickly. To report them easily, click the "Action" dropdown on the submissions page to flag any concerns.

Full list of updates: www.prolific.com/resources/wh...