1. Memorization or
2. Priming or
3. Confirmation prompting
www.anthropic.com/research/ali...
Amateur move guys.
www.dwarkesh.com/p/ilya-sutsk...
(screenshots not chronological)
> This supports our assertion that the ceiling on LLM creativity (0.25) corresponds to the boundary between little-c and Pro-c human creative performance (Figure 6).
www.academia.edu/144621465/_T...
A PhD sitting down and just fabricating >50% of sources = career ending
arxiv.org/abs/2511.11597
It would be amusing to speak to them again.
Not the model, not the prompt - still the human.
The amount of shilling these guys do, no wonder they can’t get anything serious built.
cdn.openai.com/pdf/4a25f921...
Unfortunately, this is a societal failure. Tech didn’t invent loneliness; it offered a new way to cope with it - an empathetic echo chamber.
We are failing the kids. Others too, but mostly it’s the kids that I worry about.
www.nytimes.com/2025/11/17/o...
en.wikipedia.org/wiki/Ramamur...
But next quarter you should be terrified.
storage.googleapis.com/deepmind-med...
Actually their realization dawned a few weeks back, but these things take a little while to surface externally.
Image of tweet from bird site because I won’t link to it.
- The report is based on Claude’s logs, with no visibility into human actions that happened outside of Claude
- The claim that “80–90% of tactical work” was done by Claude, with humans merely in a strategic role, aligns curiously well with their marketing message rather than with any verified capability
When Cursor added agentic coding in 2024, adopters produced 39% more code merges, with no sign of a decrease in quality (revert rates were the same, bugs dropped) and no sign that the scope of the work shrank. papers.ssrn.com/sol3/papers....
> We assess how effectively large language models generate social media replies that remain indistinguishable from human-authored content when evaluated by automated classifiers. We employ a BERT-based binary classification model to distinguish between the two text types.
But... can they? We don’t actually know.
In our new study, we develop a Computational Turing Test.
And our findings are striking:
LLMs may be far less human-like than we think.🧵
I would be shocked if OpenAI hasn’t indexed, or isn’t already indexing, the web even as I type this.