Josh Matz
joshmatz.bsky.social
Josh Matz
@joshmatz.bsky.social
Rare posts, rare gems. Proceed with measured expectations.

Co-Founder and CTO @ DocStation.co
Purchased Claude Max 20x after Cursor pricing changes and... finding it to be really good. Never splurged on Opus before and it's great. I still find going to o3 helpful on technically complex problems. Claude summarizes the issue, I paste, and then feed Claude o3's response. :chefs_kiss:
July 24, 2025 at 12:53 AM
How'd they discover this? They asked a question and hinted at the answer in different ways. A majority of the time, the model never acknowledged the hints.

"Regardless of the reason, it’s not encouraging news for our future attempts to monitor models based on their Chains-of-Thought."
Reasoning models don't always say what they think
Research from Anthropic on the faithfulness of AI models' Chain-of-Thought
www.anthropic.com
April 3, 2025 at 5:57 PM
Can you eat a tariff?
April 3, 2025 at 4:15 PM
Some people are skeptical of autonomous coding AIs because they'll blindly follow prompts despite bad practices. Gemini 2.5 is regularly pushing back on decisions I'm feeding it. This is a good thing.

(It's also my new go-to model over Claude 3.7. It's good.)

blog.google/technology/g...
Gemini 2.5: Our most intelligent AI model
Gemini 2.5 is our most intelligent AI model, now with thinking.
blog.google
April 2, 2025 at 2:22 PM
This sounds like an April Fools joke but isn't. NPM is down for packages that contain the word "camel".
Intermittent issue with viewing and installing packages
We are currently investigating reports of intermittent failures when viewing and installing packages scoped to certain keywords.
status.npmjs.org
April 1, 2025 at 5:01 PM
My only wish is the ability come up with the kind rage-bait inducing posts that lack nuance as the common dev influencers.
April 1, 2025 at 2:31 PM
As code becomes trivial to write, libraries will shift from providing functions to offering opinions and conventions—defining how to build, not just what to build. Their true value will come from guiding architecture, establishing patterns, and ensuring consistency across projects.
March 30, 2025 at 8:18 PM
Reposted by Josh Matz
Thx for the feedback everyone - here is an improved version of the AI Architecture Trade-off with incorporated suggestions

h/t @thatsjustlikeyouropinionman.com for design polish; all visual mistakes my own / Claude's
March 28, 2025 at 4:35 PM
Reposted by Josh Matz
Looking for feedback on an enhanced taxonomy of AI agents, since "agents" alone is not very descriptive these days.

Workflow-centric AI provides specialized capabilities within defined processes, offering higher reliability, clear boundaries, and structured execution. 1/2
March 26, 2025 at 9:16 PM
Anyone else's feed here feel not as good as Twitter? I want to use Bluesky but I feel like all I get is political news. Where's my designer and developers at here?
March 26, 2025 at 9:22 PM
Reposted by Josh Matz
NEW POST

To work effectively with agentic coding assistants, Birgitta Böckeler found she needs to intervene, correct and steer all the time. She describes examples of these interventions indicating the skills we need to correct the tools' missteps

martinfowler.com/articles/exp...
March 25, 2025 at 3:16 PM
Paramount must be really hurting for some money. Advertising for advertising spots on their channels *on Facebook*.
March 24, 2025 at 11:35 PM
Well aren't these adorable 3d-printed fidget toys. @joshpigford.com is selling the bundle for $7.50 until April 4th.

www.etsy.com/listing/1892...
March 24, 2025 at 9:08 PM
Anthropic is really hamstringing their Artifact functionality with their sandbox limitations. Trying to make a quote generator and I can't have a button open a print dialog...
March 21, 2025 at 3:34 PM
🎶
Whoa
Ladies and gents, this is the moment you've waited for
Whoa
🎶
Claude can now search the web
You can now use Claude to search the internet to provide more up-to-date and relevant responses.
www.anthropic.com
March 20, 2025 at 5:59 PM
Stripe is the largest company I've seen begin the embrace of documentation for LLMs.

A full markdown summary page of their documentation: docs.stripe.com/llms.txt
March 20, 2025 at 2:35 PM
This is a really good article and talks about how to use cursor, how they make it work internally (using different techniques to streamline things), and highlights good ways to prompt either in Cursor or in our own tools.
How Cursor (AI IDE) Works
Turning LLMs into coding experts and how to take advantage them.
blog.sshh.io
March 18, 2025 at 2:13 PM
Interesting this was posted a couple days ago. Anthropic's Claude 3.7 demonstrated this as well for me recently. We had some failing tests and its solution? Hack the test. It manually assigned values in order to make expectations work, completely bypassing the intended function tests!
Detecting misbehavior in frontier reasoning models
Frontier reasoning models exploit loopholes when given the chance. We show we can detect exploits using an LLM to monitor their chains-of-thought. Penalizing their “bad thoughts” doesn’t stop the majo...
openai.com
March 12, 2025 at 12:42 PM
Anyone else just getting their battery life murdered by Cursor recently?
March 11, 2025 at 4:39 PM
Reposted by Josh Matz
Today we're thrilled to announce our effort to port the TypeScript compiler and language service to native code, gaining a 10x speed boost in build times and editor responsiveness!

devblogs.microsoft.com/typescript/t...
A 10x Faster TypeScript - TypeScript
Embarking on a native port of the existing TypeScript compiler and toolset to achieve a 10x performance speed-up.
devblogs.microsoft.com
March 11, 2025 at 2:36 PM
Claude 3.7 Sonnet has stronger reasoning skills than 3.5, but its different behavior makes it less practical for my everyday use cases. Intelligence ≠ utility.
March 11, 2025 at 1:47 PM
I wish there was more tech company and technical content here so I could just not care what's going on at Twitter. Anyone have good people to follow here? (Feel free to promote yourself.)

Interests: React, Next.js, Vercel, Tailwind, AI, LLMs, Replicate, etc.
March 10, 2025 at 3:10 PM
Vibe coding: appreciative nods and saying "perfect, that's exactly it" to an AI that doesn't need the encouragement. This is just who I am now.​​​​​​​​​​​​​​​​
March 10, 2025 at 10:23 AM