Conversational & Generative AI
I'll start.
One of my glorious instructions was "Make it a short, but detailed explanation". The LLM tried to obey. A human would just give me a confused look.
This won't replace your evals (that's a whole other topic I'll cover later). But while evals tell you if your prompt failed (and how often), this tells you why you're about to waste a week building test cases for instructions that don't even make sense to humans.
This isn't debugging prompts - it's debugging the lies we tell ourselves about clarity. Every prompt is like a Rorschach test of your comm skills.
Now for the uncomfortable part.
- "Why did you interpret 'brief' as 3 sentences when I meant 3 paragraphs?"
- "What assumption made you think I wanted JSON when I said 'structured'?"
- "Why did you search the web when I said 'check our records'?"
Person B reads it cold and states exactly what they'd do. What tools would they use? What would they output?
The Prompter's job: Shut up and take notes on how badly they communicate.
🎭 Step 1: The Confession Booth
Person A (the "Prompter") writes their precious prompt. Sends it verbatim—no context, no explanations, no "what I really meant was..."—to Person B.
90% of business requests for an "AI Agent" can be solved with a simple, robust workflow from Step 2.
The real skill is knowing the difference.
What's a recent 'agent' request you've had to translate back to reality?
The final boss. You give an agent a goal, a set of tools (like APIs), and the freedom to decide how to use them. It has the potential for magic. It also has the potential to go spectacularly, creatively wrong in ways you can't predict.
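To make "a goal, a set of tools, and the freedom to decide" concrete, here is a minimal sketch of an agent loop. Everything in it is hypothetical: `run_agent`, the `decide` policy (which in practice would be an LLM call), and the `"finish"` action are illustrative names, not any framework's API. The step limit is one cheap guard against the "creatively wrong" case.

```python
from typing import Callable

# Hypothetical types for illustration only.
Tool = Callable[[str], str]
# A policy looks at the goal and the history so far and returns
# (tool_name, tool_input), or ("finish", final_answer) when it is done.
Policy = Callable[[str, list[str]], tuple[str, str]]


def run_agent(goal: str, tools: dict[str, Tool], decide: Policy, max_steps: int = 5) -> str:
    """Let the policy pick tools freely until it finishes or hits the step limit."""
    history: list[str] = []
    for _ in range(max_steps):
        action, arg = decide(goal, history)
        if action == "finish":
            return arg
        if action not in tools:
            # Record the bad choice so the policy can see it on the next turn.
            history.append(f"error: unknown tool {action!r}")
            continue
        history.append(f"{action}({arg!r}) -> {tools[action](arg)}")
    return "stopped: step limit reached"
```

Notice that the order of tool calls is not written anywhere in this code. The policy decides it at runtime, which is exactly where both the magic and the unpredictability come from.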
The output of one call becomes the input for the next.
Summarize a review → Extract key complaints → Draft a reply
It’s a sequence of simple tasks that creates a powerful result. Structured, repeatable, and debuggable.
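The review chain above can be sketched in a few lines. This is only an illustration under stated assumptions: `llm` stands for whatever function sends a prompt to your model and returns its text, and `run_review_chain`, `fake_llm`, and the prompt wording are made up for the example.

```python
from typing import Callable

# Hypothetical type: any function that takes a prompt and returns the model's text.
LLM = Callable[[str], str]


def run_review_chain(review: str, llm: LLM) -> dict[str, str]:
    """Run three simple calls in sequence; each output feeds the next prompt."""
    summary = llm(f"Summarize this customer review in two sentences:\n{review}")
    complaints = llm(f"List the key complaints in this summary, one per line:\n{summary}")
    reply = llm(f"Draft a polite reply that addresses each complaint:\n{complaints}")
    # Keeping every intermediate result is what makes the chain debuggable.
    return {"summary": summary, "complaints": complaints, "reply": reply}


def fake_llm(prompt: str) -> str:
    """Stand-in model for a dry run: echoes the instruction line it was given."""
    return f"[output for: {prompt.splitlines()[0]}]"


if __name__ == "__main__":
    for step, text in run_review_chain("The app crashes on login.", fake_llm).items():
        print(f"{step}: {text}")
```

Because each step is a separate call with a visible input and output, a bad reply can be traced back to the step that produced it instead of to one giant prompt.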
Push to talk, get a response. That's it. Use it for summarizing text, classifying content, or rewriting copy. It's a single, reliable tool for a single job. Master this first.
If we are to believe Gemini, Google is clearly building for it.
We should be too.
1. A black box that leaves you shrugging your shoulders.
2. A system that snitches on itself like a guilty teenager, telling you exactly where the internal wiring is crossed.
We're all obsessed with prompt engineering and outputs while building black boxes on quicksand. The real work, the unglamorous grind that separates the demos from deployments, is building systems that can explain failure.
"I am escalating this issue, so my internal systems can be fixed."
Now, I had to know more.
What's your go-to BS detector question?
Drop it below - let's build the ultimate filter together.
This reveals independent thought. Are they just repeating the latest from Twitter, or do they have a unique, hard-won perspective?
Can they translate jargon into business value? If they can't explain it simply, they don't understand it deeply enough to implement it effectively. Bonus points if they don't use the word "synergy."
This is a pure pragmatism test. Can they scope down? Can they deliver value without a six-figure budget? A real expert knows the first step isn't building a rocket. It's seeing if the engine can even start.