Bartosz Mikulski
mikulskibartosz.bsky.social
Stop AI Hallucinations in Fintech
Risk Audits + Custom Roadmaps for customer-facing AI
☑️ Creator: laoshu.ai OSS citation checker
📧 Hallucination fire-drill → mikulskibartosz.name
How is this different from proper prompt engineering?
July 15, 2025 at 5:47 AM
Every library should have cursor rules.

I don't know what we do when the rule files fill up the entire context, but until then, every library should have them.
July 15, 2025 at 5:36 AM
Transcripts don't hallucinate
July 15, 2025 at 5:32 AM
I'm building a free, open-source tool for checking citations in AI-generated output. False citations should be a solvable problem. laoshu.ai
Laoshu.ai - Automate AI citation verification
Laoshu automatically detects fake AI citations. Expose AI citation fakes instantly with our open-source verification tool.
laoshu.ai
July 15, 2025 at 5:27 AM
It's like building a house on sand and then blaming the roof for leaking.

If you're building RAG systems, "The Map Is Not The Territory – Multimodal Retrieval" video is a masterclass.

Don't miss it.

www.youtube.com/watch?v=hf9B...
The Map Is Not The Territory – Multimodal Retrieval
YouTube video by Hamel Husain
www.youtube.com
July 15, 2025 at 2:31 AM
Here's what matters first:
0. The source data you feed the system
1. How you preprocess and chunk that data
2. The indexing method you choose

If any of that's sloppy, incomplete, or mismatched with how your users query, AI is going to hallucinate. And no clever prompt engineering will save you.
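The preprocessing and chunking step above can be sketched in a few lines. This is a minimal illustration, not any particular library's implementation; the chunk size and overlap values are arbitrary placeholders you would tune against how your users actually query.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping fixed-size chunks before indexing.

    Overlap keeps sentences that straddle a chunk boundary retrievable
    from both neighboring chunks.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks
```

Real pipelines usually split on semantic boundaries (paragraphs, headings) rather than raw character counts, which is exactly the kind of mismatch with user queries the post warns about.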
July 15, 2025 at 2:31 AM
Expect to spend 60–80% of your time reading AI responses and judging what’s good or bad.

No automation will save you here. But a clear process will.

In my new article, I break down how to debug RAG systems and LLMs.
If AI quality matters to your org, this guide will save you time and reputation.
July 14, 2025 at 5:26 AM
From the BentoML team, I would expect something more in-depth. This is just scratching the surface.
July 12, 2025 at 1:29 AM
They are all biased towards the owner's opinions. You only notice it when you disagree with those opinions.
July 12, 2025 at 1:24 AM
Would it work for the people who diagnose themselves with Reddit?
July 12, 2025 at 1:22 AM
You don't get "nerd points" for saying, "I coded frontend for an AI chatbot."
July 11, 2025 at 1:57 PM
I know a journalist who wrote over a dozen articles in the style of "I asked ChatGPT what is the best comedy/horror/romance/whatever movie. The answer shocked me."

But wait, it gets worse. Someone reads this...
July 11, 2025 at 1:41 PM
Do those keys work, or just look like keys?
July 11, 2025 at 1:37 PM
It depends on how many engineers prepare the evaluation dataset and run the tests. If the answer is zero, you can have hundreds of engineers tweaking the prompts and still not get the results you expect.
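An evaluation dataset like the one described can be as simple as questions paired with expected keywords. A minimal sketch, assuming a hypothetical `ask_model` stand-in for your actual LLM call:

```python
def ask_model(question: str) -> str:
    # Placeholder for a real LLM call; replace with your client code.
    return "Paris is the capital of France."

def evaluate(cases: list[dict]) -> float:
    """Return the fraction of cases whose answer contains every expected keyword."""
    passed = 0
    for case in cases:
        answer = ask_model(case["question"]).lower()
        if all(kw.lower() in answer for kw in case["expected_keywords"]):
            passed += 1
    return passed / len(cases)
```

Without a dataset and a loop like this, prompt tweaks are judged by gut feeling, which is the failure mode the post describes.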
July 11, 2025 at 1:33 PM