Stefan Reinalter
molecularmusing.bsky.social
Founder of Molecular Matters • C++ & low-level programming • Created Live++ (@liveplusplus.bsky.social)
https://liveplusplus.tech
Not 100% the same, but almost.
If the platform supports dumping user-mode *and* kernel state, then yes, this would work.

It's just not possible to alter/manipulate kernel state during a replay to rewind to an earlier point in time.
February 13, 2026 at 12:57 PM
This is a trade-off between low-overhead, always-on recording & replay across all cores (= Echo) vs. instruction-level tracing and single-core emulation like TTD.
While the latter can detect these races, it has incredibly high overhead and trace size, and can't replay kernel state (= audio & GPU).
February 13, 2026 at 12:55 PM
If your process has a true race, which *does* lead to divergent behaviour, Echo will detect and report this, but cannot pinpoint exactly where that happens - just that there's a race, somewhere.
February 13, 2026 at 12:53 PM
Generally speaking, sync. primitives will replay in the exact same order. So if during recording thread 2 locked mutex C first, then the same will happen during replay.
If your process has a benign race somewhere, which does not lead to divergent behaviour, then Echo won't be able to tell you.
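
To make the ordering guarantee concrete, here is a minimal sketch of how a recorded acquisition order could gate lock acquisition during replay. All names (`LockEvent`, `LockOrderLog`) are hypothetical and this is not Echo's actual implementation; it only illustrates the idea, and it is single-threaded for brevity, so a real version would need its own synchronization.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical types for illustration only, not Echo's API.
struct LockEvent {
    std::uint32_t threadId; // thread that acquired the primitive
    std::uint32_t mutexId;  // which primitive was acquired
};

class LockOrderLog {
public:
    // Recording: append each successful acquisition in global order.
    void record(std::uint32_t threadId, std::uint32_t mutexId) {
        events_.push_back({threadId, mutexId});
    }

    // Replay: a thread may acquire a mutex only if that exact
    // (thread, mutex) pair is the next recorded event.
    bool mayAcquire(std::uint32_t threadId, std::uint32_t mutexId) const {
        if (cursor_ >= events_.size()) {
            return false;
        }
        const LockEvent& next = events_[cursor_];
        return next.threadId == threadId && next.mutexId == mutexId;
    }

    // Replay: call after the acquisition happened to move to the next event.
    void advance() {
        if (cursor_ < events_.size()) {
            ++cursor_;
        }
    }

private:
    std::vector<LockEvent> events_;
    std::size_t cursor_ = 0;
};
```

During replay, a thread that is not next in line would wait until `mayAcquire` returns true for it, which is what forces thread 2 to lock mutex C first again.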
February 13, 2026 at 12:52 PM
This can't rule out that I'm doing something stupid somewhere, but should give good test coverage.
It's also the reason why I'm starting with PS5, and not Windows, because the API surface is known and limited.
February 13, 2026 at 9:04 AM
Absolutely, people need to trust this thing.

I'm developing everything with a complete test-suite in the background, which automatically verifies each and every API interaction, checking all data, return values, etc. against their recorded counterpart.
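
As a rough illustration of that kind of check, the sketch below compares a recorded stream of API calls against the replayed one and reports the first index where they differ. The `ApiCall` layout and `firstMismatch` are assumptions made up for this example; the real test-suite is not public and certainly looks different.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical record of one API interaction.
struct ApiCall {
    std::string name;               // which API was called
    std::vector<std::uint8_t> data; // bytes passed in or returned
    std::int64_t returnValue;       // value the call returned

    bool operator==(const ApiCall& other) const {
        return name == other.name && data == other.data &&
               returnValue == other.returnValue;
    }
};

// Sentinel meaning "both streams are identical".
constexpr std::size_t kNoMismatch = static_cast<std::size_t>(-1);

// Returns the index of the first call that differs between recording and
// replay, or kNoMismatch. A missing or extra call counts as a mismatch at
// the index where the shorter stream ends.
std::size_t firstMismatch(const std::vector<ApiCall>& recorded,
                          const std::vector<ApiCall>& replayed) {
    const std::size_t common = std::min(recorded.size(), replayed.size());
    for (std::size_t i = 0; i < common; ++i) {
        if (!(recorded[i] == replayed[i])) {
            return i;
        }
    }
    return recorded.size() == replayed.size() ? kNoMismatch : common;
}
```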
February 13, 2026 at 9:03 AM
Unfortunately, not with this approach. It replays everything using real APIs, which manipulate kernel state, so that part can never be rewound.
February 13, 2026 at 8:52 AM
Thanks Alex, truly means a lot!
February 13, 2026 at 8:51 AM
Reposting this for the European crowd.
Please repost and share among friends, peers, coworkers - I'm trying to gauge if this has merit to be continued or not.
I've been working on something new since August 2025 and have decided to finally spill the beans:
Introducing "Project Echo", a deterministic Record & Replay tool for PS5:
www.youtube.com/watch?v=K_sd...

Please read the video description and let me know your brutally honest feedback!
"Project Echo" early pre-alpha footage
YouTube video by Molecular Matters
February 13, 2026 at 8:11 AM
1) Rewind is not possible with that approach, unfortunately.
2) Yes, assuming that deterministic CPU code will produce the exact same data streams and commands for the GPU to execute.
February 13, 2026 at 6:42 AM
Thanks Arseny!

And you're correct on all counts, you clearly understood what it is!
February 13, 2026 at 6:40 AM
Yes, this captures everything that's non-deterministic, including network.
In a replay, the data (e.g. from a network request) will be there at the same time in the same frame as during the recording.
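
One way to picture this is a replayer that hands recorded data back in exactly the frame it originally arrived in. The following is a minimal sketch under assumed names (`RecordedInput`, `InputReplayer`); it presumes the recording is sorted by frame and that `take` is called once per frame in increasing order. It says nothing about how Echo actually stores or delivers data.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical: one piece of non-deterministic data (e.g. a network
// response) tagged with the frame in which it arrived during recording.
struct RecordedInput {
    std::uint64_t frame;
    std::vector<std::uint8_t> payload;
};

class InputReplayer {
public:
    // Assumes `inputs` is sorted by frame, as it would be if appended
    // in arrival order while recording.
    explicit InputReplayer(std::vector<RecordedInput> inputs)
        : inputs_(std::move(inputs)) {}

    // Returns every payload recorded for `frame`, in recorded order.
    // Must be called once per frame, with increasing frame numbers.
    std::vector<std::vector<std::uint8_t>> take(std::uint64_t frame) {
        std::vector<std::vector<std::uint8_t>> out;
        while (cursor_ < inputs_.size() && inputs_[cursor_].frame == frame) {
            out.push_back(inputs_[cursor_].payload);
            ++cursor_;
        }
        return out;
    }

private:
    std::vector<RecordedInput> inputs_;
    std::size_t cursor_ = 0;
};
```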
February 13, 2026 at 6:30 AM
Thanks!

Still a long way to go until I can easily record a UE5 game. Which kind of reminds me of how Live++ started :).
February 12, 2026 at 10:14 PM
All correct.
February 12, 2026 at 10:11 PM
Oh, and the 27MB contain the raw PCM data for the music. If you leave that out, it's much much smaller, in the high KB range for that recording (uncompressed).
February 12, 2026 at 10:10 PM
I updated the video description so that it hopefully answers most questions I received so far. I'm not familiar with gfxreconstruct unfortunately.
February 12, 2026 at 9:30 PM
Correct.
The data is directly streamed to the dev PC and gets compressed afterwards (if you want that). Data is super compressible since most of it is very redundant.
E.g. the trace from the video is 27MB, which compresses down to 7MB. The recording used 5 threads.
February 12, 2026 at 9:29 PM
That's the end goal.
I started with PS5 first because its API surface is known and it doesn't have crazy 3rd-party stuff, CrowdStrike, ...
February 12, 2026 at 9:16 PM
Thanks for the feedback!
The main idea here (and this is why the approach is fundamentally different to others) is that recording is so low-overhead that it should be "always on".
So in my head, every bug ticket now has a recording attached, which is fully debuggable during replay.
February 12, 2026 at 8:56 PM
Yes, because I also patch functions from system libraries for which I don't have symbols, not even the executable that's loaded.
Those are figured out at runtime by binary analysis. It's fast enough, and results for sys libs are cached.
February 12, 2026 at 8:23 PM
They are captured directly in case they have a lock prefix.
For instructions without a lock prefix, you cannot distinguish between atomics and ordinary writes, so I will probably have to resort to a compiler switch for this.
It's complicated 😀
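
For the lock-prefix case, the check itself can be sketched in a few lines: on x86-64, LOCK is the legacy prefix byte 0xF0 and appears before any REX prefix and the opcode. The function name `hasLockPrefix` is made up for this example, and a real binary analysis pass would need a proper instruction decoder to find instruction boundaries first; this sketch assumes `code` already points at the start of one instruction.

```cpp
#include <cstddef>
#include <cstdint>

// Returns true if the legacy prefixes at the start of a single x86-64
// instruction include LOCK (0xF0). Hypothetical helper, illustration only.
bool hasLockPrefix(const std::uint8_t* code, std::size_t size) {
    for (std::size_t i = 0; i < size; ++i) {
        switch (code[i]) {
            case 0xF0: // LOCK
                return true;
            case 0xF2: case 0xF3:            // REPNE / REP
            case 0x26: case 0x2E: case 0x36: // ES / CS / SS override
            case 0x3E: case 0x64: case 0x65: // DS / FS / GS override
            case 0x66: case 0x67:            // operand / address size
                continue; // another legacy prefix, keep scanning
            default:
                return false; // reached REX or the opcode, no LOCK seen
        }
    }
    return false;
}
```

This also shows why the unprefixed case is the hard one: an aligned plain store is already atomic on x86, so an atomic store and an ordinary write can encode to the same `mov`, and `xchg` with a memory operand is implicitly locked without carrying the prefix at all.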
February 12, 2026 at 8:22 PM
No, at runtime.
February 12, 2026 at 8:06 PM
I was thinking about offering two modes eventually. One is what I have now, the other is full instruction-level tracing for finding such bugs as well, but certainly not in the first version.
That stuff is hard.
February 12, 2026 at 8:06 PM
That won't be captured or detected directly, but if it leads to non-deterministic behaviour down the road, that divergence will be detected and reported.
You could then use ThreadSanitizer to dig in.
February 12, 2026 at 7:59 PM
Please ask any questions you might have once you have recovered.
I'm serious.
I'm trying to gauge if there's merit to continue working on this.
February 12, 2026 at 7:47 PM