https://liveplusplus.tech
If the platform supports dumping user-mode *and* kernel state, then yes, this would work.
It's just not possible to alter/manipulate kernel state during a replay to rewind to an earlier point in time.
While the latter can detect these races, it has incredibly high overhead and trace size, and it can't replay kernel state (= audio & GPU).
If your process has a benign race somewhere, which does not lead to divergent behaviour, then Echo won't be able to tell you.
It's also the reason why I'm starting with PS5, and not Windows, because the API surface is known and limited.
I'm developing everything with a complete test-suite in the background, which automatically verifies each and every API interaction, checking all data, return values, etc. against their recorded counterpart.
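A minimal sketch of that kind of check, assuming a much simpler model than the real tool: every API call goes through a wrapper that either records name, arguments and result, or compares them against the recorded counterpart during replay. The names `Recorder`, `Replayer`, `DivergenceError`, `read_pad` and `program` are made up for illustration and are not Echo's API.

```python
import random


class DivergenceError(Exception):
    """Raised when a replayed call does not match its recorded counterpart."""


class Recorder:
    """Runs each call for real and appends (name, args, result) to a trace."""

    def __init__(self):
        self.trace = []

    def call(self, name, fn, *args):
        result = fn(*args)
        self.trace.append((name, args, result))
        return result


class Replayer:
    """Serves results from the trace and verifies each call against it."""

    def __init__(self, trace):
        self.trace = trace
        self.pos = 0

    def call(self, name, fn, *args):
        if self.pos >= len(self.trace):
            raise DivergenceError(f"unexpected extra call: {name}{args}")
        rec_name, rec_args, rec_result = self.trace[self.pos]
        if (name, args) != (rec_name, rec_args):
            raise DivergenceError(
                f"call #{self.pos}: expected {rec_name}{rec_args}, got {name}{args}"
            )
        self.pos += 1
        # fn is NOT executed during replay; the recorded result is returned.
        return rec_result


def read_pad():
    """Hypothetical non-deterministic input (stands in for a system call)."""
    return random.randint(0, 255)


def program(api):
    """Toy 'game code' that only talks to the outside world through api.call."""
    pad = api.call("read_pad", read_pad)
    return api.call("scale", lambda value, factor: value * factor, pad, 3)
```

Recording with `rec = Recorder(); program(rec)` and then running `program(Replayer(rec.trace))` yields the same result even though `read_pad` is random, and any call that deviates from the trace raises `DivergenceError`.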
Introducing "Project Echo", a deterministic Record & Replay tool for PS5:
www.youtube.com/watch?v=K_sd...
Please read the video description and let me know your brutally honest feedback!
Please repost and share among friends, peers, coworkers - I'm trying to gauge if this has merit to be continued or not.
2) Yes, assuming that deterministic CPU code will produce the exact same data streams and commands for the GPU to execute.
And you're correct on all counts, you clearly understood what it is!
In a replay, the data (e.g. from a network request) will be there at the same time in the same frame as during the recording.
Still a long way to go until I can easily record a UE5 game. Which kind of reminds me of how Live++ started :).
The data is directly streamed to the dev PC and gets compressed afterwards (if you want that). Data is super compressible since most of it is very redundant.
E.g. the trace from the video is 27 MB, which compresses down to 7 MB. The recording used 5 threads.
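For a feel of the numbers: 27 MB down to 7 MB is a ratio of roughly 3.9x. The sketch below computes that ratio and shows why redundant event streams compress well. The record layout (call id, thread id, return value) and the use of zlib are assumptions made purely for illustration; the posts don't say which format or compressor is actually used.

```python
import struct
import zlib


def compression_ratio(raw_size: float, compressed_size: float) -> float:
    """How many times smaller the compressed data is than the raw data."""
    return raw_size / compressed_size


def make_synthetic_trace(num_events: int, num_threads: int = 5) -> bytes:
    """Build a fake, highly repetitive trace of fixed-size 8-byte records.

    Hypothetical layout: u16 call id, u16 thread id, u32 return value.
    """
    records = []
    for i in range(num_events):
        records.append(struct.pack("<HHI", i % 8, i % num_threads, 0))
    return b"".join(records)


raw = make_synthetic_trace(10_000)
packed = zlib.compress(raw, 6)

# Ratio for the figures quoted above (27 MB raw, 7 MB compressed).
quoted_ratio = compression_ratio(27, 7)

# Ratio for the synthetic trace; far higher, since it is pure repetition.
synthetic_ratio = compression_ratio(len(raw), len(packed))
```

The synthetic trace compresses far better than a real one would, since real traces carry actual payload data between the repetitive bookkeeping.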
I started with PS5 first because its API surface is known and it doesn't have crazy 3rd-party stuff, CrowdStrike, ...
The main idea here (and this is why the approach is fundamentally different to others) is that recording is so low-overhead that it should be "always on".
So in my head, every bug ticket now has a recording attached, which is fully debuggable during replay.
Those are figured out at runtime by binary analysis. It's fast enough, and results for sys libs are cached.
For instructions without a lock prefix, you cannot distinguish between atomics and ordinary writes, so I will probably have to resort to some compiler switch for this.
It's complicated 😀
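To illustrate the lock-prefix point on x86-64: a locked read-modify-write carries the `0xF0` prefix byte in its encoding, while an atomic store with relaxed or release ordering typically compiles to the very same plain `mov` as an ordinary write. The helper below is a toy check that assumes instruction boundaries are already known (e.g. from a disassembler); it is not how Echo's binary analysis works, and `has_lock_prefix` is a made-up name.

```python
# Legacy x86 prefix bytes that may appear before the opcode.
LEGACY_PREFIXES = {
    0xF0,  # LOCK
    0xF2, 0xF3,  # REPNE / REP
    0x2E, 0x36, 0x3E, 0x26, 0x64, 0x65,  # segment overrides
    0x66,  # operand-size override
    0x67,  # address-size override
}
LOCK = 0xF0


def has_lock_prefix(insn: bytes) -> bool:
    """Return True if a single, already-delimited instruction has a LOCK prefix.

    Caveat: xchg with a memory operand is implicitly locked even without
    the prefix, so this check alone would miss it.
    """
    for byte in insn:
        if byte == LOCK:
            return True
        if byte not in LEGACY_PREFIXES:
            return False  # reached REX/opcode bytes without seeing LOCK
    return False


LOCK_XADD = bytes([0xF0, 0x0F, 0xC1, 0x07])  # lock xadd [rdi], eax
PLAIN_MOV = bytes([0x89, 0x07])              # mov [rdi], eax

# The plain mov could be an ordinary store or a relaxed/release atomic store;
# the encoding is identical, so the prefix check cannot tell them apart.
```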
That stuff is hard.
You could then use ThreadSanitizer to dig in.
I'm serious.
I'm trying to gauge if there's merit to continue working on this.