Ruben Osorio
osor.io
Senior VFX/Graphics Programmer working @ Rockstar Games

All posts/opinions/views my own :)
My favourite flavour of this is “x86 is interpreted”. Such a high ragebait hit-rate 😜
September 18, 2025 at 7:36 AM
They’ll have to pry the idea of accumulating samples from my cold dead hands 😝
July 12, 2025 at 10:33 PM
Thanks so much Albert! 🧡

🟥🟩🟦 Give us the sub-pixels!!! 🟥🟩🟦
July 12, 2025 at 7:51 PM
I don't have plans to do it right now but maybe at some point in the future if time allows 😊

Part of the message I'm trying to send here is that this isn't too hard to do. I'd love for people to have a go with their own implementations and share improvements if they can! 🧡
June 28, 2025 at 6:36 PM
It’s an honour to be featured 🤗 Thank you so much! ❤️
June 18, 2025 at 4:48 PM
Screenshots not a great strength either I see 😅

The high-res stuff is in the post anyway, like this one with a bunch of different fonts rendered at high res straight out of the test app I was running this on:

osor.io/text/lorem_i...
June 12, 2025 at 10:35 PM
Oh no! 😝
June 12, 2025 at 9:45 PM
Video is still not Bluesky's forte eh? Here's a screenshot! 🙃
June 12, 2025 at 4:46 PM
My first one got an unexpected amount of interest. Huge thanks to everyone who read it! (Especially @jendrikillner.bsky.social since he was probably the biggest reason 😄)

This topic gets way more coverage but I've never seen it done/presented like this, so trying to make my contribution 🙏
June 12, 2025 at 4:26 PM
The most sensible approach is obviously that half res and quarter res *both* mean half in each axis / quarter of the pixels.

🧨🧨🧨🧨🧨🧨🧨
May 22, 2025 at 8:30 AM
Hey! Thanks so much! 🙌

Unfortunately it was a one-off build since I don’t have much time these days for this kind of project 😔

I would encourage people to attempt their own custom builds though. Or look for custom controller/arcade-stick builders directly since there’s some already out there 🙏
May 10, 2025 at 12:57 AM
Thanks man! 🙏
May 6, 2025 at 8:54 PM
Gracias Andrés! ❤️
May 6, 2025 at 8:52 PM
Brain played it with the exact cadence and slapped the music right after 🙃
March 6, 2025 at 6:29 AM
The first active thread of the wave does the atomic and retrieves the global offset for the wave; WaveReadLaneFirst then broadcasts it to the rest of the wave.

The local offset within the wave comes from WavePrefixCountBits, since it's just the count of how many threads with a lower index are also writing one element.
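
A rough CPU-side sketch of that pattern, in C++ rather than HLSL so it can be run anywhere. The 32-lane wave size and the names `PrefixCountBits` and `WaveAppend` are assumptions made for illustration; the plain `counter` stands in for the global atomic, and this is not the original shader:

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

// CPU stand-in for one GPU wave. 32 lanes is an assumption; hardware may use 64.
constexpr int kWaveSize = 32;

// Emulates WavePrefixCountBits: how many lanes with a lower index have their flag set.
// Passing kWaveSize as `lane` gives the total, like WaveActiveCountBits.
int PrefixCountBits(const std::array<bool, kWaveSize>& writes, int lane) {
    int count = 0;
    for (int i = 0; i < lane; ++i) count += writes[i] ? 1 : 0;
    return count;
}

// Appends one value per writing lane to `out` with a single counter bump per wave.
// `counter` plays the role of the global atomic counter.
void WaveAppend(const std::array<bool, kWaveSize>& writes,
                const std::array<uint32_t, kWaveSize>& values,
                uint32_t& counter, std::vector<uint32_t>& out) {
    const int total = PrefixCountBits(writes, kWaveSize);
    if (total == 0) return;

    // On the GPU: the first active lane does one atomic add for the whole wave,
    // and WaveReadLaneFirst broadcasts the returned base offset to every lane.
    const uint32_t waveBase = counter;
    counter += static_cast<uint32_t>(total);
    if (out.size() < counter) out.resize(counter);

    // Each writing lane lands at waveBase plus its local offset within the wave.
    for (int lane = 0; lane < kWaveSize; ++lane) {
        if (!writes[lane]) continue;
        out[waveBase + PrefixCountBits(writes, lane)] = values[lane];
    }
}
```

Lanes that don't write still take part in the prefix count, they just contribute zero, so the writing lanes end up densely packed in lane order.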
January 1, 2025 at 3:11 PM
If you need the correct index per-thread, as you do when you're going to write the samples to the buffer, there's some more wave-ops involved, since you also need to calculate the local offset for each thread on the wave.

WaveReadLaneFirst/WavePrefixCountBits sorts you out, here is how it'd look:
January 1, 2025 at 3:11 PM
Oh! Also worth mentioning. In this sort of system you'll see a lot of contention when writing to shared counters.

It's a good idea to minimize this by doing the global write once per wave or group.

A neat trick is to also scalarize on the shader/draw for when a wave sees different values there 🙃
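
One way to picture the scalarization, sketched on the CPU in C++. The lane count, the `CountPerDrawScalarized` name and the idea of a per-draw counter array are assumptions for illustration. The outer loop mirrors the usual GPU pattern of reading the first remaining lane's value, handling every lane that matches it, and repeating until no lanes are left:

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

// Assumed wave width for this sketch.
constexpr int kLanes = 32;

// Adds each lane's contribution to its draw's counter, issuing one update per
// unique draw ID seen by the wave instead of one per lane.
// `perDrawCounts` must be large enough to index with every draw ID.
// Returns how many counter updates ("atomics") the wave issued.
int CountPerDrawScalarized(const std::array<uint32_t, kLanes>& drawIds,
                           std::vector<uint32_t>& perDrawCounts) {
    std::array<bool, kLanes> pending;
    pending.fill(true);
    int atomicsIssued = 0;

    for (int first = 0; first < kLanes; ++first) {
        if (!pending[first]) continue;
        // GPU equivalent: WaveReadLaneFirst over the lanes still pending.
        const uint32_t uniformDraw = drawIds[first];
        uint32_t matching = 0;
        for (int lane = first; lane < kLanes; ++lane) {
            if (pending[lane] && drawIds[lane] == uniformDraw) {
                pending[lane] = false;
                ++matching;
            }
        }
        perDrawCounts[uniformDraw] += matching;  // one atomic add on the GPU
        ++atomicsIssued;
    }
    return atomicsIssued;
}
```

A wave that only sees one draw issues a single update, and a wave straddling two draws issues two, versus 32 if every lane hit the counter on its own.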
January 1, 2025 at 3:11 PM
Paying my respects with a video rendering 10% of the pixels each frame (hacking this in just now so turning all denoising and TAA off, no reprojection of "empty" pixels either 😜).

(Prepare for the bsky video butchering though)
December 31, 2024 at 2:37 PM
@adrien-t.bsky.social also made me aware of @h3r2tic.bsky.social's amazing presentation in h3.gd/a-deferred-m.... Super cool to see the per-draw lists and all the spatial and temporal VRS experiments ❤️
A deferred material rendering system
Technobabble and nonsense
h3.gd
December 31, 2024 at 2:23 PM
And because this approach ends up compacting the list of pixels per-draw, it responds really well to scenes with heavy dithering into the visibility buffer.

Some of the other tile-based approaches I tried struggled with this, since they'd need to dispatch multiple resolve tiles per tile on-screen.
December 31, 2024 at 2:12 PM
Plus you also can select variations of the shaders to further optimize!

If a pixel is not seeing any local lights, you can dispatch a version of the resolve shader that has all the local light code compiled out. Or if a pixel is fully in shadow from the directional light, nuke all that code too, etc...
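
A small C++ sketch of how that variant selection could be organised. Everything here is hypothetical: the `PixelInfo` fields, the two feature bits and the four resulting variants are stand-ins, and a real renderer would likely have many more permutations:

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical per-pixel inputs gathered before the resolve.
struct PixelInfo {
    uint32_t pixelIndex;
    uint32_t localLightCount;
    bool fullyShadowedFromSun;
};

// Each bit enables a block of code in the resolve shader; a cleared bit means
// that code is compiled out of the variant.
enum VariantBits : uint32_t { kLocalLights = 1u << 0, kSunLighting = 1u << 1 };
constexpr int kVariantCount = 4;

uint32_t SelectVariant(const PixelInfo& p) {
    uint32_t variant = 0;
    if (p.localLightCount > 0) variant |= kLocalLights;
    if (!p.fullyShadowedFromSun) variant |= kSunLighting;
    return variant;
}

// Builds one compacted pixel list per variant, so each list can be resolved
// with a dispatch of the matching shader permutation.
std::array<std::vector<uint32_t>, kVariantCount>
BucketByVariant(const std::vector<PixelInfo>& pixels) {
    std::array<std::vector<uint32_t>, kVariantCount> lists;
    for (const PixelInfo& p : pixels) {
        lists[SelectVariant(p)].push_back(p.pixelIndex);
    }
    return lists;
}
```

Variant 0 is the cheapest path here: no local lights and no sun lighting, so neither block of code needs to exist in that shader.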
December 31, 2024 at 2:12 PM
I *really* like the flexibility of this approach while keeping the resolve waves full.

With this you can do software VRS easily, both spatially and temporally, which is super cool! You can just write any logic to decide how the visibility buffer values map to a sample to resolve.
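
As one example of such logic, here is a CPU-side C++ sketch. The rule itself is an assumption picked for illustration: a 2x2 quad whose four visibility values match gets a single shaded sample, and which pixel of the quad gets it rotates with the frame index. The `Sample` struct and `SelectSamples` are hypothetical names:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// A sample to resolve. `coversQuad` marks a sample whose result is meant to be
// reused for its whole 2x2 quad.
struct Sample {
    uint32_t x;
    uint32_t y;
    bool coversQuad;
};

// Walks the visibility buffer in 2x2 quads (width and height assumed even).
// Uniform quads emit one sample, rotated over time; mixed quads emit all four.
std::vector<Sample> SelectSamples(const std::vector<uint32_t>& vis,
                                  uint32_t width, uint32_t height,
                                  uint32_t frameIndex) {
    std::vector<Sample> samples;
    for (uint32_t qy = 0; qy + 1 < height; qy += 2) {
        for (uint32_t qx = 0; qx + 1 < width; qx += 2) {
            const uint32_t a = vis[qy * width + qx];
            const uint32_t b = vis[qy * width + qx + 1];
            const uint32_t c = vis[(qy + 1) * width + qx];
            const uint32_t d = vis[(qy + 1) * width + qx + 1];
            if (a == b && a == c && a == d) {
                // Spatial reduction: one sample for the quad.
                // Temporal part: a different pixel of the quad each frame.
                const uint32_t pick = frameIndex % 4u;
                samples.push_back({qx + (pick & 1u), qy + (pick >> 1u), true});
            } else {
                // The quad contains an edge, so keep full rate.
                for (uint32_t k = 0; k < 4u; ++k) {
                    samples.push_back({qx + (k & 1u), qy + (k >> 1u), false});
                }
            }
        }
    }
    return samples;
}
```

Because the output is just a list of samples, the resolve dispatch doesn't care how the list was chosen; swapping in a different rate-selection rule only changes this function.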
December 31, 2024 at 2:12 PM
There's only a few waves per-frame that see two or more draws when resolving, which are the only cases where the waves aren't fully utilized.

This is as good as it gets: if those waves weren't going to shade another draw, the leftover lanes would have been inactive in a "wave-perfect" resolve anyway.
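
To put a number on it, here is a small C++ sketch that counts those waves. It assumes the per-draw pixel lists are packed back to back in one buffer and resolved by fixed-size waves running over that buffer; that layout and the names `WaveStats` and `CountWaves` are assumptions for illustration, not details from the thread:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct WaveStats {
    uint32_t totalWaves;
    uint32_t mixedWaves;  // waves that contain pixels from two or more draws
};

// `pixelsPerDraw[i]` is the length of draw i's compacted pixel list.
WaveStats CountWaves(const std::vector<uint32_t>& pixelsPerDraw, uint32_t waveSize) {
    uint32_t totalPixels = 0;
    for (uint32_t count : pixelsPerDraw) totalPixels += count;

    const uint32_t totalWaves = (totalPixels + waveSize - 1) / waveSize;
    std::vector<uint32_t> drawsInWave(totalWaves, 0u);

    // Mark every wave that each non-empty draw's range overlaps.
    uint32_t start = 0;
    for (uint32_t count : pixelsPerDraw) {
        if (count == 0) continue;
        const uint32_t firstWave = start / waveSize;
        const uint32_t lastWave = (start + count - 1) / waveSize;
        for (uint32_t w = firstWave; w <= lastWave; ++w) ++drawsInWave[w];
        start += count;
    }

    uint32_t mixedWaves = 0;
    for (uint32_t n : drawsInWave) mixedWaves += (n > 1) ? 1u : 0u;
    return {totalWaves, mixedWaves};
}
```

Under that layout every mixed wave has to contain at least one boundary between two consecutive non-empty draws, so there can never be more mixed waves than the number of non-empty draws minus one.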
December 31, 2024 at 2:12 PM