Soufiane KHIAT
@soufianekhiat.bsky.social
180 followers 290 following 16 posts
Lead Rendering Software Engineer at @EA - @CriterionGames, ex @Unity3d, ex @UbisoftMTL. All expressed opinions are my own.
soufianekhiat.bsky.social
Not my team at the moment. Check the EA website if you find something.
soufianekhiat.bsky.social
My team is hiring. Come join the Battlefield 6 Rendering Team (Montréal).
jobs.ea.com/en_US/ca...
soufianekhiat.bsky.social
For now I haven't done any technical communications, sorry.
Reposted by Soufiane KHIAT
shiniesandshadows.bsky.social
Join us at 5pm EST / 23:00 CET with @beammyup.bsky.social and @yoriculaire.bsky.social to start reading "IBL-BRDF Multiple Importance Sampling for Stochastic Screen-space Indirect Specular" by @soufianekhiat.bsky.social!
Image taken from the article, captioned Figure 10.1 "Importance sampling using a reflection probe."
It shows a Unity scene set in a modern room with a marble floor, a concrete wall, and a chair. A large rectangular light sits between the ceiling and the wall, and several green debug rays leave it and converge on a central point on the reflective floor.
soufianekhiat.bsky.social
I didn't have time to fully update the code for #GPUZen3 on #Unity 6 LTS. So for now I'm sharing the illustration code and the Jacobian derivation notebook (#Mathematica).
Plus some illustrations I did not use in the book chapter.
github.com/soufianek...
soufianekhiat.bsky.social
And the final boss of geometrical LOD.
Preserving the appearance of this one can be challenging, I have some ideas, but nothing trivial. I'm open to any idea.
5/5
soufianekhiat.bsky.social
Most of the time, subpixel occluders.
4/5
soufianekhiat.bsky.social
More transparent challenges
3/5
soufianekhiat.bsky.social
Some Specular filtering challenges
2/5
soufianekhiat.bsky.social
#GfxFunFact: There is a museum (@CentrePompidou) created for gfx people, with a lot of challenges.
Like OIT challenges
1/5
soufianekhiat.bsky.social
#GfxFunFact: In Paris, sphere balls for reflection probe validation are placed in clusters for convenience.
soufianekhiat.bsky.social
#GfxFunFact: IKEA designed this ceiling light cover on purpose to slow down rendering.
soufianekhiat.bsky.social
Fun fact: during Christmas all rendering becomes slower. Here are examples of the light counts, occluders, and transparent objects they dare to use.
Reposted by Soufiane KHIAT
ghislaingir.bsky.social
For my French community: here's a cool French-speaking channel, www.twitch.tv/shiniesandsh..., where they go over recent graphics/rendering academic papers in a gentle, step-by-step manner.
soufianekhiat.bsky.social
I have a chapter on the next #GPUZen
"IBL-BRDF Multiple Importance Sampling for Stochastic Screen-Space Indirect Specular"
- IBL-MIS-SSSIS -
www.amazon.com/dp/B0DNXNM14K
Reposted by Soufiane KHIAT
vassvik.bsky.social
Let's wrap up this lovely week with a nice technical post

This is the "case study" from my Masterclass at GPC, where I apply a series of optimizations to improve the effective bandwidth of a 3x3x3 blur (a proxy for a huge set of operations on volumetric data)

Check ALT text for (a lot of) context.
A table showing the experimental results of applying 6 different compute shader versions of a simple 3x3x3 box blur on a 512x512x512 texture, using either the GL_R16F or GL_R32F internal format for storage, for eight different GPUs spanning several GPU architectures and vendors.

The upper table shows the absolute effective bandwidth (measured as the sum of total bytes read and written divided by execution time), whereas the lower table shows the effective bandwidth relative to the theoretical bandwidth as a percentage. 

Each row corresponds to a specific shader variant (except for the "theoretical" row, which displays the theoretical bandwidth according to the GPU specification), and each column corresponds to a specific GPU. The color coding is per column in the upper table, and it's a single color coding on the entire lower table. 

Each version will be explained in detail in the subsequent posts.

Version 6 uses half-precision floating point for the shared memory cache, and the relevant extension does not exist in the Intel drivers for Windows. Likewise, this version is not applied to the GL_R32F internal format benchmarks, since that would destroy the precision of the backing format anyway.

The code was written and initially tested on a desktop 4090 (the first column), which naturally skews the results a bit since everything was evaluated and tested on that GPU. Had I used another GPU I might have picked slightly different compromises, and the results would have been slightly different. 

One interesting observation is that the RTX 4000 series (Ada Lovelace architecture) significantly outperforms everything else, with the 7900 XTX (RDNA3) slightly behind. A large part of these overwhelmingly efficient results is due to the massive caches these devices sport (72 MiB on the desktop 4090, 64 MiB on the laptop 4090, etc.), which make peak bandwidth a lot easier to reach.
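The two metrics described in the ALT text above (absolute effective bandwidth, and effective bandwidth as a percentage of theoretical bandwidth) can be sketched in a few lines. This is only an illustration under stated assumptions: it counts each input texel as read once and each output texel as written once, which may not match how the original benchmark counts bytes, and the timing and theoretical bandwidth in the example are hypothetical placeholders, not values from the table.

```python
def effective_bandwidth_gbps(width, height, depth, bytes_per_texel, seconds):
    """Effective bandwidth in GB/s: (bytes read + bytes written) / execution time.

    Assumption: every input texel is counted as read once and every output
    texel as written once (unique bytes, not the 27 taps of the 3x3x3 kernel).
    """
    texels = width * height * depth
    bytes_read = texels * bytes_per_texel
    bytes_written = texels * bytes_per_texel
    return (bytes_read + bytes_written) / seconds / 1e9


def relative_bandwidth_percent(effective_gbps, theoretical_gbps):
    """Effective bandwidth as a percentage of the GPU's theoretical bandwidth."""
    return 100.0 * effective_gbps / theoretical_gbps


# Hypothetical example values (not measurements from the post):
# a 512x512x512 GL_R16F texture (2 bytes per texel) blurred in 1 ms
# on a GPU with an assumed theoretical bandwidth of 1000 GB/s.
HYPOTHETICAL_TIME_S = 1e-3
HYPOTHETICAL_THEORETICAL_GBPS = 1000.0

r16f = effective_bandwidth_gbps(512, 512, 512, 2, HYPOTHETICAL_TIME_S)
print(f"effective: {r16f:.1f} GB/s")
print(f"relative:  {relative_bandwidth_percent(r16f, HYPOTHETICAL_THEORETICAL_GBPS):.1f} %")
```

For the same execution time, switching to GL_R32F (4 bytes per texel) doubles the bytes moved and therefore doubles the effective bandwidth figure, which is why the two storage formats are reported separately.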
Reposted by Soufiane KHIAT
kostasanagnostou.bsky.social
"Occupancy Explained" GPC 2024 presentation slides are now available online: gpuopen.com/presentation... They are missing notes though, so to fill in the gaps I recommend this great post from 2023: gpuopen.com/learn/occupa...