inmutel Mania
@lmrx114514.bsky.social
Nailed it. A clear view.
Reposted by inmutel Mania
Released Xbyak 7.26, which adds support for the AMX instruction groups for Diamond Rapids (AMX-{MOVRS,AVX512,FP8,TF32,TRANSPOSE}).
github.com/herumi/xbyak
GitHub - herumi/xbyak: A JIT assembler for x86/x64 architectures supporting MMX, SSE (1-4), AVX (1-2, 512), FPU, APX, and AVX10.2
June 9, 2025 at 11:24 PM
Reposted by inmutel Mania
AVX10.2 rev 4.0 removed features such as embedded rounding and sae/er for YMM registers, but Xbyak still had them (xed 9.53 still has them too). Since they tend to cause trouble, I removed them and released v7.27.
github.com/herumi/xbyak...
July 2, 2025 at 1:42 AM
Reposted by inmutel Mania
Xbyak got a small pull request adding Solaris support. So Solaris is still hanging in there; I had no idea. I haven't touched it since my university days.
July 24, 2025 at 11:54 PM
Reposted by inmutel Mania
Constant Division Optimization Revisited 3: Beat the Compiler
zenn.dev/herumi/artic...
I wrote up the results of my trial and error at the assembly-language level on x64/M4.
August 12, 2025 at 7:22 AM
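For readers who have not seen the underlying trick, here is a minimal sketch of the textbook multiply-and-shift method for unsigned division by a constant. It is not taken from the linked article, which works at the assembly level; the names `magic_for` and `div_by_const` are hypothetical, and the 32-bit operand width is an assumption.

```python
def magic_for(d: int, bits: int = 32) -> tuple[int, int]:
    """Return (multiplier, shift) such that n // d == (n * multiplier) >> shift
    for every unsigned n below 2**bits (textbook round-up method)."""
    if d < 1 or d >= 1 << bits:
        raise ValueError("divisor out of range")
    l = (d - 1).bit_length()             # ceil(log2(d))
    shift = bits + l
    multiplier = -(-(1 << shift) // d)   # ceil(2**shift / d)
    return multiplier, shift


def div_by_const(n: int, multiplier: int, shift: int) -> int:
    """Divide n by the constant that (multiplier, shift) was derived from."""
    return (n * multiplier) >> shift


# Illustrative use: precompute once for d = 7, then divide without a division.
m, s = magic_for(7)
samples = [0, 1, 6, 7, 8, 13, 14, 2**31, 2**32 - 1]
assert all(div_by_const(n, m, s) == n // 7 for n in samples)
```

Python's big integers hide one practical detail: the multiplier can need one bit more than the operand width, which is part of what makes the machine-level version worth tuning by hand.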
Reposted by inmutel Mania
Why limit it to Intel? :)
October 3, 2025 at 8:44 PM
Reposted by inmutel Mania
Let me add another bit to the rant. I am sick as shit of sites, youtube idiots, and the rest puking back obvious BS without the bare minimum of checking. 3 sites said it so it has to be true! You are being used by people with ulterior motives, don't be a tool. No hope for society....
October 3, 2025 at 7:54 PM
Reposted by inmutel Mania
Is #AMD really fabbing at #Intel? We got the definitive story. Really, not clickbait like the others, we got to the bottom of it!
www.semiaccurate.com/2025/10/03/i...
Is AMD fabbing at Intel Foundry?
A few days ago, a financial note went around saying AMD is going to fab chips at Intel Foundry.
October 3, 2025 at 7:38 PM
Reposted by inmutel Mania
With the changes today, I am now FIRMLY back in the "Intel will die" camp. The company is avoiding the problem source and addressing the symptoms. Badly. In a way that will worsen the problem. I plan on writing this up as soon as I get free time, tomorrow is shot though. More soon.
September 8, 2025 at 10:26 PM
Reposted by inmutel Mania
Remember how people laughed when I said, in 2019, that I saw a clear path to #Intel failing? I wasn't joking. Then Pat Gelsinger came in and addressed the root of the problem, turning things around.
September 8, 2025 at 10:26 PM
Reposted by inmutel Mania
Those things you cite had a different purpose, basically to distract Wall Street from viewing Intel as not a player in 'hot' markets. Nothing more nothing less. I can name a dozen others too.
September 9, 2025 at 8:20 PM
Reposted by inmutel Mania
Size wasn't the issue, culture was. Pat fixed that mostly, but will those changes stick?
September 9, 2025 at 3:08 PM
Reposted by inmutel Mania
Very interesting to me that it will only be a 6x increase in DGEMM.

Fugaku is famously underprovisioned for low-precision flops; I wonder if this is an overcorrection?
Huawei Unified Bus bsky.app/profile/ogaw...
=>

"FugakuNEXT" development kickoff ceremony and press conference, August 22, 2025 www.youtube.com/watch?v=VPQY...
S. Matsuoka www.r-ccs.riken.jp/wp/wp-conten...
(32 slides) bsky.app/profile/ogaw...
Masaaki Kondo www.r-ccs.riken.jp/wp/wp-conten...
FUJITSU-MONAKA-X: 1.4nm

NVLink Fusion bsky.app/profile/ogaw...
September 8, 2025 at 12:18 PM
Reposted by inmutel Mania
Look what just landed in the lab
September 12, 2025 at 2:36 PM
Reposted by inmutel Mania
Here is an early mention of 512b vectors in a 2009 #Intel #Nehalem optimization slide:
August 30, 2025 at 8:16 PM
Reposted by inmutel Mania
Well... AVX512 was known as AVX3 at some point... (and AVX512F was AVX3.1 for KNL, I guess) .. SKX was supposed to be AVX3.2 (F, CD, BW, DQ, and VL)
September 5, 2025 at 6:04 PM
Reposted by inmutel Mania
It's too bad that this AVX3.1 nomenclature disappeared. I think it is much more seamless, like SSE4.2, SSE5 (original name of AMD XOP) or Armv8.2. AMX2 and AMX3.1 would also be better.
September 6, 2025 at 4:39 PM
Reposted by inmutel Mania
#Intel xAPIC deprecation plan 1.0:
www.intel.com/content/www/...
September 12, 2025 at 7:56 AM
Reposted by inmutel Mania
#Intel refreshed the xAPIC deprecation plan with #NovaLake and #DiamondRapids:
September 19, 2025 at 11:55 AM
Reposted by inmutel Mania
There are a few working #PantherLake B0_2 among #Intel test machines: (CPUID C06C2, 12c/12t (4P+4E+4LPE probably), 3000 MHz, no HTT, no AVX512, Intel 18A)
intel-gfx-ci.01.org/tree/intel-x...
#CougarCove #Darkmont
For comparison, #LunarLake was 3100 MHz (8c/8t, 4P+4LPE) at a similar stage
August 8, 2025 at 11:20 AM
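The CPUID value quoted above packs family, model, and stepping into one word. A minimal sketch of the standard x86 decoding of that signature (CPUID leaf 1, EAX) follows; the function name `decode_cpuid_signature` is hypothetical, and the code only decodes a number, it does not query a CPU.

```python
def decode_cpuid_signature(eax: int) -> dict[str, int]:
    """Split a CPUID leaf-1 EAX signature into display family/model/stepping."""
    stepping = eax & 0xF
    base_model = (eax >> 4) & 0xF
    base_family = (eax >> 8) & 0xF
    ext_model = (eax >> 16) & 0xF
    ext_family = (eax >> 20) & 0xFF

    # The extended family is added only when the base family is 0xF.
    family = base_family + ext_family if base_family == 0xF else base_family
    # The extended model is prepended only for base family 0x6 or 0xF.
    if base_family in (0x6, 0xF):
        model = (ext_model << 4) | base_model
    else:
        model = base_model
    return {"family": family, "model": model, "stepping": stepping}


sig = decode_cpuid_signature(0xC06C2)
print(f"family {sig['family']:#x}, model {sig['model']:#x}, stepping {sig['stepping']}")
# prints "family 0x6, model 0xcc, stepping 2"
```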
Reposted by inmutel Mania
#Intel microcode refresh 20250812:
github.com/intel/Intel-...
Release Notes:
github.com/intel/Intel-...
August 13, 2025 at 10:41 AM
Reposted by inmutel Mania
#AMD refreshed the "AMD64 Architecture Programmer's Manual, Volume 1" 24592 pdf to v3.23 with #AVX512
docs.amd.com/v/u/en-US/24...
August 14, 2025 at 9:01 AM
Reposted by inmutel Mania
New story up on #ARM's Neural Super Sampling Tech. Not deep, waiting on hardware details.
www.semiaccurate.com/2025/08/12/a...
ARM unveils GPU Neural Unit and Neural Super Sampling Tech
Today ARM is announcing its Neural Unit and various GPU upscaling technologies that use it.
August 12, 2025 at 1:10 PM
Reposted by inmutel Mania
Kicking off the EUMaster4HPC FPGA workshop with Intel oneAPI on #MeluXina! Huge thanks to @luxprovide.bsky.social, and @uni.lu for making this happen. On today's agenda: Hands-on learning with @eurohpc-ju.bsky.social and Luxembourg’s national supercomputer!

#HPC #Supercomputing #oneAPI #FPGA
June 6, 2025 at 7:48 AM
Reposted by inmutel Mania
Seems Intel has since redacted this section...
What is #Intel #VPMM (Vector Extension Packed Matrix Multiplication)? CPUID.24h.ECX=1.ECX[0]
From the latest TDX 1.5 specification (348549-006US pdf), p. 116.
cdrdv2-public.intel.com/853286/intel...
cc: @fclc.bsky.social
July 11, 2025 at 4:48 PM
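The shorthand `CPUID.24h.ECX=1.ECX[0]` in the post above reads as: leaf 24h, sub-leaf ECX=1, bit 0 of the returned ECX. Here is a small sketch that parses that shorthand and tests the bit against register values supplied by the caller. The names `parse_cpuid_bit` and `bit_is_set` are hypothetical, the register values in the example are made up for illustration, and nothing here executes CPUID or confirms what the bit means.

```python
import re

# Matches the post's shorthand: CPUID.<leaf>h.ECX=<subleaf>.<REG>[<bit>]
_PATTERN = re.compile(r"CPUID\.([0-9A-Fa-f]+)h\.ECX=(\d+)\.(E[ABCD]X)\[(\d+)\]")


def parse_cpuid_bit(spec: str) -> tuple[int, int, str, int]:
    """Return (leaf, subleaf, output register name, bit index)."""
    match = _PATTERN.fullmatch(spec)
    if match is None:
        raise ValueError(f"unrecognized CPUID shorthand: {spec!r}")
    leaf, subleaf, reg, bit = match.groups()
    return int(leaf, 16), int(subleaf), reg, int(bit)


def bit_is_set(regs: dict[str, int], reg: str, bit: int) -> bool:
    """Test one bit in a dict of register values returned by a CPUID query."""
    return bool((regs[reg] >> bit) & 1)


leaf, subleaf, reg, bit = parse_cpuid_bit("CPUID.24h.ECX=1.ECX[0]")
# Made-up register contents, for illustration only.
example_regs = {"EAX": 0, "EBX": 0, "ECX": 0b1, "EDX": 0}
assert (leaf, subleaf, reg, bit) == (0x24, 1, "ECX", 0)
assert bit_is_set(example_regs, reg, bit)
```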