Comment by touisteur - Hacker Neue

touisteur 4 days ago parent

People have been trying to bypass CUDA and even PTX for a long time. One long rundown of optimizing gemm on NVIDIA hardware (https://salykova.github.io/sgemm-gpu) mentions 'maxas' (https://github.com/NervanaSystems/maxas/wiki/Introduction) - which was really a step forward in this space. I still blame Intel (buying NervanaSystems) for killing it...

almostgotcaught 4 days ago

> People have been trying to bypass CUDA and even PTX for a long time

i swear it's so funny when people talk about this stuff like it's all weird/surprising. y'all realize that there are hundreds (thousands?) of engineers across FAANG whose full time job is optimizing CUDA/ROCm/whatever code for their team/org/company's specific workloads? like do y'all think that serious shops really just go with whatever the vendor gives you? ie none of this is in the least surprising - it's completely expected that whatever the vendor designs generically for the entire market segment will fail to achieve peak perf for your use case.

cma 4 days ago

>it's completely expected that whatever the vendor designs generically for the entire market segment will fail to achieve peak perf for your use case.

When Carmack left Meta, I believe he claimed they were only getting around 20% utilization on their GPU fleet, which was enormous even then. So I could see them also leaving a lot of perf headroom on the table.

touisteur OP 4 days ago

Not saying it's surprising. My day job is doing exactly this, not in any FAANG.

Working on a platform that hides so many low-level details is a challenge, and the fact people have to go to such lengths to get access to them is noteworthy. 'maxas' was noteworthy, and unneeded on many (most?) other platforms.

Not saying Intelstuff or armstuff is 'easier', but at least you get access and have the tooling to work on the actual low-level asm.

almostgotcaught 4 days ago

> and the fact people have to go to such length to get access to it is noteworthy

I'll repeat myself: no it's not. There's nothing noteworthy about it at all. In fact I literally cannot fathom why anyone ever expects or expected otherwise. Is it because of the oft-repeated notion of "abstraction"? I guess I must be the sole programmer that has always known/understood, even from the first intro class, that abstractions are just assumptions and when those assumptions don't hold I will need to remove the abstraction.
