Discussion
Introducing zml-smi
rdyro: Looks cool! nvtop can actually support TPUs too via https://github.com/rdyro/libtpuinfo/ https://github.com/Syllo/nvtop/blob/76890233d759199f50ad3bdb...
mrflop: Renaming fopen64 to intercept library calls feels like a brittle hack masquerading as "sandboxing." Why not just upstream this hardware support to nvtop instead of fragmenting the ecosystem?
marwanet: If this logic were pushed into nvtop, wouldn't the codebase become unmaintainable? Each vendor's interception method is going to be different.
steeve: sadly, sandboxing is something that can't be upstreamed. this way, sandboxing is kept in zml instead of patching mesa. as for nvtop, great program, but we missed a few features (such as sandboxing)
pstuart: It looks cool and I was excited to get monitoring for the NPU on my Ryzen AI 395+, but unfortunately it does not show up. NPU support in Linux really seems to be an afterthought.
steeve: Weird, because we tried it. It doesn’t show anything? We use the amdsmi to get metrics. I’ll investigate.
152334H: "NPU" seems to refer to trainium only?
serialx: Look into all-smi (https://github.com/lablup/all-smi). It supports every GPU imaginable, including Apple Silicon and many AI accelerator cards.