Discussion
NVIDIA Launches Vera CPU, Purpose-Built for Agentic AI
d_silin: It is an 88-core Arm v9 chip, for a somewhat more detailed spec.
tencentshill: So does this cut out Intel/x86 from all the massive new datacenter buildouts entirely? They've already lost Apple as a customer and are not competitive in the consumer space. I don't see how they can realistically grow at all with x86.
jauntywundrkind: Given the price of these systems the ridiculously expensive network cards aren't such a huge deal, but I can't help but wonder at the absurdly amazing bandwidth hanging off Vera, the brags about "7x more bandwidth than PCIe Gen 6" (amazing), but then having to go over PCIe to the network to chat with anyone else. It might be 800GbE, but it's still so many hops; PCIe is weighty.

I keep expecting we'll see fabric gains, something where the host chip has a better way to talk to other host chips. It's hard to deny the advantages of central switching as something easy and effective to build, but reciprocally the high-radix systems Google has been building have just been amazing. Microsoft's Maia 200 put a gobsmacking amount of Ethernet on chip, 2.8Tbps, but it still feels like so little, such a bare start. For reference, PCIe 6.0 x16 is a bit shy of 1Tbps, so that's vaguely ~45 lanes' worth.

It will be interesting to see what other bandwidth-hungry workloads evolve over time, or if this throughput era really ends up serving AI alone. Hoping CXL or someone else slims down the overhead and latency of attachment, soon-ish.

Maia 200: https://www.techpowerup.com/345639/microsoft-introduces-its-...
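The lane count above is easy to sanity-check as back-of-envelope arithmetic (a sketch assuming PCIe 6.0's 64 GT/s per-lane raw rate and ignoring FLIT framing and protocol overhead, which is why the real usable figure lands "a bit shy" of the raw number):

```python
# Back-of-envelope: how many PCIe 6.0 lanes equal Maia 200's on-chip Ethernet?
# PCIe 6.0 signals at 64 GT/s per lane (PAM4); treat that as ~64 Gb/s raw.
GBPS_PER_LANE = 64                 # Gb/s, raw, per lane

x16_gbps = 16 * GBPS_PER_LANE      # 1024 Gb/s raw; usable is a bit under 1 Tbps

maia_gbps = 2800                   # 2.8 Tb/s of Ethernet claimed for Maia 200
lanes_equiv = maia_gbps / GBPS_PER_LANE

print(f"PCIe 6.0 x16 raw: {x16_gbps} Gb/s")
print(f"2.8 Tb/s ≈ {lanes_equiv:.2f} PCIe 6.0 lanes (raw)")  # 43.75 lanes
```

With overhead subtracted from each lane the equivalent count creeps up toward the ~45 lanes the comment cites.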
bob1029: > It might be 800GbE but it's still so many hops; PCIe is weighty.

Once you need to reach beyond L2/L3, it is often the case that perfectly viable experiments can no longer be executed in reasonable timeframes. The current machine learning paradigm isn't that latency sensitive, but there are other paradigms that can't be parallelized in the same way and are very sensitive to latency.
rishabhaiover: I'm assuming this is for tool calls and orchestration. I didn't know we needed higher exploitable parallelism from the hardware; we had software bottlenecks (you're not running 10,000 agents concurrently, or that many downstream tool calls).

Can someone explain what the Vera CPU is doing that a traditional CPU doesn't?
gcanyon: Anyone know how this compares to Apple’s M5 chips? Or is that comparison <takes off sunglasses> apples to oranges.
urig: Lots and lots of CPUs pooled. Faster, more power-efficient RAM accessible to both GPU and CPU. IIUC.
rishabhaiover: But at what stage are we asking for that RAM? If it's the inference stage, then doesn't that belong to the GPU<>memory path, which has nothing to do with the CPU?

I did see they have unified CPU/GPU memory, which may reduce the cost of host/kernel transactions, especially now that we're probably lifting more and more memory with longer-context tasks.
mixmastamyk: Hmm, the 128-core Ampere Altra CPU is already available, and in a case from System76. I wonder what else differentiates it.

If they're going to build CPUs, I wish they had used RISC-V instead. They are using it somewhat already.
kibibu: > you're not running 10,000 agents concurrently or downstream tool calls

Cursor seems to be doing exactly that, though.
baal80spam: Say what you want about NVIDIA (to me they are just doing what every company would do in their place), but they create engineering marvels.
pdpi: Features like hardware FP8 support definitely make it apples-to-oranges.
dmitrygr: > Purpose-Built for Agentic AI

From the "fridge purpose-built for storing only yellow tomatoes" and "car built only for people whose last name contains the letter W" series.

When will this insanity end? It is a completely normal garden-variety ARM SoC; it'll run Linux, same as every other ARM SoC does. It is as related to "Agentic $whatever" as your toaster is.
dpe82: The power and importance of marketing is deeply underappreciated by us technical types.
LogicFailsMe: And yet more than a little Gavin Belson "Box III" vibes here. Fortunately, no signature edition.
alecco: Even Apple hardware looks inexpensive compared to Nvidia's huge premium. And never mind the order backlog.

x86 vendors and Apple already sell CPUs with integrated memory and high-bandwidth interconnects. And I bet eventually Intel's beancounter board will wake up and allow engineering to make one, too.

But competition is good for the market.
storus: Apple went from being a high-end PC maker to a low-end AI provider by blocking Nvidia on their platform.
pdpi: > It is as related to "Agentic $whatever" as your toaster is related to it

These things have hardware FP8 support, and a 1.8TB/s full-mesh interconnect between CPUs and GPUs. We can argue about the "agentic" bit, but those are features that don't really matter for any workload other than AI.
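For context on why FP8 support marks silicon as AI-oriented: the two OCP FP8 formats (E4M3 and E5M2, the ones commonly implemented in ML hardware) trade mantissa bits for range, and their limits fall out of two lines of arithmetic. This is a sketch of the standard formats, not anything NVIDIA has published about Vera specifically:

```python
# Max finite values of the OCP FP8 formats, from first principles.
# E4M3: 4 exponent bits (bias 7), 3 mantissa bits; the all-ones
#   exponent+mantissa encoding is NaN, so the top usable mantissa is 0b110.
# E5M2: 5 exponent bits (bias 15), 2 mantissa bits; the all-ones exponent
#   is reserved for inf/NaN (IEEE-style), so the top exponent field is 0b11110.
e4m3_max = (1 + 6/8) * 2 ** (15 - 7)     # 1.75 * 256   = 448.0
e5m2_max = (1 + 3/4) * 2 ** (30 - 15)    # 1.75 * 32768 = 57344.0
print(e4m3_max, e5m2_max)  # 448.0 57344.0
```

A max of 448 is useless for general-purpose numerics but fine for quantized weights and activations, which is exactly the "only matters for AI" point.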
PeterCorless: This is the related benchmark blog from Redpanda. [Disclosure: I work for Redpanda and helped write this. Credit to Travis Downs and others at Redpanda for the heavy lifting on the testing and analysis.]

https://www.redpanda.com/blog/nvidia-vera-cpu-performance-be...
recvonline: Does this mean their gaming GPUs are becoming less in demand, and therefore cheaper/more available again?
wmf: No.