Discussion
bob1029: A fun experiment, but I wonder how many out there seriously think we could ever completely rid ourselves of the CPU. It seems to be a rising sentiment.

The cost of communicating information through space is dealt with in fundamentally different ways here. The CPU addresses it directly: the actual latency is minimized as much as possible, usually by predicting the future in various ways and keeping the spatial extent of each device (core complex) as small as possible. The GPU hides latency with massive parallelism. That's why we can put GPUs across relatively slow networks and still see excellent performance.

Latency hiding cannot cope with workloads that are branchy and serialized, because you can only have one logical thread throughout. The CPU dominates this area because it doesn't cheat; it directly targets the objective. Making efficient, accurate control flow decisions tends to be more valuable than being able to process data in large volumes. It just happens that there are a few exceptions to this rule that are incredibly popular.
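The serial-vs-parallel distinction above can be made concrete with a toy sketch (hypothetical Python, not from the project): a loop-carried dependency chain offers nothing for latency hiding to exploit, because step i needs the result of step i-1, while independent per-element work is exactly what a GPU thrives on.

```python
def serial_chain(x, n):
    # Branchy/serialized shape: each iteration depends on the previous
    # one, so extra threads cannot hide the latency of the chain.
    for _ in range(n):
        x = (x * 3 + 1) % 1_000_003
    return x

def data_parallel(xs):
    # Independent per-element work: trivially spread across thousands
    # of GPU threads, with memory latency hidden by oversubscription.
    return [(x * 3 + 1) % 1_000_003 for x in xs]

print(serial_chain(7, 10))
print(data_parallel([1, 2, 3]))
```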
spot5010: I don't think we get rid of the CPU. But the relationship will be inverted. Instead of the CPU calling the GPU, it might be that the GPU becomes the central controller and builds programs and calls the CPU to execute tasks.
treyd: This will never happen without completely reimagining how process isolation works and rewriting any OS you'd want to run on that architecture.
Nevermark: Time to benchmark Doom. Now we know future genius models won't even need CPUs, just their tensor/rectifying circuits. If they need a CPU, they will just imagine one.
GeertB: I don't quite understand how multiply doesn't require addition as well to combine the various partial products.
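For reference, a conventional binary multiplier really does combine its partial products with addition. A minimal shift-and-add sketch (illustrative Python only, not the project's approach):

```python
def mul_shift_add(a, b):
    """Multiply two non-negative ints the way a textbook binary
    multiplier does: each set bit of b selects a shifted copy of a
    (a partial product), and the copies are combined by addition."""
    acc = 0
    shift = 0
    while b >> shift:
        if (b >> shift) & 1:
            acc += a << shift  # the addition that combines partial products
        shift += 1
    return acc

print(mul_shift_add(13, 11))  # 13 * 11
```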
robertcprice1: Hey everyone, thank you for taking a look at my project. This was purely a “can I do it” type deal, but ultimately my goal is to make a running OS purely on GPU, or one composed of learned systems.
yjftsjthsd-h: This is hilarious and profoundly in the spirit of hacker news. Thanks for posting:)
nicman23: can i run linux on a nvidia card though?
micw: Linux runs everywhere
volemo: Except on my stupid iPad “Pro”. :(
mghackerlady: iirc there's an app on the App Store that's basically a small Alpine container
low_tech_punk: Saw the DOOM raycast demo at the bottom of the page. Can't wait for someone to build a DOOM that runs entirely on GPU!
jhuber6: Depends entirely on your definition of 'entirely', but https://github.com/jhuber6/doomgeneric is pretty much a direct compilation of the DOOM C source for GPU compute. The CPU is necessary to read keyboard input and present frame data to the screen, but all the logic runs on the GPU.
DonThomasitos: I don't quite understand why you would train an NN for an operation like sqrt that the GPU supports in silicon.
nine_k: I see it as a practical joke or a fun hack, like CPUs implemented in the Game of Life, or in Minecraft.
mihaitodor: It’s been done already. Have a look at Quest for Tetris: https://codegolf.stackexchange.com/questions/11880/build-a-w...
himata4113: I was always wondering what would happen if you trained a model to emulate a CPU in the most efficient way possible. This is definitely not what I expected, but it also shows promise for how much more efficient models can become.
artemonster: Everyone clueless enough to suggest that we move to GPUs entirely has zero idea how things work and is basically suggesting we use Lambos to plow fields and tractors to race in NASCAR
madwolf: Bad comparison. Lambos are regularly plowing fields and they're quite good at it. https://www.lamborghini-tractors.com/en-eu/
artemonster: I remembered that Lambos used to make tractors after I posted the comment. Nice catch!
volemo: I see us not getting rid of CPU, but CPU and GPU being eventually consolidated in one system of heterogeneous computing units.
jagged-chisel: Agreed. Much like “RISC is gonna replace everything” - it didn’t. Because the CPU makers incorporated lessons from RISC into their designs.

I can see the same happening to the CPU. It will just take on the appropriate functionality to keep all the compute in the same chip.

It’s gonna take a while because Nvidia et al. like their moats.
StilesCrisis: CISC only survived because CPUs now dedicate a ton of silicon to decoding the CISC stream into RISC-y microcode. RISC CPUs can avoid this completely, but it turns out backwards compatibility was important to the market and the transistor cost of "instruction decode" just adds like +1 pipeline depth or something.
ndiddy: > CISC only survived because CPUs now dedicate a ton of silicon to decoding the CISC stream into RISC-y microcode.

For Intel CPUs, this was somewhat true starting from the Pentium Pro (1995). The Pentium M (2004) introduced a technique called "micro-op fusion" that would bind multiple micro-ops together so you'd get combined micro-ops for things like "add a value from memory to a register". From that point onward, the Intel micro-ops got less and less RISCy until by Sandy Bridge (2011) they pretty much stopped resembling a RISC instruction set altogether. Other x86 implementations like K7/K8/K10 and Zen never had micro-ops that resembled RISC instructions.
jdlyga: I'll do you one better: imagine a CPU that runs entirely in an LLM.

“You’re absolutely right! I made an arithmetic mistake there — 3 * 3 is 9, not 8. Let’s correct that: Before: EAX = 3. After imul eax, eax: EAX = 9. Thanks for catching that — the correct return value is 9.”
FartyMcFarter: What an amazing multiplication request! The numbers you have chosen reveal an exquisite taste which can only be the product of an outstanding personality.
andreadev: The bit about multiplication being ~12x faster than addition is worth pausing on. In silicon, addition is the "easy" operation — but here the complexity hierarchy completely inverts. Makes sense once you think about it: multiplication decomposes into parallel byte-pair lookups (which neural nets handle trivially as table approximation), while addition has a sequential carry chain you can't fully parallelize away.

Funny enough, analog computing had the same inversion — a Gilbert cell does multiplication cheaply, while addition needs more complex summing circuits. Completely different path to the same result.

What I haven't seen discussed: if the whole CPU is neural nets, the execution pipeline is differentiable end-to-end. You could backprop through program execution. Useless for booting Linux, but potentially interesting for program synthesis — learning instruction sequences via gradient descent instead of search. Feels like that's the more promising research direction here than trying to make it fast.
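The carry-chain inversion described above can be sketched in toy Python (hypothetical; nibble tables instead of byte tables to keep it small): the adder must walk a carry through every bit position in sequence, while the multiplier's partial products are independent table lookups until the final reduction.

```python
def ripple_add(a, b, width=32):
    # Sequential carry chain: the carry out of bit i feeds bit i+1,
    # so the loop cannot be fully parallelized away.
    result, carry = 0, 0
    for i in range(width):
        x = (a >> i) & 1
        y = (b >> i) & 1
        result |= (x ^ y ^ carry) << i
        carry = (x & y) | (carry & (x ^ y))
    return result

# All nibble-pair products can be read out independently -- the kind
# of fixed table a net can approximate -- and only the final summation
# reintroduces carries.
TABLE = {(x, y): x * y for x in range(16) for y in range(16)}

def mul_via_lookup(a, b):
    total = 0
    for i in range(0, 32, 4):
        for j in range(0, 32, 4):
            total += TABLE[(a >> i) & 0xF, (b >> j) & 0xF] << (i + j)
    return total
```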
zephen: > CPUs now dedicate a ton of silicon to decoding the CISC stream into RISC-y microcode.

In absolute terms, this is true. But in relative terms, you're talking less than 1% of the die area on a modern, heavily cached, heavily speculative, heavily predictive CPU.
FartyMcFarter: Didn't there used to be a joke about Intel being the biggest RAM manufacturer (given the amount of physical space caches take up on a CPU)?
zephen: I hadn't heard that, but certainly, there must have been many times when Intel held the crown of "biggest working hunk of silicon area devoted to RAM."