Discussion
jauntywundrkind: As someone who has spent probably a percent or more of my life working on or thinking about state, and how it could or should just be decomposed into 9p files when possible, about externalizing state & opening up new frontiers of scripting, and how git can tie that together and let us build new distributed systems, I am cheering. Cheering wildly.

The zig wasm sounds so, so good. I've enjoyed git on rust via gitoxide ( https://github.com/gitoxidelabs/gitoxide ) but haven't tried wasm yet. I rather expect gitoxide/rust would be bigger. The ability to really control memory like they talk of here seems like it could be a huge advantage for wasm interop across a SharedArrayBuffer (or the like) holding the code too. Rust seems unlikely to be able to offer that.

The ArtifactFS fuse driver sounds wonderful. My LLM session to build an OCI storage driver has already begun!

On another note, this gives me all kinds of feels:

> Inside Cloudflare, we're using Artifacts for our internal agents: automatically persisting the current state of the filesystem and the session history in a per-session Artifacts repo.

On a personal level I find this amazing & incredible & I love it. But reciprocally this feels like an incredibly difficult social change: to collect all the work, to collectivize the thought processes / thought making.

I am so enamoured with LLM programming. And I have so wanted engineering to be better able to externalize the tale of what happened, what we did. But this also feels like there is no privacy, that this raw data is deeply, deeply personal.

I feel so good about this & so scared too. I want very much to work more in public, but I also want some refuge, some space of my own. We lost offices for cubicles, and now we lose the sanctity of our own screens too? I want to share, so much, to have shared means of thinking, but via more consensual, deliberate means, please.
tln: Ooh, this looks great!

The usage costs are rather high compared to S3: 30x higher for PUT/POST. It looks like batching operations is going to be vital.
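The batching point can be made concrete with a toy model. This is a minimal sketch, not the Artifacts or S3 API: `BatchWriter` and its `upload`-as-counter behavior are hypothetical, and only illustrate how buffering many small writes into one packed request amortizes a fixed per-request fee.

```python
class BatchWriter:
    """Buffers small object writes and flushes them as one upload.

    Purely illustrative: 'puts' just counts simulated PUT/POST
    requests; no real network or storage API is involved.
    """

    def __init__(self, flush_bytes=1 << 20):
        self.flush_bytes = flush_bytes
        self.buffer = []
        self.buffered_bytes = 0
        self.puts = 0  # simulated PUT/POST requests issued

    def write(self, path, data: bytes):
        self.buffer.append((path, data))
        self.buffered_bytes += len(data)
        if self.buffered_bytes >= self.flush_bytes:
            self.flush()

    def flush(self):
        if self.buffer:
            self.puts += 1  # one request for the whole batch
            self.buffer.clear()
            self.buffered_bytes = 0

w = BatchWriter(flush_bytes=4096)
for i in range(1000):
    w.write(f"file{i}.txt", b"x" * 100)  # 1,000 small writes
w.flush()
# 25 simulated PUTs instead of 1,000: a ~40x cut in per-request costs
```

With a 30x per-PUT price gap, whether batching like this is feasible (e.g. packing loose objects before upload) largely decides the cost comparison.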
a_t48: The ArtifactFS thing looks neat. I love this thing: https://github.com/cloudflare/artifact-fs/blob/main/examples... - it's kind of a generic image that turns into "another image" based on an argument you pass in. I recently built something similar for my own project that does the same thing except for Dockerfiles (needed for my stuff because cloud machines don't understand my registry!).

I wonder if I can fork/extend ArtifactFS for other types of content-addressed storage. My registry is very git-like in some ways - it has an index, files are content addressed, etc.
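For readers unfamiliar with the term, the content-addressed pattern mentioned here is the git-style idea that a blob's key is the hash of its content. A minimal sketch (all names hypothetical, not ArtifactFS's or any registry's actual code):

```python
import hashlib

class ContentStore:
    """Toy content-addressed blob store, git-style: the key is the
    SHA-256 of the content, so identical blobs are stored only once."""

    def __init__(self):
        self.blobs = {}

    def put(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        self.blobs[digest] = data  # idempotent: same content, same key
        return digest

    def get(self, digest: str) -> bytes:
        return self.blobs[digest]

store = ContentStore()
a = store.put(b"hello")
b = store.put(b"hello")  # deduplicated: same digest, stored once
```

Because any store with this shape (an index of names pointing at hash-keyed blobs) looks structurally like git's object database, extending an ArtifactFS-like layer over it is plausible in principle.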
crabbone: > Agents have changed how we think about source control, file systems, and persisting state. Developers and agents are generating more code than ever — more code will be written over the next 5 years than in all of programming history — and it's driven an order-of-magnitude change in the scale of the systems needed to meet this demand. Source control platforms are especially struggling here: they were built to meet the needs of humans, not a 10x change in volume driven by agents who never sleep, can work on several issues at once, and never tire.

I keep hearing this argument, but there's not even an attempt to explain why it should be true.

The amount of code written is predicated on the amount of features planned, which is in turn predicated on the customer's needs and willingness to pay. The amount of code a programmer is able to produce per day is not (and hasn't been for a while, if ever) the bottleneck in the speed of product development.

Having witnessed some projects go from early start to maintenance mode, I can attest that the amount of code generated by the same programmers differs dramatically across project maturity stages: at the very beginning, a single programmer might make hundreds of commits a day, each with hundreds of changes. But once the project is mostly fleshed out, the commits come maybe once a day, or even less often. The initial stage doesn't last very long either; it's typically measured in months.

So, sorry... I don't think agents changed any of "source control, file systems, and persisting state". There's no reason and no evidence to believe that they did.
floating-io: I would be much more interested if they fully open sourced an S3-compatible git backend that anybody could use with any S3-compatible service.Does such a thing exist? Hmmm...
uroni: Yup, https://github.com/awslabs/git-remote-s3 (disclaimer: never used it)
nulltrace: Yeah pricing seems okay with batching. The 128MB memory cap per Durable Object is what I'd watch. A repo with a few thousand files and some history could hit that faster than you'd expect, especially during delta resolution on push.
adzm: Is it possible to use this through WinFsp? Or are there other Windows FUSE implementations?
slipknotfan: Or use a better operating system: https://www.over-yonder.net/~fullermd/rants/winstupid/1
gavinray: After reading the article, it still wasn't clear to me why I would need this over git, GitHub, and a bunch of branches/worktrees.

On further analysis, it seems the argument is "it's expensive and rate-limited to create thousands of repos/branches on GitHub" and "we built ArtifactFS as a lazy filesystem that hydrates files on read, for big repos".

So if those fit your use cases it seems like a nice tool, but I imagine the market is niche here.
Aperocky: GitHub is not git; git is far more robust and can live basically anywhere.
est: > bootstrap an Artifacts repo from an existing git repository

Wow, that's cool. I used to hack a CF Worker to operate on .git using isomorphic-git; it's a PITA.

> ArtifactFS runs a blobless clone of a git repository: it fetches the file tree and refs, but not the file contents. It can do that during sandbox startup, which then allows your agent harness to get to work.

That's insanely useful. Combined with committing only the files that changed, we'd have single-blob editing capability against any .git.
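The hydrate-on-read behavior described in the quote (in stock git, the analogue is a partial clone via `git clone --filter=blob:none`) can be sketched in a few lines. Everything here is hypothetical illustration, not the ArtifactFS API: the tree mapping and `fetch` callback stand in for refs/tree metadata and a network fetch.

```python
class LazyTree:
    """Sketch of a blobless, hydrate-on-read filesystem: the tree
    (path -> object id) is known at startup, but file contents are
    fetched only on first read and then cached."""

    def __init__(self, tree, fetch):
        self.tree = tree    # path -> object id, from the blobless clone
        self.fetch = fetch  # oid -> bytes; in reality a network call
        self.cache = {}
        self.fetches = 0

    def listdir(self):
        return sorted(self.tree)  # listing needs no content at all

    def read(self, path):
        oid = self.tree[path]
        if oid not in self.cache:
            self.cache[oid] = self.fetch(oid)  # hydrate on first read
            self.fetches += 1
        return self.cache[oid]

# Simulated remote object store keyed by oid.
objects = {"oid1": b"print('hi')", "oid2": b"# README"}
fs = LazyTree({"main.py": "oid1", "README.md": "oid2"}, objects.__getitem__)
fs.listdir()        # no fetches yet
fs.read("main.py")  # first fetch
fs.read("main.py")  # served from cache
```

This is why startup can be fast even for big repos: the agent pays only for the files it actually touches.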
marxism: Run.I've used it in a product for a couple of thousand repos. The big problem is architectural. Each branch is serialized into a single bundle file. There is no structural sharing of git objects. So each and every branch will download the full history from scratch. So changing branches is as expensive as a fresh clone. If you combine this with a real user's desire for images/diagrams of any kind, then boom, massive slowness.There are also two concurrency bugs which the maintainer refuses to acknowledge.
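The structural-sharing complaint above can be made concrete with a toy model (this is not git-remote-s3's actual code; objects are just labels and sizes are counted in objects, not bytes):

```python
# Three branches that share 100 commits of common history.
shared_history = {f"commit{i}" for i in range(100)}
branches = {
    "main":      shared_history | {"tip-main"},
    "feature-a": shared_history | {"tip-a"},
    "feature-b": shared_history | {"tip-b"},
}

# One self-contained bundle per branch: every bundle repeats
# the full shared history, so totals grow with branch count.
bundle_total = sum(len(objs) for objs in branches.values())

# A shared object store (as in a normal git repo): each object
# is stored once, regardless of how many branches reach it.
store_total = len(set().union(*branches.values()))

# bundle_total is 303 objects vs. 103 in the shared store, and the
# gap widens with every branch; switching branches costs a full
# history download in the bundle-per-branch scheme.
```

With large binary assets like images in the history, that per-branch duplication is what turns every branch switch into the cost of a fresh clone.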
aa6my: Is it possible to use it for GitLab artifacts?
mattzcarey: We do a lot of work to optimize this on our side :)
sudb: As cool as this technically is - who is the target market for this? I think people building coding agents and coding agent platforms are for the most part building on non-Cloudflare sandboxes, and can tolerate minutes of latency for setup.I am not sure what people who roll their own in-house solutions for coding agents do, but I suspect that the easy path is still one of the many sandbox providers + GitHub.I would love to find out who would use this & why!
mattzcarey: Let's just say that with Artifacts you could create millions of repos every day, one for each agent/chat/user/session.

It's all Durable Objects :)
mattzcarey: Hey, one of the authors here. I agree with most of the above. The aim for us is to build scalable version control for agents. As described in the blog, the data lives on a combination of relational DB and object storage backends.

We started with exposing a git server interface from Artifacts, but that is by no means the end goal or the best protocol for all situations. We are mega aware of that and now have feature parity to experiment with a more (dare I say) "agent first" VCS.