Discussion
Running out of Disk Space in Production
flanfly: A neat trick I was told is to always have ballast files on your systems. Just a few GiB of zeros that you can delete in cases like this. This won't fix the problem, but will buy you time and free space for stuff like lock files so you can get a working system.
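For anyone who wants to try the ballast idea, here is a minimal sketch in Python. The function names, path, and size are all hypothetical and not from the comment above. It writes real zeros rather than creating a sparse file, so the blocks are actually allocated on most filesystems. On a filesystem with transparent compression, zeros may take almost no space, so random data may be a safer choice there.

```python
import os


def create_ballast(path: str, size_bytes: int, chunk_size: int = 1024 * 1024) -> None:
    """Write size_bytes of zeros to path, one chunk at a time."""
    zeros = bytes(chunk_size)
    remaining = size_bytes
    with open(path, "wb") as f:
        while remaining > 0:
            n = min(chunk_size, remaining)
            f.write(zeros[:n])
            remaining -= n
        # Push the data to disk so the space is really claimed.
        f.flush()
        os.fsync(f.fileno())


def release_ballast(path: str) -> int:
    """Delete the ballast file and return its size in bytes."""
    size = os.path.getsize(path)
    os.remove(path)
    return size
```

Usage would be something like `create_ballast("/var/ballast.bin", 2 * 1024**3)` at provisioning time (the path and the 2 GiB figure are placeholders), then `release_ballast("/var/ballast.bin")` when the disk fills and you need room to work.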
jaapz: Love the simplicity and pragmatism of this solution
omarqureshi: Surely a 50% warning alarm on disk usage covers this without manual intervention?
jcims: If the alarms are reliably configured, confirmed to be working, low noise enough to be actioned, etc etc. And of course there's nothing to say that both of these things can't be done simultaneously.
ninalanyon: This is why I never empty the Rubbish Bin/Trash Can on my Linux laptop until the disk fills.
dspillett: If the alarm works. And if it is actioned, not just snoozed too often or dismissed entirely. Defence in depth is a good idea: proper alarms, and a secondary measure in case they don't have the intended effect.
theshrike79: Depends. A Kubernetes container might have only a few megabytes of disk space, because it shouldn't need it. Except that one time when .NET decides that the incoming POST is over some magic limit and it doesn't do the processing in-memory like before, but instead has to write it to disk, crashing the whole pod. Fun times. Also my Unraid NAS has two drives in "WARNING! 98% USED" alert state. One has 200GB of free space, the other 330GB. Percentages in integers don't work when the starting number is too big :)
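One way around the percentage problem is to alert only when usage is high as a percentage and free space is also low in absolute terms. A rough sketch, assuming thresholds of 90% used and 10 GiB free (both numbers, and the function names, are made up for illustration):

```python
import shutil


def disk_alert(total: int, free: int,
               max_used_pct: float = 90.0,
               min_free_bytes: int = 10 * 1024**3) -> bool:
    """Return True only if usage is high in percent AND free space is low in bytes."""
    if total <= 0:
        raise ValueError("total must be positive")
    used_pct = 100.0 * (total - free) / total
    return used_pct >= max_used_pct and free < min_free_bytes


def check_path(path: str = "/") -> bool:
    """Apply disk_alert to a mounted filesystem."""
    usage = shutil.disk_usage(path)
    return disk_alert(usage.total, usage.free)
```

With these assumed thresholds, a large drive that is 98% used but still has 200 GB free stays quiet, while a small volume that is over 90% used still alerts because its free space is below the absolute floor.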
Chaosvex: Similar to the old game development trick of hiding some memory away and then freeing it up near the end of development when the budget starts getting tight.
throw0101d: > A neat trick I was told is to always have ballast files on your systems.
ZFS has a "reservation" mechanism that's handy:
> The minimum amount of space guaranteed to a dataset, not including its descendants. When the amount of space used is below this value, the dataset is treated as if it were taking up the amount of space specified by refreservation. The refreservation reservation is accounted for in the parent datasets' space used, and counts against the parent datasets' quotas and reservations.
* https://openzfs.github.io/openzfs-docs/man/master/7/zfsprops...
Quotas prevent users/groups/directories (ZFS datasets) from using too much space, but reservations ensure that particular areas always have a minimum amount set aside for them.
entropie: > I rushed to run du -sh on everything I could, as that’s as good as I could manage.
I recently came across gdu [1] and have installed/used it on every machine since then.
[1]: https://github.com/dundee/gdu
Neil44: I also discovered gdu recently. It's really good. It saves me running du -h --max-depth=1 | sort -h a million times trying to find where the space has gone while you're stressing about production being down.
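If neither gdu nor a shell is handy, the same one-level summary can be approximated in Python. This is only a sketch with hypothetical function names. It sums apparent file sizes (`st_size`) rather than allocated blocks, so the numbers can differ from what du reports, and it silently skips anything it cannot read.

```python
import os


def dir_size(path: str) -> int:
    """Sum the apparent sizes of all files under path, without following symlinks."""
    total = 0
    for root, _dirs, files in os.walk(path, onerror=lambda err: None):
        for name in files:
            try:
                total += os.lstat(os.path.join(root, name)).st_size
            except OSError:
                pass  # File vanished or is unreadable; skip it.
    return total


def largest_children(path: str) -> list[tuple[int, str]]:
    """Return (size, name) for each direct child of path, smallest first."""
    results = []
    with os.scandir(path) as entries:
        for entry in entries:
            try:
                if entry.is_dir(follow_symlinks=False):
                    size = dir_size(entry.path)
                else:
                    size = entry.stat(follow_symlinks=False).st_size
            except OSError:
                continue
            results.append((size, entry.name))
    return sorted(results)
```

The ascending order matches the `sort -h` pipeline, so the biggest offenders end up at the bottom of the list.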
fifilura: I did this too, but I also zipped the file. Turns out it had a great packing ratio!
saagarjha: Personally I just keep the file on a ramdisk so you can avoid having to fetch it from slow storage
3form: Neat! I optimized for my own case, and I'm storing my ramdisk on SSD to gain persistence.