Discussion
gwbas1c: I was lead on Syncplicity's desktop client. File synchronization has a myriad of corner cases that are difficult and non-intuitive to think through, and non-programmers often thoroughly underestimate just how difficult these are to anticipate and mitigate. The fact that they found bugs that rely on sensitive timing doesn't surprise me.
ai_slop_hater: Can you share which difficult and non-intuitive corner cases there are? I guess debouncing, etc.
gwbas1c: The way I used to explain it:
Imagine that you are on a plane (and don't have an internet connection). You edit a file. At the same time, I edit that file. What should we do? We can't possibly know every file format out there and implement operational transform for all of them.
Now, imagine that we both edit the same file at the same instant. One of us is going to submit the change first, and the other will submit it second. It's the same use case, and there's no way to avoid this.
---
Renaming folders was a lot weirder, because you get situations like:
I rename a folder, but you save a new change to a file in that renamed folder and your computer doesn't know about the renamed folder.
Or, I rename a folder and you have a file open. That application has an open file handle to that file, so we can't just rename the folder. What do we do? (This is how Excel does it.)
Or, I rename a folder and you have a file open, but that application doesn't have an open file handle to that file. What happens when you try to save the file and it's been moved? (This is how most applications do it.)
---
Application bundles (on Mac) were weird because we didn't support the metadata needed to sync them.
---
The general "Merge" use case, which had to do with the fact that Syncplicity could sync folders anywhere on disk. (As opposed to the way that Dropbox, Google Drive, and OneDrive stick everything into a single folder.) We'd have customers disconnect a folder, and then re-add it to the same location. The problem was that if they were disconnected for a long time, they would "merge" the old version of the folder into the new one:
If you edited a file while disconnected, it hit the same "multiple editors" use case that I mentioned above.
If someone deleted a file but you still had it, we'd recreate it. (We can't read minds, you know!)
If someone renamed a folder, but you still had the old path, we'd re-add it.
I remember overhearing non-programmer product managers trying to talk through these use cases and just getting overwhelmed with the complexity and realizing they were deep, deep over their heads.
---
A lot of these corner cases were smoothed over when we wrote "SyncDrive", which was a virtual disk drive, because all of the IO came through us. (Instead of scanning a folder to understand what the user did.)
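(Editor's sketch, not Syncplicity's code.) The "multiple editors" and "merge" cases above can be illustrated with a per-file three-way comparison. Everything here is an assumption made for illustration: the `Action` names, the `reconcile` function, and the idea of comparing content hashes against the hash recorded at the last successful sync. The point it tries to show is why a re-added folder (no sync history) has to recreate a file that someone else deleted: without a base version, "deleted remotely" and "created locally" look identical.

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    NOTHING = "nothing"
    UPLOAD = "upload"
    DOWNLOAD = "download"
    DELETE_LOCAL = "delete_local"
    DELETE_REMOTE = "delete_remote"
    CONFLICT = "conflict"  # keep both copies and let a human decide


def reconcile(local: Optional[str], remote: Optional[str],
              base: Optional[str], base_known: bool) -> Action:
    """Decide what to do with one path (hypothetical helper).

    local / remote / base are content hashes, or None when the file is
    absent on that side. base is the state at the last successful sync and
    is only trusted when base_known is True.
    """
    if local == remote:
        return Action.NOTHING

    if not base_known:
        # Folder was disconnected and re-added: no history to consult.
        if remote is None:
            return Action.UPLOAD      # cannot tell a delete from a new file
        if local is None:
            return Action.DOWNLOAD
        return Action.CONFLICT        # both exist and differ

    if local == base:
        # Only the remote side changed since the last sync.
        return Action.DOWNLOAD if remote is not None else Action.DELETE_LOCAL
    if remote == base:
        # Only the local side changed since the last sync.
        return Action.UPLOAD if local is not None else Action.DELETE_REMOTE

    # Both sides changed in different ways (the "plane" scenario).
    return Action.CONFLICT
```

With history, `reconcile("a", None, "a", True)` gives `DELETE_LOCAL`; without it, the same file state, `reconcile("a", None, None, False)`, gives `UPLOAD`, which is the "we'd recreate it" behavior described above. Renames and open file handles are deliberately left out of this sketch.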
Geonode: Business idea: file sync software run by a company that promises to fire any employee who suggests adding a "feature."
d--b: oh and the parent folder is on a shared NAS with some caching.
gwbas1c: We had to add logic to block network and USB drives. (They were an ever-present source of customer issues.) The root cause of the problem is that in .NET, there is a bug with File.Exists. If there is a filesystem / network error, instead of getting an exception, the error is swallowed and the call just returns false. I'm not sure if newer versions of .NET fix it or not; I only learned about this when we were implementing a driver / filesystem.
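(Editor's sketch.) The hazard described above, an existence check that turns an I/O error into "not there", is not unique to .NET. Python's `os.path.exists` also returns `False` when the underlying `stat` call fails. A minimal way to keep the two outcomes apart, written in Python rather than C#, is to call `os.stat` directly and treat only "not found" as missing. The `Presence` enum and `probe` function are hypothetical names for this illustration.

```python
import os
from enum import Enum


class Presence(Enum):
    EXISTS = "exists"
    MISSING = "missing"
    UNKNOWN = "unknown"  # the check itself failed; do not assume a delete


def probe(path: str) -> Presence:
    """Report whether path exists without hiding I/O errors."""
    try:
        os.stat(path)
    except (FileNotFoundError, NotADirectoryError):
        # The filesystem answered, and the answer was "no such file".
        return Presence.MISSING
    except OSError:
        # Permission denied, network share dropped, device error, etc.
        return Presence.UNKNOWN
    return Presence.EXISTS
```

A sync client built this way could pause on `UNKNOWN` instead of recording a deletion, which matters most on the network and USB drives mentioned above.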
JackeJR: There was a discussion of a self-built Dropbox on the front page (https://news.ycombinator.com/item?id=47673394). This is just to show that Dropbox is thoroughly tested for all kinds of weird interactions and behaviours across OSes using a very formal testing framework.
steveBK123: This is the kind of thing I think about when I see the mindset of “we’ll just replace all the SaaS with vibe code” pitches. Not everything is a CRUD app website. I was running my own hacky sync thing to the cloud a decade ago. I would never in my boldest dreams have compared it to Dropbox. Even if you know the use cases, the edge cases could be 99% of the work. POCs are 100x easier than working production multi-user applications. Don’t confuse getting to a POC in 2 hours with getting a final product in 4 hours.
steveBK123: I used a paid SaaS sync service 10 years ago (not Dropbox) that had the following failure mode even though it had been around for a few years. You could have it mirror an entire subdirectory, including external drives. If you booted up long enough and that external drive was not mounted, the service registered that as a subdirectory delete (bad). When you then mounted it again, the sync agent saw it as out of sync with the newer server-side delete and proceeded to clear the local external drives. They also implemented versioning so poorly that a deleted directory was not versioned, only the files within it. So you could recover raw files without the directory structure back in a giant bundle of 1000s of files. Horrible. See: https://dynamicsgpland.blogspot.com/2011/11/one-significant-...
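(Editor's sketch.) One way to avoid the "unmounted drive looks like a mass delete" failure described above is to refuse to infer deletions unless the sync root can be positively identified. The approach below assumes a marker file written into the root when the folder is first set up; the marker name, the exception, and the function are all hypothetical, and nothing here reflects how that service or any real product works.

```python
import os
from typing import Iterable, List

MARKER_NAME = ".sync-root-marker"  # hypothetical file created at setup time


class RootUnavailable(Exception):
    """Raised when the sync root cannot be confirmed as mounted."""


def detect_local_deletes(root: str, previously_synced: Iterable[str]) -> List[str]:
    """Return relative paths that were synced before and are now gone.

    If the marker is missing, the drive is probably not mounted (or the
    root was wiped), so no deletions are reported at all.
    """
    marker = os.path.join(root, MARKER_NAME)
    if not os.path.isfile(marker):
        raise RootUnavailable(f"marker not found under {root!r}; skipping delete detection")
    return [rel for rel in previously_synced
            if not os.path.lexists(os.path.join(root, rel))]
```

`os.path.isfile` also returns `False` on I/O errors, but here that failure mode is the safe one: an unreadable root leads to no deletes being propagated rather than all of them.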