I would never use this after being burned badly. Duplicity hits a scalability br...

rsync · on Jan 27, 2022

"I would never use this after being burned badly. Duplicity hits a scalability brick wall on large file volumes which consumes ridiculous amounts of CPU and RAM on the host machine and leads to irrecoverable failure where it can’t backup or restore anything."

I believe you are correct and I believe that in my private correspondence with the duplicity maintainer (we sometimes sponsor duplicity development[1]) he sort of conceded that borg backup[2] is a better solution.

If the cloud platform you point your borg backups to can configure immutable snapshots (that is, they create, rotate, and destroy them) then a good solution would be using borg backup over SSH and configuring some of those snapshots[3].

[1] https://www.rsync.net/resources/notices/2007cb.html

[2] https://www.stavros.io/posts/holy-grail-backups/

[3] https://twitter.com/rsyncnet/status/1453044746213990405

czl · on Jan 27, 2022

Assuming zfs is reasonable for the usecase: incremental zfs snapshots are likely efficient since these are byte level vs file level.

hughrr · on Jan 27, 2022

Depends on the recovery cost but yes I agree they are probably a better solution.

nonbirithm · on Jan 28, 2022

Duplicity also can't be run natively on Windows, so I was stuck having to migrate all of my backups to another program (Duplicati) once I changed operating systems.

alanfranz · on Jan 28, 2022

Do you have an idea of how "large" it is? I used it for a previous server of mine (about 250GB/300GB), I did daily backups and multiple restores over the years, they would always succeed, even though, admittedly, I didn't care about speed.

jcrawfordor · on Jan 28, 2022

I found Duplicity to be untenable for a volume of about 2TB on a machine with relatively low resources (4GB of RAM). It would consistently fail due to OOM before completing the job. This is an unfortunately common issue with backup tools, which it seems get surprisingly little testing on multi-TB tasks. I'd guesstimate maybe half of open source backup tools I've tried either seriously struggle or reliably fail to complete an 8TB job I have on another machine, with similar memory constraints.

So far Rustic has been my best bet. It seems to complete very large jobs with static and relatively low memory consumption. Enumeration takes about 30 minutes and a full backup nearly a month on the 8TB job (limited bandwidth available) but it reliably completes. Rustic also resumes interrupted jobs (e.g. due to reboot) in around 10-15 minutes on the 8TB volume and with it seems minimal rework. I'm sure there are other tools that handle this as well, but I've definitely gotten frustrated with finding them. I wish more backup tools gave you some kind of assurance in the marketing materials that they've been validated on, say, 10TB.

alanfranz · on Jan 28, 2022

> I wish more backup tools gave you some kind of assurance in the marketing materials that they've been validated on, say, 10TB.

I'd wish that too. Also, I'd like to pay for a backup tool, since it's so critical, so that I can get a sort of support, but I have found issues with many of them.

I must say that, with my small population (N=1) for testing, it was hard for me to settle on a backup system. I tried duplicity, then I tried borg, then I used duplicati. I had considered attic and restic as well, but I don't remember why I didn't choose them right now.

I experienced issues with most of them and I reverted to duplicity. With borg, I had a persistent issues where backups would stop and said something like "backup destination is newer than source" (I forgot the exact message, but happened multiple times across versions). Duplicati seemed a very large and complex codebase, and periodically stopped working, mostly because dotnet runtime issues AFAICR.