I know it's not a guarantee of no-corruption, and ZFS without ECC is probably no more dangerous than any other file system without ECC, but if data corruption is a major concern for you, and you're building out a pretty hefty system like this, I can't imagine not using ECC.
Guarding against slow on-disk data corruption from gradual, near-silent RAM failures may be like doing regular 3-2-1 backups -- you either mitigate the problem because you've been stung previously, or you're in that blissful pre-sting phase of your life.
EDIT: I found TFA's link to the original build out - and happily they are in fact running a Xeon with ECC. Surprisingly it's a 16GB box (I thought ZFS was much hungrier on the RAM : disk ratio.) Obviously it hasn't helped for physical disk failures, but the success of the storage array owes a lot to this component.
The system is using ECC and I specifically - unrelated to ZFS - wanted to use ECC memory to reduce risk of data/fs corruption. I've also added 'ecc' to the original blog post to clarify.
Edit: ZFS for home usage doesn't need a ton of RAM as far as I've learned. There is the rule of thumb of 1 GB of RAM per 1 TB of storage, but that was for a specific context. Maybe the ill-fated data deduplication feature, or was it just to sustain performance?
Thanks, and all good - it was my fault for not following the link in this story to your post about the actual build, before starting on my mini-rant.
I'd heard the original ZFS memory estimations were somewhat exuberant, and recommendations had come down a lot since the early days, but I'd imagine given your usage pattern - powered on periodically - a performance hit for whatever operations you're doing during that time wouldn't be problematic.
I used to use mdadm for software RAID, but for several years now my home boxes have all been hardware RAID. LVM2 provides the other features I need, so I haven't really ever explored ZFS as a replacement for both - though everyone I know who uses it loves it.
It's difficult as a home user to find ECC memory, harder to make sure it actually works in your hardware configuration, and near-impossible to find ECC memory that doesn't require lower speeds than what you can get for $50 on Amazon.
I would very much like to put ECC memory in my home server, but I couldn't figure it out this generation. After four hours I decided I had better things to do with my time.
Indeed. I'd started to add an aside to the effect of 'ten years ago it was probably easier to go ECC'. I'll add it here instead.
A decade ago, if you wanted ECC your choice was basically Xeon, and all Xeon motherboards would accept ECC.
I agree that these days it's much more complex, since you are ineluctably going to get sucked into the despair-spiral of trying to work out what combination of Ryzen + motherboard + ECC RAM will give you actual, demonstrable ECC (with correction, not just detection).
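For what it's worth, one rough check on Linux is to see whether the kernel's EDAC subsystem has registered a memory controller and is counting errors. Below is a minimal sketch, assuming a Linux box with an EDAC driver loaded; the function name `read_edac_counts` and its `root` parameter are my own, only the sysfs path and the `ce_count` / `ue_count` files come from the kernel. A controller showing up here suggests error reporting is wired up, but it doesn't by itself prove that correction works on a given board.

```python
from pathlib import Path


def read_edac_counts(root="/sys/devices/system/edac/mc"):
    """Return {controller: {"ce": int|None, "ue": int|None}} per EDAC memory controller.

    "ce" is the corrected-error count, "ue" the uncorrected-error count.
    An empty dict means no EDAC memory controller is registered under `root`.
    """
    counts = {}
    base = Path(root)
    if not base.is_dir():
        return counts
    # Memory controllers appear as mc0, mc1, ... under the EDAC directory.
    for mc in sorted(base.glob("mc[0-9]*")):
        entry = {}
        for key, filename in (("ce", "ce_count"), ("ue", "ue_count")):
            try:
                entry[key] = int((mc / filename).read_text().strip())
            except (OSError, ValueError):
                entry[key] = None  # counter missing or unreadable
        counts[mc.name] = entry
    return counts


if __name__ == "__main__":
    found = read_edac_counts()
    if not found:
        print("No EDAC memory controllers found (no driver loaded, or ECC not active).")
    for name, c in found.items():
        print(f"{name}: corrected={c['ce']} uncorrected={c['ue']}")
```

A corrected count that climbs over time is also a hint that a DIMM is on its way out, which is exactly the near-silent failure discussed upthread.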
Sounds like the answer is to just buy another Xeon then, even if it's a little older and maybe secondhand. I think there's a reason the vast majority of Supermicro motherboards are still Intel-only.
I accidentally unplugged my RAID 5 array and thought I had damaged the RAID card, because I'd get problems hours after boot. It turned out I had glitched a RAM chip, and the array was picking that up as disk corruption.