The Btrfs File-System Repair Tool Is Available
Phoronix: The Btrfs File-System Repair Tool Is Available
After writing about Btrfs LZ4 compression support and that the Btrfs FSCK tool wasn't available, it turns out that there is the new Btrfs repair tool, but it's not widely known and it's not recommended to ever use it -- at least at this stage...
Be very careful
While some of the items merged are very handy and useful, anything that can break a filesystem should be used with extreme care in lab conditions. I've applied the new restriper and the new parser code to my btrfs-progs on a NAS system I've built, but the fsck I will not touch. I figure if I have a filesystem that's become corrupt at this stage in the BTRFS development cycle, they'll probably want to see it before I try to fix it so they can find out what happened and maybe fix a bug.
But as far as the time it's taken BTRFS to get to this point, I'm actually thinking it's right on track. These days filesystems have so many features internally that need to be designed, written and tested that it's a serious undertaking. Add on top of that all of the externally-facing features modern operating systems expect a filesystem to have, and you've got a substantial project on your hands. Now, add in the fact that you're developing a way to store and retrieve data that could be critical and/or expensive and you've raised the bar well beyond where it's ever been before.
When the ext2 filesystem was first written, there weren't as many interfaces, drives were considerably slower, caching and scheduling systems were not nearly as complex or intelligent as they are now, and let's face it, Linux wasn't very popular. Now with things like SSDs becoming ever more commonplace, massive hardware support, bad behaviors being worked around, and the features an enterprise like Oracle will want... I wouldn't want to be starting a new filesystem project.
Anyway, hopefully after Oracle ships a distro with BTRFS as the primary filesystem we will see a large wave of adoption and code maturity (not to say the code isn't mature...but more users = more corner cases). I'd love to have an in-kernel filesystem capable of closing the feature gap on ZFS!
P.S. The community on the BTRFS mailing list is probably the most approachable and friendly community I've seen in a very long time!
I will not touch. I figure if I have a filesystem that's become corrupt at this stage in the BTRFS development cycle
There is one thing I don't understand about this whole "it changes the filesystem, so it can break it" discussion that has led to the repair tool's code not being released.
Why the hell didn't they just wrap it in a command-line utility that first dd's the filesystem to be repaired to some image and does the repair work there? Then it's absolutely safe, because it doesn't touch the original filesystem, and you can release early and often. The only limitation is that the user needs access to some large filesystem to store the image. With really large filesystems that is a problem, but for many systems it's quite reasonable to assume you can get your hands on a large external hard drive.
So that's also what I'd do with a broken Btrfs: first dd it to some image, then use these tools on the image, then loop mount it. I'd be happy if it works, and if it doesn't I've lost nothing but a few hours of work.
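For anyone who wants to see the copy-first idea spelled out, here is a rough sketch in Python that only builds the command lines and prints them; it does not run anything. The device path, image path and mount point are made-up placeholders, and `plan_offline_repair` is a hypothetical helper name. It also assumes a current btrfs-progs, where the repair is invoked as `btrfs check --repair` and accepts a regular image file; the experimental tool discussed in this thread may be invoked differently.

```python
import shlex


def plan_offline_repair(device, image, mountpoint):
    """Return the commands (as argument lists) for repairing a copy of a device.

    Hypothetical helper for illustration: nothing is executed here.
    """
    return [
        # 1. Copy the raw device into an image file; the source is only read.
        ["dd", f"if={device}", f"of={image}", "bs=4M"],
        # 2. Run the repair against the image, never against the original.
        #    Assumption: modern btrfs-progs syntax, operating on an image file.
        ["btrfs", "check", "--repair", image],
        # 3. Loop-mount the repaired image read-only to inspect the result.
        ["mount", "-o", "loop,ro", image, mountpoint],
    ]


# Placeholder paths, not taken from the thread.
plan = plan_offline_repair("/dev/sdX", "/mnt/scratch/btrfs.img", "/mnt/check")
for cmd in plan:
    print(shlex.join(cmd))
```

The point of the design is that only the first step ever names the original device, and it only reads from it. This covers the single-device case only.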
Does anyone know where I can download this functional btrfsck tool? I'm not sure if the git repos listed on btrfs wiki site are current or which repo I should pull from. Thanks.
Their Wiki is out of date since the kernel.org outage.
Originally Posted by tux9656
This would only work in the case where BTRFS is on a single drive. One of the biggest advantages of BTRFS is that you can create a single filesystem on multiple devices. So, you'd have to DD several devices, change the metadata inside the BTRFS filesystem so it knows what devices to look for (now loop files essentially), and then fsck them. Once that's done, you'd have to do the reverse to get the data back.
Originally Posted by Spacenick
Assuming you have a simple RAID10-like setup with four drives of 1TB, you'd have to DD 4 TB of data twice. Even on 6G SAS / SATA that would take an absolute minimum of 3.7 hours at full channel speed. There are no 1TB drives that can actually get near this speed, so it's an impossible goal. But it proves the point.
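The 3.7 hour figure can be reproduced with a quick calculation. This sketch assumes decimal terabytes, a single 6 Gbit/s link carrying about 600 MB/s of payload after 8b/10b encoding, and all 8 TB (4 TB copied out, 4 TB copied back) going through that one link. The 150 MB/s sustained drive rate is an illustrative assumption, not a measurement from this thread.

```python
def copy_hours(total_bytes, bytes_per_second, passes=1):
    """Hours needed to move total_bytes `passes` times at a steady rate."""
    return total_bytes * passes / bytes_per_second / 3600


TB = 10**12                      # decimal terabyte
SATA_6G_PAYLOAD = 600 * 10**6    # 6 Gbit/s line rate with 8b/10b encoding
ASSUMED_DRIVE_RATE = 150 * 10**6 # hypothetical sustained rate of one drive

# Four 1 TB drives, copied to images and then copied back again.
full_channel = copy_hours(4 * TB, SATA_6G_PAYLOAD, passes=2)
drive_limited = copy_hours(4 * TB, ASSUMED_DRIVE_RATE, passes=2)

print(f"at full channel speed: {full_channel:.1f} h")   # 3.7 h
print(f"at the assumed drive rate: {drive_limited:.1f} h")  # 14.8 h
```

So the lower bound quoted above holds only if the link is the bottleneck; at the assumed drive rate the same round trip takes roughly four times as long.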
Finally, the point of an FSCK tool is to be very fast. The assumption is that a server providing a critical service is down, and you have uptime guarantees to meet. So, taking the time to DD data off a server simply isn't acceptable. For a home computer, maybe. But BTRFS really shines in an enterprise environment, so their goals aren't really focused on a desktop/laptop situation.