
Thread: KDE Almost Lost All Of Their Git Repositories

  1. #11
    Join Date
    Dec 2011
    Posts
    145

    Default

    http://jefferai.org/2013/03/24/screw-the-mirrors/ another update (talks about backup strategies in place), should also have been linked from the original article.

    Oh, and reading those blog posts isn't enough, UNDERSTANDING them is important (I didn't).

  2. #12
    Join Date
    Oct 2008
    Posts
    3,101

    Default

    Quote Originally Posted by ryao View Post
    Unfortunately, proper backups would not have helped as much as one would think without a way to know whether or not the data was bad prior to doing them. It is not clear when the repositories became corrupted, although the blog post suggests that fsck was the trigger.
    Proper backups would have been a Plan B, in case all the mirroring that they were doing got corrupted. Yes, the backups might have been corrupted as well, for the last month. That still would have given them a proper backup from a month ago, and then they could have just updated the files from the latest tarballs and just lost the last month's git history.

    That would have been pretty bad - but not as bad as losing the entire git history.

    Anyway, the sysadmin whose blog was linked discussed zfs, and wants to use it.

  3. #13

    Default

    Quote Originally Posted by smitty3268 View Post
    Proper backups would have been a Plan B, in case all the mirroring that they were doing got corrupted. Yes, the backups might have been corrupted as well, for the last month. That still would have given them a proper backup from a month ago, and then they could have just updated the files from the latest tarballs and just lost the last month's git history.

    That would have been pretty bad - but not as bad as losing the entire git history.

    Anyway, the sysadmin whose blog was linked discussed zfs, and wants to use it.
    What if the corruption occurred 2 months ago? Honestly, there is usually no generic way to tell if a traditional backup is sane. If you do have a way to tell that a backup is sane and you need to go back months to fix things, then you really are losing quite a bit. Just like ZFS is not a substitute for backups, backups are not a substitute for early detection. With that said, I question why anyone who has an integrity check that could detect bad backups would not run it at the time of the backup.

    Also, I just saw that Jeff Mitchell plans to deploy ZFS to prevent a recurrence of this. That is cool.
    Last edited by ryao; 03-25-2013 at 03:17 PM.

  4. #14
    Join Date
    Oct 2008
    Posts
    3,101

    Default

    Quote Originally Posted by ryao View Post
    What if the corruption occurred 2 months ago?
    Which is why your backups should go back farther. Use daily backups for a couple of weeks, then weekly beyond that, then monthly, and so on. It doesn't take much backup space to keep yearly backups going all the way back to when the project started. You just have to throw away the intervening backups at sane intervals.
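    A minimal sketch of that kind of thinning schedule, in Python. The function name `backups_to_keep` and the cutoffs (14 days of dailies, weeklies out to 90 days, monthlies out to a year, one per year after that) are assumptions chosen for illustration, not anything KDE or the posters specified.

```python
from datetime import date, timedelta

def backups_to_keep(backup_dates, today, daily_days=14, weekly_days=90, monthly_days=365):
    """Return the subset of backup dates to retain (hypothetical policy).

    Recent backups are kept daily, older ones are thinned to one per ISO
    week, then one per calendar month, then one per calendar year.
    """
    keep = set()
    seen_buckets = set()
    # Oldest first, so the earliest backup in each bucket is the one retained.
    for d in sorted(backup_dates):
        age = (today - d).days
        if age < 0:
            continue  # ignore backups dated in the future
        if age <= daily_days:
            bucket = ("day", d)
        elif age <= weekly_days:
            iso_year, iso_week, _ = d.isocalendar()
            bucket = ("week", iso_year, iso_week)
        elif age <= monthly_days:
            bucket = ("month", d.year, d.month)
        else:
            bucket = ("year", d.year)
        if bucket not in seen_buckets:
            seen_buckets.add(bucket)
            keep.add(d)
    return keep

if __name__ == "__main__":
    today = date(2013, 3, 25)
    # Pretend a backup was taken every day for 800 days.
    dates = [today - timedelta(days=i) for i in range(800)]
    kept = backups_to_keep(dates, today)
    print(f"{len(dates)} backups taken, {len(kept)} retained")
```

    Everything not in the returned set is a candidate for deletion. The point is that the number of retained backups grows slowly with the age of the project, so going back months or years stays cheap.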

    Honestly, there is usually no generic way to tell if a traditional backup is sane. If you do have a way to tell that a backup is sane and you need to go back months to fix things, then you really are losing quite a bit.
    Indeed. But less than if you don't have any backups at all. It is simply a mitigation strategy, to be used if all else has failed.

    Just like ZFS is not a substitute for backups, backups are not a substitute for early detection. With that said, I question why anyone who has an integrity check that could detect bad backups would not run it at the time of the backup.
    Yes. Integrity checks can potentially take too long to run at every backup point (depending on the data and software/hardware available), but you should at least run them occasionally. Once a week on those backups, if you can't afford to do it more often.
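    One generic way to do such a check, sketched in Python under some assumptions: record a SHA-256 manifest when the backup is written, then re-verify it on whatever schedule you can afford. The helper names (`sha256_of`, `build_manifest`, `verify_manifest`) are made up for this example. Note the limit: this only catches a backup that changed after it was written. It cannot tell you the source was already corrupt when you copied it; for a git repository, running `git fsck` on the source before backing up would be the check for that.

```python
import hashlib
from pathlib import Path

def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large tarballs need not fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }

def verify_manifest(root, manifest):
    """Return a list of (relative_path, problem) for files that fail the check."""
    root = Path(root)
    problems = []
    for rel, expected in manifest.items():
        p = root / rel
        if not p.is_file():
            problems.append((rel, "missing"))
        elif sha256_of(p) != expected:
            problems.append((rel, "checksum mismatch"))
    return problems
```

    An empty list from `verify_manifest` means every file recorded in the manifest is still present and byte-identical to what was written at backup time.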

    I guess the issue here is that they thought they were running integrity checks but it turned out that wasn't happening.
    Last edited by smitty3268; 03-25-2013 at 03:31 PM.

  5. #15
    Join Date
    Apr 2010
    Posts
    719

    Default

    Quote Originally Posted by ryao View Post
    Unfortunately, proper backups would not have helped as much as one would think without a way to know whether or not the data was bad prior to doing them. It is not clear when the repositories became corrupted, although the blog post suggests that fsck was the trigger.
    While true, it's still better than the current setup. Even if they'd been getting corruption for a few months before it was noticed, a good backup setup would still let them go back that far, and they can work out how to reconstruct subsequent activity on top of the restored repository. And that's better than the current setup, where they survived only out of pure luck, apparently in the belief that having redundancy from the mirror system is good enough. Wrong!

  6. #16

    Default

    Quote Originally Posted by Delgarde View Post
    While true, it's still better than the current setup. Even if they'd been getting corruption for a few months before it was noticed, a good backup setup would still let them go back that far, and they can work out how to reconstruct subsequent activity on top of the restored repository. And that's better than the current setup, where they survived only out of pure luck, apparently in the belief that having redundancy from the mirror system is good enough. Wrong!
    From what I have read, they actually do regular tarball backups of the repository. Still, losing days of commits is not fun. They were extremely lucky that one server had a glitch that kept it from updating from master during the window in which master was serving corrupted updates. That being said, the entire incident would have been avoided had they put master on ZFS.

  7. #17
    Join Date
    Nov 2012
    Posts
    607

    Default

    Quote Originally Posted by birdie View Post
    Yeah, Linux and Open Source are shaky.

    In fact I had an ext4 corruption on a partition I mount RO daily and remount RW maybe once a week to write a file or two.
    This is a bunch of moronic bullshit. Furthermore, Linux is the most stable and reliable OS (that matters, I don't care about some casio watch "operating systems"). Winblows just blows up. Linux will have btrfs and Linux can use ZFS as well, while you can only dream about them on Windows.
    Last edited by Pawlerson; 03-25-2013 at 06:10 PM.

  8. #18

    Default

    Quote Originally Posted by Pawlerson View Post
    This is a bunch of moronic bullshit. Furthermore, Linux is the most stable and reliable OS (that matters, I don't care about some casio watch "operating systems"). Winblows just blows up.
    Solaris is more reliable. This would not have happened had the master mirror been running a recent installation of Solaris.

  9. #19
    Join Date
    Nov 2012
    Posts
    607

    Default

    Quote Originally Posted by ryao View Post
    Solaris is more reliable. This would not have happened had the master mirror been running a recent installation of Solaris.
    You've got to be kidding me. It's a dead cow. Tell me, why is nearly nobody using it? Btw, what are you doing for Gentoo? Last time you were trolling for bsd and now you're trolling for slowlaris. Get the facts before you write more bullshit next time:

    http://unixetc.co.uk/2012/01/22/zfs-...nlinked-files/
    https://forums.oracle.com/forums/thr...art=0&tstart=0 unreliable slowlaris and zfs (somebody should tell KDE devs to not use it).
    Last edited by Pawlerson; 03-25-2013 at 06:27 PM.

  10. #20
    Join Date
    Jan 2013
    Posts
    975

    Default

    The Punishment, Hodja Nasreddin

    Hodja told his son to go get some water from the well.
    Before the son left, Hodja slapped him and shouted, ''And make sure you don’t break the jug!''

    The boy began crying, and a bystander noticed this and said,
    ''Why did you hit him? He hasn’t done anything wrong.''

    Hodja replied, ''Well, better to hit him now
    than to hit him afterwards if he does end up breaking it. That would be too late.''




    I would ask Hodja Nasreddin to slap birdie and ryao right now.

    Twice.

    One time - for not acting before.
    Second time - for posting "wisdom from the manhole" when it's too late.
    Last edited by brosis; 03-25-2013 at 06:23 PM.
