Thread: SDFS: A File-System With Inline De-Duplication

  1. #1
    Join Date
    Jan 2007
    Posts
    14,317

    SDFS: A File-System With Inline De-Duplication

    Phoronix: SDFS: A File-System With Inline De-Duplication

    ZFS is known for its de-duplication support and there are other file-systems (such as Dragonfly's HAMMER, plus work-in-progress support for Btrfs) that support this data compression feature of eliminating duplicate data. There's also a new project that we have just learned about which is SDFS, a file-system that offers inline de-duplication support...

    http://www.phoronix.com/vr.php?view=OTgwNQ
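
    For anyone wondering what "eliminating duplicate data" looks like in practice, here is a minimal sketch of hash-based block de-duplication. It is a toy in-memory illustration only, not how SDFS, ZFS, or any other filesystem mentioned here is actually implemented. The `DedupStore` class, its method names, and the 4096-byte chunk size are all assumptions made up for the example.

```python
import hashlib

# Hypothetical fixed chunk size chosen for illustration; not SDFS's actual setting.
CHUNK_SIZE = 4096


class DedupStore:
    """Toy in-memory store that keeps each unique chunk only once."""

    def __init__(self):
        self.chunks = {}  # SHA-256 hex digest -> chunk bytes
        self.files = {}   # file name -> ordered list of chunk digests

    def write(self, name, data):
        """Split data into fixed-size chunks and store only unseen chunks."""
        digests = []
        for offset in range(0, len(data), CHUNK_SIZE):
            chunk = data[offset:offset + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            # "Inline" here means the duplicate check happens at write time:
            # a chunk whose digest is already known is not stored again.
            self.chunks.setdefault(digest, chunk)
            digests.append(digest)
        self.files[name] = digests

    def read(self, name):
        """Reassemble a file from its list of chunk digests."""
        return b"".join(self.chunks[d] for d in self.files[name])

    def stored_bytes(self):
        """Total bytes held after de-duplication."""
        return sum(len(c) for c in self.chunks.values())


if __name__ == "__main__":
    store = DedupStore()
    payload = b"a" * CHUNK_SIZE + b"b" * CHUNK_SIZE
    store.write("one", payload)
    store.write("two", payload)  # identical content, so no new chunks are stored
    print(len(payload) * 2, "bytes written,", store.stored_bytes(), "bytes stored")
```

    Writing the same payload twice only costs the space of one copy, because the second write finds every chunk digest already present. A real filesystem also has to persist the chunk index, handle reference counting on delete, and decide how to treat hash collisions, none of which this sketch attempts.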

  2. #2
    Join Date
    Jul 2009
    Posts
    219

    Does this make SDFS the first (apparently) stable FS with de-dup support on Linux? Be interesting to hear more, though it doesn't appear to be new...

  3. #3
    Join Date
    Oct 2007
    Posts
    284

    Originally posted by Cyborg16:
    "Does this make SDFS the first (apparently) stable FS with de-dup support on Linux? Be interesting to hear more, though it doesn't appear to be new..."
    It's written in Java, so I wonder how fast/stable it can possibly be... That is the biggest issue for me.
    I am interested to see new benchmarks aimed at the features of this FS compared to ZFS, Btrfs and the other filesystems that have deduplication support.

  4. #4
    Join Date
    Jul 2008
    Location
    Greece
    Posts
    3,776

    Benchmark the sucker :-)

  5. #5
    Join Date
    Feb 2008
    Location
    Linuxland
    Posts
    4,987

    Hey, squashfs has had dedup ever since it started

  6. #6
    Join Date
    Nov 2007
    Posts
    1,024

    Originally posted by markg85:
    "It's written in Java, so I wonder how fast/stable it can possibly be... That is the biggest issue for me."
    Sigh. Java is quite fast, all things considered. All modern implementations compile to machine code, either AOT using something like GCJ or at runtime via a JIT compiler. A loop over some math code in Java should be just as efficient as the exact same loop in C, assuming everything else is equal. For applications that are heavily I/O bound, the difference in language speed is almost completely negligible; it could be written in BASIC for all anyone should care. The rest of the speed is going to be based on algorithms, which are again completely neutral to the implementation language of choice.

    The reputation Java has for being slow (and you are basing your fears on reputation, not any kind of in-depth recent experience, that is obvious) is based on two things: the poor performance of 15-year-old JVMs such as the one Microsoft still ships, and the god-awful design and implementation of the AWT and Swing UI toolkits (anyone clued in these days uses SWT instead).

  7. #7
    Join Date
    Oct 2007
    Location
    Under the bridge
    Posts
    2,126

    It runs over FUSE so Java won't be a bottleneck. As for stability, when was the last time you saw a filesystem driver crash? The language doesn't really matter, because any problems should be resolved before the thing goes live.

  8. #8
    Join Date
    Aug 2009
    Posts
    157

    I'm liking ZFS on Linux more... Even if it takes adding a PPA or compiling.

  9. #9
    Join Date
    Jul 2009
    Posts
    219

    Originally posted by BlackStar:
    "It runs over FUSE so Java won't be a bottleneck. As for stability, when was the last time you saw a filesystem driver crash? The language doesn't really matter, because any problems should be resolved before the thing goes live."
    Filesystem driver crash? Who's worried about that? It's avoiding corrupt data on disk you should be worried about with a new filesystem. I'm wondering about employing this or ZFS for a backup disk which could be a bit risky, but in theory it's a redundant backup anyway (the really important stuff gets backed up several ways).

    ZFS generally looks good, but I don't think it's a panacea either in terms of data security or performance. Compiling a kernel module isn't too difficult; I hadn't realised they had a fully working driver already. Does anyone know much about the maturity of either driver?
