
Thread: ZFS Still Trying To Compete With EXT4 & Btrfs On Linux

  1. #1
    Join Date
    Jan 2007
    Posts
    14,822

    Default ZFS Still Trying To Compete With EXT4 & Btrfs On Linux

    Phoronix: ZFS Still Trying To Compete With EXT4 & Btrfs On Linux

    With the recent release of ZFS On Linux 0.6.2, which provides an open-source native Linux kernel module implementation of the Sun/Oracle ZFS file-system, performance is faster, Linux kernel compatibility is greater, and there are other improvements. Here's a fresh round of ZFS Linux benchmarks against EXT4 and Btrfs.

    http://www.phoronix.com/vr.php?view=19059

  2. #2
    Join Date
    Jan 2012
    Posts
    59

    Default

    Whenever other ZFS benchmarks have been posted, I've asked whether the pool was created with "ashift=12" for backing devices optimized for 4k writes, or even better "ashift=13" for 8k on modern SSDs (basically everything these days; I know the 510 advertises 512-byte physical sectors for whatever reason, but it should still do significantly better with 4k or 8k). I never get an answer, and it's probably 'no', which makes these tests kind of pointless.
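    To be concrete, this is the kind of pool creation I mean; a minimal sketch, with the pool and device names as placeholders:

    # Align allocations to 4k (2^12) sectors; /dev/sdb stands in for the actual SSD.
    zpool create -o ashift=12 testpool /dev/sdb
    # Or, for SSDs with 8k flash pages:
    zpool create -o ashift=13 testpool /dev/sdb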
    Last edited by pdffs; 08-27-2013 at 04:30 AM.

  3. #3
    Join Date
    Jul 2009
    Posts
    221

    Default

    Meh. It would be much more interesting if you explored some of ZFS's and Btrfs's other features (e.g. snapshots, and how effectively they recover when errors are deliberately introduced).

  4. #4
    Join Date
    Aug 2013
    Posts
    80

    Default

    Personally I've been waiting for a benchmark comparing ZFS and Btrfs using different compression algorithms, primarily LZ4.
    The package I'm currently using says it supports "lzjb | gzip | gzip-[1-9] | zle | lz4" (pulled from "zfs set").
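    If anyone wants to try it locally in the meantime, a minimal sketch, with the dataset name as a placeholder (and assuming the pool's lz4_compress feature is enabled):

    # Compress new writes on this dataset with LZ4.
    zfs set compression=lz4 tank/data
    # Check what actually happened on disk.
    zfs get compression,compressratio tank/data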

  5. #5

    Default

    Quote Originally Posted by pdffs View Post
    Whenever other ZFS benchmarks have been posted, I've asked whether the pool was created with "ashift=12" for backing devices optimized for 4k writes, or even better "ashift=13" for 8k on modern SSDs (basically everything these days; I know the 510 advertises 512-byte physical sectors for whatever reason, but it should still do significantly better with 4k or 8k). I never get an answer, and it's probably 'no', which makes these tests kind of pointless.
    Phoronix refuses to adjust ashift itself. However, ZFSOnLinux added a drive database that will automatically do this for known drives. It is incomplete, but it will grow as people send me information on drives that are missing. Instructions for those who wish to contribute are available on the mailing list. Note that the link to the database is outdated. The current database is visible in the repository.

    https://groups.google.com/a/zfsonlin...g/qCygxkVWam4J
    https://github.com/zfsonlinux/zfs/bl...ol_vdev.c#L108

    In this case, the drive was on the list, which means that the benchmarks were done with ashift=13. This is what enabled ZFSOnLinux to go from underperforming ext4 in the IOMeter file server benchmark to outperforming it significantly. With that said, it is not clear to me how partitioning was done. ZFS would be somewhat handicapped (although not by much) if the disk was partitioned for it rather than letting it do its own partitioning, because the Linux elevator is redundant and is set to noop only when ZFS has full control of the disk. Anyway, I have a few comments on each of the benchmarks:

    1. Good benchmarking is hard and it is easy to do benchmarks that provide irrelevant results. I can usually find issues with the design of Phoronix's benchmarks, but in the case of IOMeter, I have not found anything wrong yet. Incidentally, ZFS does well here. That is likely because of a mix of ARC and ZIL.

    2. The FS-Mark benchmarks tested the creation of 1MB files. This is a purely synthetic benchmark that does not match any workload anyone would run, so it does not matter much to me. If anyone has a real workload that does this, please let me know so I can start caring. Of some interest is how the filesystems scaled from 1 to 4 threads: ZFS had a 3% increase while btrfs and ext4 had 82% and 59% increases respectively. It is probably worth investigating why ZFS' throughput did not increase by nearly as much. There is an Illumos patch to ZFS' internal IO elevator that might help with this; it will likely be merged in 0.6.3.

    3. Phoronix did not appear to use DBench as it was intended to be used. It is supposed to replay a load file that simulates a specific application, but there is no information about that. Since DBench was designed to test network filesystems, it is most useful when data points are taken at several different client counts, but Phoronix only tested 1 client. With that said, I am okay with how ZFS performed versus the other filesystems; the numbers here do not matter much to me.

    4. Compile Bench is a fairly useless benchmark: compilation is not IO-bound, and this benchmark only replays the IO pattern of a build without doing any real compilation. A real build process is unlikely to exceed a few megabytes per second of IO, which basically any filesystem can handle. Despite that, it is interesting that ext4 managed to outperform the interface bandwidth of SATA III. The peak bandwidth of SATA III is approximately 600MB/sec, but ext4 managed 726MB/sec, which suggests that writes are being buffered. It is possible to get the same effect with ZFS by using a dedicated dataset and setting sync=disabled (see the sketch after this list). This is what I do for builds on my computer. However, it does not make much of a difference, because compilation is CPU-bound and not IO-bound.

    5. Postmark has a few interesting irregularities. The first is that it is absent from previous Phoronix benchmarks; I noticed this when I went to look at ZFS' performance relative to ext4 and the others so that I could see how using a proper ashift changed things. Another is that the standard error for both ext4 and btrfs is 0, which suggests that ext4 and btrfs were not writing to disk. This benchmark was intended to measure mail server IO performance, but it does a remarkably poor job of that: it is single-threaded, whereas mail server software intended to scale should be multithreaded, and it does not call fsync(), whereas good mail server software should call fsync() before reporting delivery to ensure data integrity. It writes about 500 small files that total less than 5MB, which the kernel has no reason to flush to disk. In the case of ZFS, the non-zero standard error suggests that data is actually being written out. If a crash occurred during this benchmark, the simulated mail would be lost on ext4 and btrfs while ZFS would have managed to save at least some of it. Doing better here means increased data loss in the event of a crash, which does not interest me very much.
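    For point 4, here is a rough sketch of the dedicated build dataset I described; the pool, dataset, and mountpoint names are placeholders rather than my exact setup:

    # Build-output dataset: sync=disabled lets ZFS ignore fsync() and buffer writes,
    # trading crash safety for speed on data that can be regenerated anyway.
    zfs create -o sync=disabled -o mountpoint=/var/tmp/build tank/build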
    Last edited by ryao; 08-27-2013 at 02:00 PM.

  6. #6
    Join Date
    Jan 2013
    Posts
    1,458

    Default

    Quote Originally Posted by ryao View Post
    Phoronix refuses to adjust ashift itself. However, ZFSOnLinux added a drive database that will automatically do this for known drives. It is incomplete, but it will grow as people send me information on drives that are missing. Instructions for those who wish to contribute are available on the mailing list. Note that the link to the database is outdated. The current database is visible in the repository.
    Did you do all this just to win on phoronix benchmarks?

  7. #7
    Join Date
    Dec 2008
    Posts
    146

    Default Isn't Intel SSDSC2CW12 an Intel 520 SSD, not an Intel 510 SSD?

    Isn't Intel SSDSC2CW12 an Intel 520 SSD, not an Intel 510 SSD?

    ark.intel.com does not find a match when searching for SSDSC2CW12, but googling it does bring up a few mentions of the Intel 520 SSD.

    So which is it? Intel 510 SSD or Intel 520 SSD?

    It is a mistake to use any SandForce-based SSD for Phoronix benchmarks, since many of the Phoronix benchmarks write streams of zeros that are unrealistically easy for the SandForce controller to compress.
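    The effect is easy to demonstrate; a rough sketch, with /mnt/test as a placeholder mount point on the SSD under test:

    # Highly compressible stream: a SandForce controller can collapse this almost entirely.
    dd if=/dev/zero of=/mnt/test/zeros bs=1M count=1024 oflag=direct
    # Incompressible stream: closer to what real data looks like to the controller.
    # (Note that /dev/urandom itself can be the bottleneck; pre-generating the file
    # and copying it avoids that.)
    dd if=/dev/urandom of=/mnt/test/random bs=1M count=1024 oflag=direct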

  8. #8
    Join Date
    Aug 2012
    Posts
    13

    Default More Drives, and Data Reliability

    I would love to see tests done on ZFS with at least 4 drives using RAID-Z or RAID-Z2, since that to me is ZFS' strongest feature, along with some data reliability tests compared to Btrfs and ext4.
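    Something along these lines would do it; a minimal sketch, with the pool and disk names as placeholders:

    # Four-disk RAID-Z2 pool: survives the loss of any two disks.
    zpool create tank raidz2 /dev/sdb /dev/sdc /dev/sdd /dev/sde
    # Basic reliability exercise: scrub the pool and look for checksum errors.
    zpool scrub tank
    zpool status -v tank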

  9. #9
    Join Date
    Jan 2012
    Posts
    59

    Default

    Quote Originally Posted by ryao View Post
    Phoronix refuses to adjust ashift itself. However, ZFSOnLinux added a drive database that will automatically do this for known drives. It is incomplete, but it will grow as people send me information on drives that are missing. Instructions for those who wish to contribute are available on the mailing list. Note that the link to the database is outdated. The current database is visible in the repository.

    https://groups.google.com/a/zfsonlin...g/qCygxkVWam4J
    https://github.com/zfsonlinux/zfs/bl...ol_vdev.c#L108

    In this case, the drive was on the list, which means that the benchmarks were done with ashift=13. This is what enabled ZFSOnLinux to go from underperforming ext4 in the IOMeter file server benchmark to outperforming it significantly. With that said, it is not clear to me how partitioning was done. ZFS would be somewhat handicapped (although not by much) if the disk was partitioned for it rather than letting it do its own partitioning, because the Linux elevator is redundant and is set to noop only when ZFS has full control of the disk. Anyway, I have a few comments on each of the benchmarks:
    Yeah, I saw your other thread about the 0.6.2 release after I posted here. Are you sure the Intel 510 is in your list? It doesn't appear to be, as far as I can tell. As I posted in the other thread:

    Quote Originally Posted by pdffs View Post
    You might want to check the FreeBSD 4k quirks (ADA_Q_4K) list from ata_da.c to boost your list.
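    Either way, the easiest way to settle it is to check what the pool actually got; a minimal sketch, with the pool name as a placeholder:

    # Dump the pool configuration and look at the ashift each vdev was created with:
    # 9 = 512-byte alignment, 12 = 4k, 13 = 8k.
    zdb -C tank | grep ashift
    # If that shows nothing, plain "zdb" dumps the cached configs for all imported pools.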
