Real World Benchmarks Of The EXT4 File-System


phoronix
12-03-2008, 08:20 AM
Phoronix: Real World Benchmarks Of The EXT4 File-System

With the EXT4 file-system being marked as stable in the forthcoming Linux 2.6.28 kernel, and some Linux distributions potentially switching to it as an interim step until the btrfs file-system is ready, we decided it was time to benchmark this journaled file-system for ourselves. We ran a number of disk-centric Linux benchmarks along with several of our real-world tests from the Phoronix Test Suite to gauge how noticeable EXT4's performance will be to desktop users and computer gamers. We have compared these EXT4 results to the EXT3, XFS, and ReiserFS file-systems.

http://www.phoronix.com/vr.php?view=13199

Kazade
12-03-2008, 08:29 AM
I'll be honest, I'm a little confused about using games as a benchmark for a filesystem. Games load resources from the disk before gameplay starts; everything from that point on is stored in either RAM or VRAM while the game is in play (unless of course you run out of memory). Only an insane game developer would read from or write to the disk during gameplay because it would kill the frame rate.

If you were timing the loading times (or game saves), fair enough, but using the frame rate as a benchmark seems pointless.

StefanHamminga
12-03-2008, 08:57 AM
May I suggest a real world test that might show a difference?
Find a way to bench:
apt-get update && apt-get dist-upgrade

It seems ('feels') faster with EXT4, coming from EXT3 (running Intrepid on a HTPC).
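
One way such a test could be timed, as a rough sketch rather than anything the Phoronix Test Suite is known to do: run the command a few times and record the wall-clock time of each run. The helper name `time_command` is made up for illustration, and the stand-in command is a no-op so the sketch is safe to run. Note that repeated runs will mostly hit the page cache, so this measures warm-cache behaviour unless caches are dropped between runs.

```python
import statistics
import subprocess
import sys
import time


def time_command(argv, repeats=3):
    """Run argv `repeats` times and return the wall-clock seconds of each run."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        # check=True raises if the command fails, so a broken run is never timed silently.
        subprocess.run(argv, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return timings


if __name__ == "__main__":
    # Harmless stand-in; swap in e.g. ["apt-get", "update"] (needs root) for a real test.
    runs = time_command([sys.executable, "-c", "pass"])
    print(f"mean {statistics.mean(runs):.3f}s over {len(runs)} runs")
```

Keeping the individual timings, instead of only the mean, is what makes it possible to report the spread between runs later.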

Solitary
12-03-2008, 09:04 AM
Time elapsed from power on to login screen would have been interesting.

b0uncyfr0
12-03-2008, 09:08 AM
Yes, I'm really interested in EXT4, but I need to know if there's an increase in boot speed. That's all I mainly care about.

Also, deleting large directories isn't a strong point of EXT4. That's a pity. EXT3 seems slow as it is; I can't imagine it getting any slower.

Thanks for the review..

ssam
12-03-2008, 09:12 AM
it would be nice to see benchmarks for boot time and program load time. these are far more disk bound than the games and encoding tasks.

also i'd love to see error bars on these benchmarks (sorry i am a physicist). if you just give a result with 4 significant figures, you are implying that the result would be exactly the same if you ran it multiple times. i suspect that some of these results vary by a few percent with each run. so if one beats the other by 0.5% it is statistically insignificant.

all that would be needed would be to run each test 3 times, and put the standard deviation of the 3 runs as the error bar.

then if the difference in height between 2 bars is smaller than the error bars, it can be seen that it is insignificant.

http://www.graphpad.com/articles/errorbars.htm <- some info

(also it would be great if it were clearer which plots have big is good, and which are small is good. maybe an arrow labelled 'faster' pointing up or down)
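
The error-bar check described above can be sketched in a few lines. This is only the rule of thumb from the post (compare the gap between two means with the size of the error bars), not a formal significance test; the function names and all of the numbers are invented for illustration and are not results from the article.

```python
import statistics


def summarize(runs):
    """Return (mean, sample standard deviation) for a list of run results."""
    return statistics.mean(runs), statistics.stdev(runs)


def bars_overlap(runs_a, runs_b):
    """True if the gap between the two means is within the combined error bars."""
    mean_a, sd_a = summarize(runs_a)
    mean_b, sd_b = summarize(runs_b)
    return abs(mean_a - mean_b) <= sd_a + sd_b


# Made-up numbers for three hypothetical filesystems, three runs each.
fs_a = [100.0, 102.0, 98.0]
fs_b = [100.5, 101.5, 99.5]
fs_c = [110.0, 111.0, 109.0]

print(bars_overlap(fs_a, fs_b))  # a 0.5 gap sits inside the error bars
print(bars_overlap(fs_a, fs_c))  # a 10.0 gap does not
```

With only three runs the standard deviation is itself a rough estimate, so a result that barely clears the bars still deserves some suspicion.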

Gentooer
12-03-2008, 09:20 AM
I second the suggestion for error bars. They would make the results much more meaningful. It would have also been nice to see Reiser4 instead of the outdated v3.

pjlbyrne
12-03-2008, 09:29 AM
I would like to have seen timings for a large code build - that is the kind of activity which would really benefit from aggressive file fragmentation prevention (AFFP?).

Michael
12-03-2008, 09:29 AM
it would be nice to see benchmarks for boot time and program load time. these are far more disk bound than the games and encoding tasks.

also i'd love to see error bars on these benchmarks (sorry i am a physicist). if you just give a result with 4 significant figures, you are implying that the result would be exactly the same if you ran it multiple times. i suspect that some of these results vary by a few percent with each run. so if one beats the other by 0.5% it is statistically insignificant.

all that would be needed would be to run each test 3 times, and put the standard deviation of the 3 runs as the error bar.

then if the difference in height between 2 bars is smaller than the error bars, it can be seen that it is insignificant.

http://www.graphpad.com/articles/errorbars.htm <- some info

(also it would be great if it were clearer which plots have big is good, and which are small is good. maybe an arrow labelled 'faster' pointing up or down)

Tests from the Phoronix Test Suite are usually run at least three times and then averaged, all automatically.

Once I upload these results to Phoronix Global, if you are using Phoronix Test Suite 1.6 you can run: phoronix-test-suite analyze-all-runs and it will generate modified candlestick charts (essentially your error bars) on the results.

[Knuckles]
12-03-2008, 09:54 AM
It would be interesting to see tests with many small files. I once used XFS for my root filesystem and performance was horrible to do things like unpack sources and remove directories with many files, my system would just freeze while XFS was thrashing away at the disk.

It's all nice and dandy with multi-gb files, when all those caches and delayed allocations do a great job, but I learned my lesson with XFS, and now use ext3 everywhere.

I also agree that boot time and application load times are important, although I guess both may be hard to measure with the PTS.

gshayban
12-03-2008, 09:59 AM
Has anyone tested CPU usage? Lower CPU usage = less wakeups = better battery life, right?

deanjo
12-03-2008, 10:02 AM
It would be interesting to see tests with many small files. I once used XFS for my root filesystem and performance was horrible to do things like unpack sources and remove directories with many files, my system would just freeze while XFS was thrashing away at the disk.

It's all nice and dandy with multi-gb files, when all those caches and delayed allocations do a great job, but I learned my lesson with XFS, and now use ext3 everywhere.

I also agree that boot time and application load times are important, although I guess both may be hard to measure with the PTS.

XFS is easily tweaked to cure that.

Fixxer_Linux
12-03-2008, 10:27 AM
Hi,

Will we see any upgrade tool to migrate from ext3 to ext4, a kind of "partition magic"?
That way I wouldn't have to reformat my drive just to migrate from ext3 to ext4.

kraftman
12-03-2008, 10:36 AM
Thanks Michael. Those tests seem to be far more objective than the previous ones. The title is correct in my opinion :)

spinkham
12-03-2008, 11:34 AM
If this test is done again, could you include JFS also? JFS is a strong contender for best Linux filesystem at the moment.

mahuyar
12-03-2008, 11:36 AM
In the beginning sentence of the article, it is stated that distributions will be using EXT4 as an interim step.

Does that mean distributions like Ubuntu and Fedora desktop editions will be skipping the use of EXT4 altogether? I thought Btrfs is more suitable for database storage applications and such.

Please enlighten the poor soul.

drag
12-03-2008, 11:46 AM
Does that mean distributions like Ubuntu and Fedora desktop editions will be skipping the use of EXT4 altogether? I thought Btrfs is more suitable for database storage applications and such.

Ext4, even though it's a substantial incremental improvement, is still based on the well proven Ext3 file system. Some of the reason why Ext3 doesn't look good in a lot of benchmarks vs things like XFS and ReiserFS is because Ext3 is designed to be a robust file system for today's cheap and unreliable commodity-based servers and workstations.

Look at it this way: Given what I know about Linux file systems, personal experiences, and second-hand accounts, I wouldn't want to run Reiser or XFS on a server without battery cache on the RAID card AND a UPS, or on a laptop or desktop that is prone to errors like kernel oops and power fluctuations. Actually XFS is useful, but I'd never use Reiserfs for anything important. v4 is untested, and v3 is just flaky and unsupported.

So people will use Ext4 because it's based on proven code.

Btrfs, despite all its advancements and new features, is going to take a long time to gain acceptance from anybody doing anything very serious. You just can't trust a file system for anything critical that hasn't had a year or two of serious usage by a wide audience.

Unfortunately the advancements in disk capacity and the relative lack of progress in increasing disk speeds have obsoleted Ext3. It now takes exponentially longer and longer to recover file systems and run a fsck. And Ext3 doesn't behave well when file systems grow to several TBs. Ext4 will allow enterprises to use Linux as a storage OS until Btrfs is production ready.

Linux, right now, because of its lack of very good logical volume management and the limitations of Ext3, isn't nearly as popular as it should be in the storage arena.

dben
12-03-2008, 11:48 AM
Most of those tests are kind of ridiculous. There's especially no point in testing a video game; any changes in fps would be negligible here. You load the data from disk once, it gets cached into memory, and that's the end of it. Maybe the occasional disk activity here and there with logs and a new data file or two, but it's nothing. Encryption, compression, encoding, etc., are CPU intensive operations. They're rather pointless too.


Real-world benchmarks are great and all, but that doesn't mean you just pick a random program and time it. Find something IO-intensive if you don't want to run a purely IO-based benchmark. Boot times for example.

Also, without reporting the error, your numbers are misleading and dishonest. It's great that you've averaged three runs, but we need to know how consistent the runs were. I have a suspicion that the error in the games benchmarks far outweighs the differences between the filesystems, given how much disk io games do. (Virtually none. Disk accesses are slow, and game developers know to avoid them like the plague.)

Be more careful in your conclusions and, if nothing else, give us reason to trust them.

mahuyar
12-03-2008, 12:05 PM
Ext4, even though it's a substantial incremental improvement, is still based on the well proven Ext3 file system. Some of the reason why Ext3 doesn't look good in a lot of benchmarks vs things like XFS and ReiserFS is because Ext3 is designed to be a robust file system for today's cheap and unreliable commodity-based servers and workstations.

Thanks for the explanation, Drag. I guess it seems Ext4 is more mature than Btrfs for the time being.

I would like to see the benchmarks of Btrfs in the near future. I believe it is at least ready for benchmarking.

petabyte
12-03-2008, 12:12 PM
I was hoping the Phoronix team would take recommendations when benchmarking file systems - I wrote a post about it more than once:
http://www.phoronix.com/forums/showthread.php?t=11575

I don't really understand the Phoronix process; is this purposefully a pseudo-technical publicity stunt?

I was just hoping that you guys would want to take on something more in depth... Just this once. One in depth article is way more useful to your readers than ten of whatever you call these.

Sorry for the harsh words and good luck.

deanjo
12-03-2008, 12:15 PM
Most of those tests are kind of ridiculous. There's especially no point in testing a video game; any changes in fps would be negligible here. You load the data from disk once, it gets cached into memory, and that's the end of it. Maybe the occasional disk activity here and there with logs and a new data file or two, but it's nothing. Encryption, compression, encoding, etc., are CPU intensive operations. They're rather pointless too.


Real-world benchmarks are great and all, but that doesn't mean you just pick a random program and time it. Find something IO-intensive if you don't want to run a purely IO-based benchmark. Boot times for example.

Also, without reporting the error, your numbers are misleading and dishonest. It's great that you've averaged three runs, but we need to know how consistent the runs were. I have a suspicion that the error in the games benchmarks far outweighs the differences between the filesystems, given how much disk io games do. (Virtually none. Disk accesses are slow, and game developers know to avoid them like the plague.)

Be more careful in your conclusions and, if nothing else, give us reason to trust them.

You're missing the point of the game benches. They are there to show that the filesystem really doesn't affect applications such as games. As the article implies, it goes after real-world use and whether there is any impact on those apps. In real-world use, the fs really doesn't affect games. Granted, a level load time would have been a bit more informative as to what could potentially have made a bit of a difference.

Chewi
12-03-2008, 12:18 PM
If this test is done again, could you include JFS also? JFS is a strong contender for best Linux filesystem at the moment.

I am also a JFS fan. It doesn't seem to get so many column inches despite being very good - and I have tried EXT3, XFS, ReiserFS and Reiser4. Unlike XFS, you don't get that thrashing that [Knuckles] mentioned, though I am vaguely aware that XFS can be tweaked. Just never tried it.

zbraniecki
12-03-2008, 12:32 PM
So, I totally agree that testing games was pointless... And you can't defend it saying you wanted to test "real life app experience". The way file system influences experience is by map loading, app startup, etc. NOT by fps.

So, boot time please, gnome/kde launch time, file copying, "du -sh /", text search on files etc.

And, it would definitely get more attractive to test more filesystems. JFS, Reiser4, btrfs would add a lot of flavor to such article. :)

npcomplete
12-03-2008, 01:36 PM
Interesting results. I would however, like to see more variety in the bonnie, IOzone, and IOmeter configurations. These I feel actually do represent "real world" tasks; for tasks that deal with heavy I/O that is, which is why trying to simulate a variety of workloads through those tools is important. I can imagine a real fileserver streaming multiple video files while updating the locate database for example.

But I do think that your philosophy of testing with the defaults--however the software under test is supposed to be configured out-of-the-box--is a good one. It does not preclude tuning but results from tuning should never be shown alone. That said, it would also be interesting to see if there is any tuning that would make a difference for these filesystems in different cases.

kebabbert
12-03-2008, 01:37 PM
ext4 seems cool! Does it protect against silent corruption? Typically 20% of a modern hard drive is devoted to error correcting codes. Once in a while, you will run into a problem that is not correctable, or what is worse, not detectable. You don't even know that there was some error in your files:
http://www.acmqueue.org/modules.php?name=Content&pa=showpage&pid=504

And I've heard about a large ext3 filesystem being fsck'd; it took one week. Does ext4 suffer from the same problem?

VaHyper
12-03-2008, 01:39 PM
EXT4 looks great in the first part of the test that tests large files. It also confirms what i have noticed that EXT3 sucks compared to XFS with big files (4 to 15GB).
Looking forward to using EXT4 on my fileserver that has lots of large mkv files :)

grigi
12-03-2008, 02:39 PM
Regarding JFS:
I'll not use it again, simply due to the amount of data I have lost on JFS. Same for ReiserFS.

Regarding XFS:
If your hardware supports write barriers, XFS doesn't lose or corrupt any data. Just about any hardware you could buy in the last 2 years supports write barriers properly, so XFS should be fine.
The default Linux XFS tuning parameters are "wrong" in 2 ways:
* They make the log section way too small.
* They mount with 2 log buffers instead of the maximum 8.

Why do I mention those 2 things?
With more log buffers XFS handles accessing lots of small files much better, and it effectively removes the "thrashing" some people mention. This is a mount-time parameter.
With a larger log, it can handle deletes and changes much better, since XFS tries to queue up as many things at a time as it can, to minimise disk seeking. The default log is typically ~4MB, but enlarge this to 64MB and you can feel the performance difference. This is a filesystem-create parameter.

Another useful option is telling XFS how the underlying RAID is configured (if you have any); it then scales extremely well, since it can keep all disks roughly equally busy. I was amazed at how well I got an XFS filesystem to perform on a 5-disk RAID 5. Truly recommended if you RAID. This is a filesystem-create parameter.
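
For readers who want to try the three tweaks above, here is a small sketch that only builds the command lines and runs nothing. The options themselves (`-l size=` and `-d su=,sw=` for mkfs.xfs, `logbufs=` for mount) are standard XFS options; the device name, mount point and 64k chunk size are assumptions picked for illustration, and the helper names are made up. For a 5-disk RAID 5, `sw` is taken as 4 because one disk's worth of each stripe holds parity.

```python
def mkfs_xfs_argv(device, log_size="64m", stripe_unit=None, data_disks=None):
    """Build an mkfs.xfs command line with a larger log and optional RAID geometry."""
    argv = ["mkfs.xfs", "-l", f"size={log_size}"]
    if stripe_unit and data_disks:
        # su = RAID chunk size, sw = number of data-bearing disks per stripe.
        argv += ["-d", f"su={stripe_unit},sw={data_disks}"]
    return argv + [device]


def mount_xfs_argv(device, mountpoint, logbufs=8):
    """Build a mount command line that raises the number of in-memory log buffers."""
    return ["mount", "-t", "xfs", "-o", f"logbufs={logbufs}", device, mountpoint]


if __name__ == "__main__":
    # /dev/md0 and /srv/data are hypothetical; check your own array's chunk size.
    print(" ".join(mkfs_xfs_argv("/dev/md0", stripe_unit="64k", data_disks=4)))
    print(" ".join(mount_xfs_argv("/dev/md0", "/srv/data")))
```

Since mkfs.xfs destroys whatever is on the device, printing the command and reading it before running it by hand is the safer habit.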



Regarding power usage:
On my notebook I originally used JFS, because it apparently used the least CPU, but this didn't improve battery life at all. In fact it might have gotten better since I changed to XFS.

Yes, XFS uses significantly more CPU power, but it completes the DISK stuff much faster. So I reckon on most systems the DISK uses more power to seek than the extra CPU cycles used to avoid 1 seek.

zhark
12-03-2008, 02:40 PM
Regarding XFS, I'm using it on partitions storing only big files, since it's excellent in that regard (as long as defragmentation is done now and then). I once tried to use it as root filesystem, but that was a mistake as the performance with small files was really appalling. Since someone mentioned some tweaks to make it perform better on small files I went looking and found out that adding logbufs=8 (needs >=128MB RAM) to mount options should make it perform better.

So it will be interesting to see if it really makes a difference..

Source: http://everything2.com/index.pl?node_id=1479435

energyman
12-03-2008, 02:55 PM
Have you mounted all FS with barriers on or off?
Because ext3 turns them off by default and xfs, reiserfs on by default.
barriers cost 30% performance on ext3. If you don't make the playing field even, the benchmark is not worth the electricity bill

energyman
12-03-2008, 02:59 PM
XFS is easily tweaked to cure that.

xfs defaults are known to suck. XFS is also known for its great performance increase with some simple tweaking.

ext3 sucks. But as long as people benchmark ext3 with barriers turned off, ext3 will look good. It is sickening.

Chewi
12-03-2008, 03:35 PM
Hmm this makes me want to try XFS again but only if my drives support write barriers. I've just been trying to find a way to determine whether the drive supports them but I can't find one anywhere. Is there any way besides actually creating an XFS partition?

StringCheesian
12-03-2008, 04:05 PM
I'd like to see boot time, app start time, package manager, and maybe file indexer/searching service benchmarks.

Configuring each filesystem with settings commonly recommended by the community and including Reiser4 and JFS would be very interesting too.

ferreira
12-03-2008, 04:17 PM
I have XFS filesystem on my / partition on the desktop computer mounted with nobarrier option - I've been told this option is dangerous, but I haven't experienced any filesystem corruption since then, even with few unexpected reboots. I look forward to trying out the little XFS tweaks from this thread ^^

I use ext4dev on my laptop, and it's working normally since 2.6.26 - I had some problems with block allocation on 2.6.25, but no files were corrupted or lost.

npcomplete
12-03-2008, 10:08 PM
ext4 seems cool! Does it protect against silent corruption? Typically 20% of a modern hard drive is devoted to error correcting codes. Once in a while, you will run into a problem that is not correctable, or what is worse, not detectable. You don't even know that there was some error in your files:
http://www.acmqueue.org/modules.php?name=Content&pa=showpage&pid=504

And I've heard about a large ext3 filesystem being fsck'd; it took one week. Does ext4 suffer from the same problem?

No, ext4 does not. The only filesystems that can detect data corruption are ZFS and BtrFS currently.

Jade
12-03-2008, 10:18 PM
LINUX FILESYSTEM BENCHMARKS
(includes Reiser4 and Ext4)

http://linux.50webs.org/
http://m.domaindlx.com/LinuxHelp/
http://linuxhelp.150m.com/

RESULT: With compression, REISER4, absolutely SMASHED the other filesystems.

No other filesystem came close (not even remotely close).

Using REISER4 (gzip), rather than EXT2/3/4, saves you a truly amazing 816 - 213 = 603 MB (a 74% saving in disk space), and this, with little, or no, loss of performance when storing 655 MB of raw data. In fact, substantial performance increases were achieved in the bonnie++ benchmarks.
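
The arithmetic behind the quoted saving can be checked with a short sketch. The two disk-usage figures come from the table in this post; the function name is made up for illustration.

```python
def space_saving(baseline_mb, compressed_mb):
    """Return (megabytes saved, percentage saved) relative to the baseline."""
    saved = baseline_mb - compressed_mb
    return saved, 100.0 * saved / baseline_mb


saved_mb, saved_pct = space_saving(816, 213)
print(f"{saved_mb} MB saved ({saved_pct:.0f}%)")  # prints "603 MB saved (74%)"
```

Note that the 74% is measured against the 816 MB that EXT2/3/4 needed on disk. Measured against the 655 MB of raw data, the same 213 MB works out to a saving of roughly 67%.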

We use the following filesystems:

REISER4 gzip: Reiser4 using transparent gzip compression.
REISER4 lzo: Reiser4 using transparent lzo compression.
REISER4: Standard Reiser4 (with extents).
EXT4 default: Standard ext4.
EXT4 extents: ext4 with extents.
NTFS3g: Szabolcs Szakacsits' NTFS user-space driver.
NTFS: NTFS with Windows XP driver.

Disk Usage in megabytes. Time in seconds. SMALLER is better.


.-------------------------------------------------.
|File         |Disk |Copy |Copy |Tar  |Unzip| Del |
|System       |Usage|655MB|655MB|Gzip |UnTar| 2.5 |
|Type         | (MB)| (1) | (2) |655MB|655MB| Gig |
.-------------------------------------------------.
|REISER4 gzip | 213 | 148 |  68 |  83 |  48 |  70 |
|REISER4 lzo  | 278 | 138 |  56 |  80 |  34 |  84 |
|REISER4 tails| 673 | 148 |  63 |  78 |  33 |  65 |
|REISER4      | 692 | 148 |  55 |  67 |  25 |  56 |
|NTFS3g       | 772 |1333 |1426 | 585 | 767 | 194 |
|NTFS         | 779 | 781 | 173 |  X  |  X  |  X  |
|REISER3      | 793 | 184 |  98 |  85 |  63 |  22 |
|XFS          | 799 | 220 | 173 | 119 |  90 | 106 |
|JFS          | 806 | 228 | 202 |  95 |  97 | 127 |
|EXT4 extents | 806 | 162 |  55 |  69 |  36 |  32 |
|EXT4 default | 816 | 174 |  70 |  74 |  42 |  50 |
|EXT3         | 816 | 182 |  74 |  73 |  43 |  51 |
|EXT2         | 816 | 201 |  82 |  73 |  39 |  67 |
|FAT32        | 988 | 253 | 158 | 118 |  81 |  95 |
.-------------------------------------------------.


WHAT THE NUMBERS MEAN:

The raw data (without filesystem meta-data, block alignment wastage, etc) was 655MB.
It comprised 3 different copies of the Linux kernel sources.

Disk Usage: The amount of disk used to store the data.
Copy 655MB (1): Time taken to copy the data over a partition boundary.
Copy 655MB (2): Time taken to copy the data within a partition.
Tar Gzip 655MB: Time taken to Tar and Gzip the data.
Unzip UnTar 655MB: Time taken to UnGzip and UnTar the data.
Del 2.5 Gig: Time taken to Delete everything just written (about 2.5 Gig).

Each test was performed 5 times and the average value recorded.

To get a feel for the performance increases that can be achieved by using compression, we look at the total time (in seconds) to run the test:

bonnie++ -n128:128k:0 (bonnie++ is Version 1.93c)


.-------------------.
| FILESYSTEM | TIME |
.-------------------.
|REISER4 lzo |  1938|
|REISER4 gzip|  2295|
|REISER4     |  3462|
|EXT4        |  4408|
|EXT2        |  4092|
|JFS         |  4225|
|EXT3        |  4421|
|XFS         |  4625|
|REISER3     |  6178|
|FAT32       | 12342|
|NTFS-3g     |>10414|
.-------------------.


The top two results use Reiser4 with compression. Since bonnie++ writes test files which are almost all zeros, compression speeds things up dramatically. That this is not the case in real world examples can be seen in the first test above where compression often does not speed things up. However, more importantly, it does not slow things down much, either.
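
Why all-zero test files flatter a compressing filesystem can be shown with a quick sketch. It uses Python's zlib as a stand-in and makes no claim about how Reiser4's own gzip or lzo plugins behave internally; the random bytes are an assumption standing in for data that is already compressed, and the helper name is invented.

```python
import os
import zlib


def compressed_fraction(data):
    """Return compressed size divided by original size (smaller = more compressible)."""
    return len(zlib.compress(data)) / len(data)


size = 1_000_000
zeros = bytes(size)        # like a benchmark file that is almost all zeros
noise = os.urandom(size)   # stand-in for already-compressed media files

print(f"zeros: {compressed_fraction(zeros):.4f} of original size")
print(f"noise: {compressed_fraction(noise):.4f} of original size")
```

The zero-filled buffer shrinks to a tiny fraction of its size, so far less data has to reach the disk, while the random buffer does not shrink at all. Real workloads fall somewhere between the two.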

http://linux.50webs.org/
http://linuxhelp.150m.com/resources/fs-benchmarks.htm
http://m.domaindlx.com/LinuxHelp/resources/fs-benchmarks.htm

drag
12-03-2008, 11:42 PM
A couple points:

1. Reiserfs certainly isn't the first FS to support online compression. I remember there being compressed file system options going back to DOS days.

2. Most big files that everybody cares about... things like:

* audio files
* image files
* document files (like OO.org's)
* video files
* game archives


Are already compressed prior to being written to disk. Usually they use specialized compression that is much better for that specific format than generic ones like Gzip or Lzo. Ergo, anything that is especially sensitive to I/O speeds or is large is going to gain very little to no benefit from compression at the file system level.

So most of the benefit you'll gain will be from large text files and executables. Maybe from certain types of database actions.

So you can get some boosts in performance in terms of application start up and whatnot. But the killer for startup performance is mostly seek times, not read throughput limitations, which compression does nothing to address and may actually make worse. It will likely make latency worse.

Having optional online compression support is certainly a good thing (I would love to have it), but it's no home run like those benchmarks misleadingly seem to indicate.

Oh, and point 3:

Reiserfs is bad for your data's health.

--------------------------------------

BTW, Btrfs has transparent compression support as one of its listed features.

energyman
12-03-2008, 11:57 PM
point 3 is so wrong. reiserfs (3.X) had some problems in early 2.4 because of constant breakage introduced by vm changes (the in kernel code will be fixed when it is affected lie). But later 2.4 and 2.6 reiserfs is one of the most stable fs out there. Only bug fixes, no features. Just look on lkml. Weekly XFS bugs, monthly ext3 bugs, once in a while a non-reproducible reiserfs bug.
And reiser4 already works better for me than ext3 ever did.

StringCheesian
12-04-2008, 12:23 AM
LINUX FILESYSTEM BENCHMARKS
(includes Reiser4 and Ext4)

Sorry, let me clarify. I'd like to see filesystem benchmarks that include current versions of Reiser4, ext4, and so on all configured as commonly recommended by their fans.

mctop
12-04-2008, 01:31 AM
Hi,

first of all, thanks for the article and benchmark.

We are planning to buy a new raid system with around 4TB of storage capacity (currently we have 2TB on ext3). On monthly scheduled administration days we reboot the main server for maintenance (new kernel, surely kick all nfs clients ...). So, from time to time, the raid system will check the data (tune2fs could avoid this, but for safety reasons we perform the complete disk check). This takes hours during which you can just wait and wait ....

So, if ext4 would reduce this checking time, I would change immediately.

Any experiences or a possibility to check this???

Thanks in advance

drag
12-04-2008, 01:37 AM
point 3 is so wrong. reiserfs (3.X) had some problems in early 2.4 because of constant breakage introduced by vm changes (the in kernel code will be fixed when it is affected lie). But later 2.4 and 2.6 reiserfs is one of the most stable fs out there. Only bug fixes, no features. Just look on lkml. Weekly XFS bugs, monthly ext3 bugs, once in a while a non-reproducible reiserfs bug.
And reiser4 already works better for me than ext3 ever did.


Reiser and friends refused to support and maintain Reiserfsv3 once they started work on Reiserfsv4. All file systems are very complex and have problems that are going to crop up over time, refusing to support it means that bugs and problems went unfixed, unacknowledged, and forced other developers to pick up their work if it happened at all.

Rfsv4 isn't ever going to get into the kernel at this point. At least not in any substantial way. It has no serious backing from the core Linux developers or any of the corporations or distributions that depend on Linux. Without this testing and support it's never going to mature to the point where I am going to trust it.

If Reiser had been able to relate to and get along with other people then it probably would have had a good chance, but this is just not how it worked out.

And speed, btw, is extremely secondary to data protection and reliability.

Ask yourself why, if companies like Redhat or IBM, that depend on the competitive nature of Linux against operating systems like Solaris and Windows 2008, have continued to put substantial time and effort into maintaining Ext3 and evolving it to Ext4 instead of just adopting the 'much better' Reiserfsv4.

Suse was an early adopter and proponent of ReiserFsv3. They have ReiserFS developers on staff. They show no sign of moving to support v4 in any meaningful way. They too depend heavily on the ability of Linux to compete with Unix, Windows, and especially Redhat. So you would think that if v4 offered a substantial advantage over the more mundane Linux file systems then they would jump at the chance to push their OS forward.

And why Intel, HP, IBM, Oracle, and Redhat are currently spending a great deal of money, developer time, and expertise working and hacking on Btrfs. People who make their living designing and developing file systems... things like Ext2/3, Advfs, Ocfsv2, Reiserfs, Lustre, GFS, and XFS.

Because they have very good reasons for this sort of thing and understanding why is critical to really knowing how to gauge file systems. Now they are not gods, are imperfect, with biases, and lots of decisions are driven by politics as much as technical reasons, but there is method behind all this madness.


-------------------------------


For example:

Ext3 fsck is something that is rather powerful at protecting and repairing file systems. It's robust, proven, and reliable.

Ext3 journalling also has the capability to monitor and detect issues with actual files. Sure, it's not as good as checksums, but it does have the ability to journal file data. Other journalling FSs, like Reiser and XFS, only monitor file system metadata (that is, they monitor the FS and ignore the data stored using the FS) and that is all they are capable of dealing with. (Unfortunately, Ext3 had bugs for a long time that tended to limit the usefulness of this feature.)

If you run Reiserfsck on a file system that contains a loopback file with Reiserfs on it, then it will often try to stitch the contents of your loopback file into the outer file system. Which is the sort of thing you very much do not want.

That's a problem with v3 that was never fixed. It isn't supposed to be a problem in v4, but keep in mind that unlike Ext2->Ext3->Ext4, each new Reiser file system is rewritten from scratch and they are not related to one another in any direct manner.

kjgust
12-04-2008, 02:11 AM
RESULT: With compression, REISER4, absolutely SMASHED the other filesystems.

Oh dear.. Well first off how can I say this.. You just made me CHOKE on my coffee. Haha, you know, the only time I used reiserFS, it was a bad experience, eventually ;). So even if it is faster, it's definitely not as proven or as reliable as something like EXT3. I personally wouldn't be surprised to see ReiserFS3 be removed from the Linux Kernel eventually. Because from my experience at least, and what I've heard from others, it's really not that good.

deanjo
12-04-2008, 02:21 AM
Suse was an early adopter and proponent of ReiserFsv3. They have ReiserFS developers on staff. They show no sign of moving to support v4 in any meaningful way. They too depend heavily on the ability of Linux to compete with Unix, Windows, and especially Redhat. So you would think that if v4 offered a substantial advantage over the more mundane Linux file systems then they would jump at the chance to push their OS forward.


Just to add on:

Suse dropped Reiser as its default filesystem because of several technical problems, as well as problems related to maintenance, especially after Chris Mason left (the remaining people were basically left holding the bag on maintaining it). That left basically Mahoney to look after it, and with its bug-ridden past it just became too big of a headache. It also wasn't so shit hot in performance or reliability. I wouldn't be surprised if it is soon dropped from the supported filesystems altogether in suse.

ReiserFS has no future. It's effectively dead. Time to put it up on the shelf with other innovations like the Superdisk 120 and the 80186.

Jade
12-04-2008, 04:20 AM
Which bit of this didn't you understand?

LINUX FILESYSTEM BENCHMARKS
(includes Reiser4 and Ext4)

http://linux.50webs.org/
http://m.domaindlx.com/LinuxHelp/
http://linuxhelp.150m.com/

Some Amazing Filesystem Benchmarks. Which Filesystem is Best?
http://www.phoronix.com/forums/showthread.php?t=1765

RESULT: With compression, REISER4, absolutely SMASHED the other filesystems.

No other filesystem came close (not even remotely close).

Using REISER4 (gzip), rather than EXT2/3/4, saves you a truly amazing 816 - 213 = 603 MB (a 74% saving in disk space), and this, with little, or no, loss of performance when storing 655 MB of raw data. In fact, substantial performance increases were achieved in the bonnie++ benchmarks.
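The 74% figure follows from the Disk Usage column of the table in this post. A minimal sketch of the arithmetic in shell; the function name pct_saved is made up for illustration, and the two inputs are the EXT2/3/4 and REISER4 gzip disk-usage values:

```bash
#!/usr/bin/env bash
# pct_saved BASELINE_MB COMPRESSED_MB
# Prints how much smaller COMPRESSED_MB is than BASELINE_MB, as a whole percent.
pct_saved() {
    awk -v base="$1" -v used="$2" \
        'BEGIN { printf "%.0f\n", 100 * (base - used) / base }'
}

pct_saved 816 213   # EXT3 (816 MB) vs REISER4 gzip (213 MB)
pct_saved 655 213   # same result measured against the 655 MB of raw data
```

Note that the baseline matters: the saving is relative to what ext2/3/4 needed on disk, not to the size of the raw data.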

We use the following filesystems:

REISER4 gzip: Reiser4 using transparent gzip compression.
REISER4 lzo: Reiser4 using transparent lzo compression.
REISER4: Standard Reiser4 (with extents).
EXT4 default: Standard ext4.
EXT4 extents: ext4 with extents.
NTFS3g: Szabolcs Szakacsits' NTFS user-space driver.
NTFS: NTFS with Windows XP driver.

Disk Usage in megabytes. Time in seconds. SMALLER is better.


.-------------------------------------------------.
|File |Disk |Copy |Copy |Tar |Unzip| Del |
|System |Usage|655MB|655MB|Gzip |UnTar| 2.5 |
|Type | (MB)| (1) | (2) |655MB|655MB| Gig |
.-------------------------------------------------.
|REISER4 gzip | 213 | 148 | 68 | 83 | 48 | 70 |
|REISER4 lzo | 278 | 138 | 56 | 80 | 34 | 84 |
|REISER4 tails| 673 | 148 | 63 | 78 | 33 | 65 |
|REISER4 | 692 | 148 | 55 | 67 | 25 | 56 |
|NTFS3g | 772 |1333 |1426 | 585 | 767 | 194 |
|NTFS | 779 | 781 | 173 | X | X | X |
|REISER3 | 793 | 184 | 98 | 85 | 63 | 22 |
|XFS | 799 | 220 | 173 | 119 | 90 | 106 |
|JFS | 806 | 228 | 202 | 95 | 97 | 127 |
|EXT4 extents | 806 | 162 | 55 | 69 | 36 | 32 |
|EXT4 default | 816 | 174 | 70 | 74 | 42 | 50 |
|EXT3 | 816 | 182 | 74 | 73 | 43 | 51 |
|EXT2 | 816 | 201 | 82 | 73 | 39 | 67 |
|FAT32 | 988 | 253 | 158 | 118 | 81 | 95 |
.-------------------------------------------------.


WHAT THE NUMBERS MEAN:

The raw data (without filesystem meta-data, block alignment wastage, etc) was 655MB.
It comprised 3 different copies of the Linux kernel sources.

Disk Usage: The amount of disk used to store the data.
Copy 655MB (1): Time taken to copy the data over a partition boundary.
Copy 655MB (2): Time taken to copy the data within a partition.
Tar Gzip 655MB: Time taken to Tar and Gzip the data.
Unzip UnTar 655MB: Time taken to UnGzip and UnTar the data.
Del 2.5 Gig: Time taken to Delete everything just written (about 2.5 Gig).

Each test was performed 5 times and the average value recorded.

To get a feel for the performance increases that can be achieved by using compression, we look at the total time (in seconds) to run the test:

bonnie++ -n128:128k:0 (bonnie++ is Version 1.93c)


.-------------------.
| FILESYSTEM | TIME |
.-------------------.
|REISER4 lzo | 1938|
|REISER4 gzip| 2295|
|REISER4 | 3462|
|EXT4 | 4408|
|EXT2 | 4092|
|JFS | 4225|
|EXT3 | 4421|
|XFS | 4625|
|REISER3 | 6178|
|FAT32 | 12342|
|NTFS-3g |>10414|
.-------------------.


The top two results use Reiser4 with compression. Since bonnie++ writes test files which are almost all zeros, compression speeds things up dramatically. That this is not the case in real world examples can be seen in the first test above where compression often does not speed things up. However, more importantly, it does not slow things down much, either.

http://linux.50webs.org/
http://linuxhelp.150m.com/resources/fs-benchmarks.htm
http://m.domaindlx.com/LinuxHelp/resources/fs-benchmarks.htm

nanonyme
12-04-2008, 04:21 AM
but keep in mind that unlike Ext2->Ext3->Ext4 each new Reiser file system is rewritten from scratch and are not related to one another in any direct manner.
Then again, ext4 with extents is not compatible with ext3 or ext2 at all. You can never mount a fully featured ext4 fs as either of them. I actually found it weird that this benchmark lacked fsck tests for ext3 and ext4. The speedup in those is the main thing I'm looking forward to.

My conclusion from the benchmarks would be that ext4 excels with big files, probably due to the new extents, but the performance difference otherwise is not significant. The differences would likely have been smaller with smaller test files. Maybe the other tests didn't deal with gigabytes of data?

"Extents are introduced to replace the traditional block mapping scheme used by ext2/3 filesystems. An extent is a range of contiguous physical blocks, improving large file performance and reducing fragmentation. A single extent in ext4 can map up to 128MiB of contiguous space with a 4KiB block size." (Wikipedia)

And yeah, you need to fully reformat the hard disk to get the full benefits of ext4 afaik (that is, extents for old files too).
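The 128MiB figure in the Wikipedia quote is just the maximum extent length multiplied by the block size. A rough sketch in shell, assuming (this is not stated in the post) that one initialized ext4 extent covers at most 32768 (2^15) blocks; the function name is made up for illustration:

```bash
#!/usr/bin/env bash
# max_extent_mib BLOCK_SIZE_BYTES
# Largest contiguous range a single initialized ext4 extent can map, in MiB.
# Assumption: an initialized extent covers at most 32768 (2^15) blocks.
max_extent_mib() {
    local block_size=$1
    local max_blocks=32768
    echo $(( max_blocks * block_size / 1024 / 1024 ))
}

max_extent_mib 4096   # 4KiB blocks, the case in the quote
max_extent_mib 1024   # 1KiB blocks
```

Under that assumption, a smaller block size shrinks the range one extent can cover, so very large files need more extents.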

kebabbert
12-04-2008, 05:11 AM
Hi,

first of all, thanks for the article and benchmark.

We are planning to buy a new raid system with around 4TB of storage capacity (currently we have 2TB on ext3). On monthly scheduled administration days we reboot the main server for maintenance (new kernel, surely kick all nfs clients ...). So, from time to time, the data on the raid system gets checked (tune2fs could avoid this, but for safety reasons we perform the complete disk check). This takes hours where you can just wait and wait ....

So, if ext4 would reduce this checking time, I would change immediately.

Any experiences or a possibility to check this???

Thanks in advance
Have you tried Solaris and ZFS? ZFS has no fsck. Instead, it has something called "scrub", and your ZFS raid stays online and fully functional while it runs. I've heard of an fsck on a large ext3 taking a week!

Here is a Linux admin comparing ZFS with linux filesystems:
http://lethargy.org/~jesus/archives/77-Choosing-Solaris-10-over-Linux.html

Here is a Linux guy setting up a home file server ZFS:
http://breden.org.uk/2008/03/02/a-home-fileserver-using-zfs/

ZFS + 48 SATA discs + dual Opteron and no hardware raid (just plain SATA controller), writes more than 2 GB/sec:
http://milek.blogspot.com/2006/11/thumper-throughput.html

And SUN is selling a new Storage device, 7000. Read about "The Killer App". You could download and play with that analysis software that uses unique DTrace in a VMware image (which simulates several discs with ZFS raid):
http://blogs.sun.com/mws/entry/introducing_the_sun_storage_7000




Create a ZFS raid:
# zpool create myZFSraid disc0 disc1 disc2 disc3
and that is all. No formatting needed, just bang away immediately. Dead simple administration.

llama
12-04-2008, 07:51 AM
first, the bonnie++ benchmark is nonsense. I downloaded the benchmark suite, and
pts/test-resources/bonnie/install.sh makes a bonnie script that will run
./bonnie_/sbin/bonnie++ -d scratch_dir/ -s $2 > $LOG_FILE 2>&1" > bonnie

-s controls the size of the big file used in sequential write/rewrite/read and lseek tests, and has no impact on the multiple file creation/read/deletion test. The defaults for that are -n 10:0:0:0, IIRC. That means bonnie++ will create 10 * 1024 empty files in the scratch directory. This mostly tests the kernel's in-memory cache structures, since that's not big enough to fill up the memory, so you're not waiting for anything to happen on disk. The deletion does have to happen on disk for anything that made it to disk before being deleted, which can be a bottleneck.
-n 30:50000:200:8 would be a more interesting test, probably. (file sizes between 50kB (not kiB) and 200B, 30*1024 files spread over 8 subdirs)



A few people have pointed out that XFS has stupid defaults, but nobody posted a good recommendation. I've played with XFS extensively and benchmarked a few different kinds of workloads on HW RAID5 and on single disks. And I've been using it on my desktop for several years now. For general purpose use, I would recommend:

mkfs.xfs -l lazy-count=1,size=128m -L yourlabel /dev/yourdisk
mount with -o noatime,logbsize=256k (put that in /etc/fstab)

lazy-count: don't keep the counters in the superblock up to date all the time, since there's enough info elsewhere. fewer writes = good.

-l size=128m: XFS likes to have big logs, and this is the max size.

mount -o logbsize=256k: That's log buffer size = 256kiB (of kernel memory). The default (and max with v1 logs) is 32kiB. This makes a factor of > 2 performance difference on a lot of small-file workloads. I think logbufs=8 has a similar effect (the default is 2 log bufs of size 32k). I haven't tested logbufs=8,logbsize=256k. The XFS devs frequently recommend logbsize=256k to people asking about perf tuning on the mailing list, but they don't mention increasing logbufs too.


If you have an older mkfs.xfs, get the latest xfsprogs, 2.10.1 has better defaults for mkfs (e.g. unless you set RAID stripe params, agcount=4, which is about as much parallelism as a single disk can give you anyway. The old default was much higher agcount, which could slow down when the disk started to get full.)

Or just use your old mkfs.xfs and specify agcount:
mkfs.xfs -l lazy-count=1,size=128m -L label /dev/disk -d agcount=4 -i attr=2


If you want to start tuning, read up on XFS a bit. http://oss.sgi.com/projects/xfs/ (unfortunately, there's no good tuning guide anywhere obvious on the web site). Read the man page for mkfs.


You can't change the number of allocation groups without a fresh mkfs, but you can enable version 2 logs, and lazy-count, without mkfs. xfs_admin -j -c1 will switch to v2 logs with lazy-count enabled. xfs_growfs says growing the log size isn't supported, which is a problem if you have less than the max size of 128MB, since XFS loves large logs. A large log lets XFS keep more metadata ops in flight, instead of being forced to write them out sooner.

if your FS is bigger than 1TB, you should mount with -o inode64, too. Note that contrary to the docs, noikeep is the default. I checked the kernel sources, and that's been the case for a while, I think. Otherwise I would recommend using noikeep to reduce fragmentation.



If you're making a filesystem of only a couple of GB, like a root fs, a 128MB log will take a serious chunk of the available space. You might be better off with JFS. I'm currently benchmarking XFS with tons of different option combinations for use as a root fs... (XFS block size, and log size, lazy-count=0/1, mount -o logbsize=, and block dev readahead and io elevator)

I use LVM for /usr, /home, /var/tmp (includes /var/cache and /usr/local/src), so my root FS currently is a 1.5GB JFS filesystem that is 54% full. It's on a software RAID1.
Since I run Ubuntu, my /var/lib/dpkg/info has 9373 files out of the total 20794 regular files (27687 inodes) on the filesystem, most of them small.

export LESS=iM
find / -xdev -type f -ls | sort -n -k7 | less -S
then look at the % in less's status line. or type 50% to go to 50% of the file position.
<= 1k: 45%
<= 2k: 52%
<= 3k: 58% (mostly /var/lib/dpkg/info)
<= 4k: 59%
<= 6k: 62%
<= 8k: 64%
<= 16k: 71% (a lot of kernel modules...)
<= 32k: 85%
<= 64k: 93%
<= 128k: 96%

> 1M: 0.2% (57 files)

(I started doing this with find without -type f, and there are lots of small directories (that don't need any blocks outside the inode): < 1k: 59%; < 2k: 64%; < 3k: 68%)
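The percentages above were read off by hand from less's status line. A rough sketch of how the same cumulative distribution could be computed directly, assuming GNU find (for -printf) and awk; the function name size_cdf and its argument layout are made up for illustration:

```bash
#!/usr/bin/env bash
# size_cdf DIR LIMIT_BYTES...
# For each LIMIT, prints the percentage of regular files under DIR
# (same filesystem only, like find -xdev) whose size is <= LIMIT bytes.
size_cdf() {
    local dir=$1; shift
    find "$dir" -xdev -type f -printf '%s\n' |
    awk -v limits="$*" '
        BEGIN { n = split(limits, lim, " ") }
        {
            total++
            for (i = 1; i <= n; i++)
                if (($1 + 0) <= (lim[i] + 0)) count[i]++
        }
        END {
            if (total == 0) { print "no files"; exit }
            for (i = 1; i <= n; i++)
                printf "<= %d bytes: %.0f%%\n", lim[i] + 0, 100 * count[i] / total
        }'
}

# Example (run as root to see every file on the root filesystem):
#   size_cdf / 1024 2048 4096 16384 65536
```

This counts by apparent file size, as the find -ls listing does, not by blocks actually allocated.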

Every time dpkg upgrades a package, or I even run dpkg -S, it reads /var/lib/dpkg/info/*.list (and maybe more). (although dlocate usually works as a replacement for dpkg -S). This usually takes several seconds when the cache is cold on my current JFS filesystem that I created ~2 years ago when I installed the system. This is what I notice as slow on my root filesystem currently. JFS is fine with hot caches, e.g. for /lib, /etc, /bin, and so on. But dpkg is always very slow the first time.

Those small files are probably pretty scattered now, and probably not stored in anything like readdir() order or alphabetical order. I'm hoping XFS will do better than JFS at keeping down fragmentation, although it probably won't. It writes files created at the same time all nearby (it actually tries to make contiguous writes out of dirty data). It doesn't look at where old files in the same directory are stored when trying to decide where to put new files, AFAIK. So I'll probably end up with more scattered files. At least with XFS's batched writeout, mkdir info.new; cp -a info/* info.new; mv ... ; rm -r ...; will work to make a defragged copy of the directory and files in it. (to just defrag the directory, mkdir info.new; ln info/* info.new/; That can make readdir order = alphabetical order. Note using *, which expands to a sorted list, instead of using just cp -a, which will operate in readdir order. dpkg doesn't read in readdir order, it goes (mostly?) alphabetically by package name (based on its status file).)

Anyway, I'm considering using a smaller data block size, like -b size=2k or size=1k, (but -n size=8k, I definitely don't want smaller blocks for directories. There are a lot of tiny directories, but they won't waste 8k because there's room in the inode for their data. See directory sizes with e.g. ls -ld. Larger directory block sizes help to reduce directory fragmentation. And most of the directories on my root filesystem that aren't tiny are fairly large. xfs_bmap -v works on directories, too, BTW). XFS is extent-based, so a small block size doesn't make huge block bitmaps even for large files.

I think I was finding that smaller data block sizes were using more CPU than the default 4k (=max=page size) in hot-cache situations. I compared some results I've already generated, and 1k or 2k does seem slightly faster for untarring the whole FS; drop_caches; tar c | wc -c (so stat+read) ; drop_caches; untar again (overwrite); drop_caches; read some more, timing each component of that. My desktop has been in single-user mode for 1.5 days testing this. :) I should post my results somewhere when I'm done... And I need to find a good way to explore the 5 (or higher) dimensional data (time as a function of block size, log size, logbuf size, lazy-count=0/1, and deadline vs. cfq, and blockdev --setra 256, 512, or 1024 if I let my tests run that long...).


BTW, JFS is good, and does use less CPU. That won't reduce CPU wakeups to save power, though. FS code mostly runs when called by processes doing a read(2), or open(2), or whatever. Filesystems do usually start a thread to do async tasks, though. But those threads shouldn't be waking up at all when there's no I/O going on.
I decided to use JFS for my root FS a couple years ago after reading
http://www.sabi.co.uk/blog/anno05-4th.html#051226b. I probably would have used XFS, but I hadn't realized that to work around the grub-install issue you just have to boot grub from a USB stick or whatever, and type root (hd0,0); setup (hd0). I recently set up a bioinformatics cluster using XFS for root and all other filesystems. It works fine, except that getting GRUB installed is a hassle.

Also BTW, there's a lot of good reading on www.sabi.co.uk. e.g. suggestions for setting up software RAID, http://www.sabi.co.uk/blog/0802feb.html#080217, and lots of filesystem stuff:
http://www.sabi.co.uk/blog/0804apr.html#080415
http://www.sabi.co.uk/Notes/linuxFS.html


XFS is wonderful for large files, and has some other neat features. If you download torrents, you usually get fragmented files because they start sparse and are written in the order the blocks come in. xfs can preallocate space without actually writing it, so you end up with a minimally-fragmented file. azureus has an option to use xfs_io's resvsp command. Linux now has an fallocate(2) system call which should work for XFS and ext4. posix_fallocate(3) should use it. I'm not sure if fallocate is actually implemented for xfs yet, but I would hope so since its semantics are the same. And I don't know what glibc version includes an fallocate(2) backend for posix_fallocate(3).
And xfs has nice tools, like xfs_bmap to show you the fragmentation of any file.

bhaskar
12-04-2008, 12:03 PM
http://sourceforge.net/project/showfiles.php?group_id=11026&package_id=298597 is open source, well documented, and creates a workload that simulates a high end transaction processing database engine.

Disclosure: I manage the product / product (GT.M - http://fis-gtm.com and http://sourceforge.net/projects/fis-gtm) that released io_thrash.

Jade
12-04-2008, 12:52 PM
The bonnie++ options used in the benchmarks at:

http://linux.50webs.org/
http://m.domaindlx.com/LinuxHelp/
http://linuxhelp.150m.com/

were bonnie++ -n128:128k:0

The -n128 means that the test wrote, read and deleted 128k (131,072) files. These were first sequentially, then randomly, written/read/deleted to/from the directory.

The :128k:0 means that every file had a random size between 128k (131,072 bytes) and zero. So the average file-size was 64k.
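The two numbers above can be sanity-checked with a few lines of shell. This is only a sketch of the arithmetic, not of bonnie++ itself; the function name is made up, and it assumes file sizes are spread evenly between the minimum and the maximum:

```bash
#!/usr/bin/env bash
# bonnie_n_summary COUNT MAX_BYTES MIN_BYTES
# COUNT is the first field of bonnie++'s -n option, which is given in
# multiples of 1024 files. Sizes are assumed to be spread evenly between
# MIN_BYTES and MAX_BYTES, so the mean is the midpoint.
bonnie_n_summary() {
    local count=$1 max_bytes=$2 min_bytes=$3
    echo "files: $(( count * 1024 ))"
    echo "mean size: $(( (max_bytes + min_bytes) / 2 )) bytes"
}

bonnie_n_summary 128 $(( 128 * 1024 )) 0   # -n128:128k:0
```

At roughly 131,072 files averaging 64k each, the file-creation phase handles on the order of 8 GiB of data.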

psycho_driver
12-04-2008, 01:12 PM
I'll be honest, I'm a little confused about using games as a benchmark for a filesystem. Games load resources from the disk before the game play starts, everything from that point on is stored in either RAM or VRAM while the game is in play (unless of course you run out of memory). Only an insane game developer would read or write from the disk during gameplay because it would kill frame rate.

If you were timing the loading times (or game saves) fair enough, but using the frame rate as a bench mark seems pointless.

Some games certainly do load textures on the fly. Guild Wars is such a game.

I think testing game performance isn't a bad idea, but average FPS isn't a good indicator. A utility that works like fraps should be utilized which will show lowest fps/highest fps. The lowest fps score would be the more interesting statistic in a game known to load textures on the fly, even if running under wine.

psycho_driver
12-04-2008, 01:16 PM
Oh dear.. Well first off how can I say this.. You just made me CHOKE on my coffee. Haha, you know, the only time I used reiserFS, it was a bad experience, eventually ;). So even if it is faster, it's definitely not as proven or as reliable as something like EXT3. I personally wouldn't be surprised to see ReiserFS3 removed from the Linux kernel eventually. Because from my experience at least, and what I've heard from others, it's really not that good.

I knew Jade would make an appearance in this thread. His obsession with ReiserFS isn't healthy.

I've used ReiserFS twice, and both times I had catastrophic filesystem failures within about a year.

Jade
12-04-2008, 01:19 PM
No comment from drag (or anybody else) on this?

Suse was an early adopter and proponent of ReiserFsv3. They have ReiserFS developers on staff.

At least this statement of yours is true.

Suse has supported and distributed Reiser3 ever since January 2000 (in SuSE 6.3).

They show no sign of moving to support v4 in any meaningful way.

This is TOTAL CRAP. Suse supported and distributed Reiser4 for years.

They were almost the only ones supporting it.

They too depend heavily on the ability of Linux to compete with Unix, Windows, and especially Redhat.

This is TOTAL CRAP as well.

Reiser4 was supported by SuSE till they were bought out by the Jews. The Jews already owned Redhat, so there was no competition.

So you would think that if v4 offered a substantial advantage over the more mundane Linux file systems then they would jump at the chance to push their OS forward.

The (German company SuSE) did "jump at the chance," as a non-fairy tale version of history substantiates.

When the Jews bought it out, they worked hard on getting rid of KDE, Reiser3, Reiser4, mp3 support and NTFS support.

Destruction of Linux NTFS support got away from them when Szabolcs Szakacsits released his NTFS driver.

They removed mp3 support from SuSE 10. Thus I stopped using SuSE, so I don't know if it is still sabotaged in this way.

Reiser4 has been successfully shut down by sabotage of the Linux kernel code due to Andrew Morton.

They are still trying to kill Reiser3, but too many people know that for years it was the best filesystem available and it is proving hard for them to get rid of it.

There was a huge user rebellion against the move to Gnome and KDE stayed,... at least for now.

drag
12-04-2008, 05:13 PM
Suse supporting Reiserfs in a meaningful way would mean that they support using v4 as a install option. Which they don't.



Reiser4 was supported by SuSE till they were bought out by the Jews. The Jews already owned Redhat, so there was no competition.


So the Jews hate ReiserFS?

--------------------

My good sir.

You are either serious and happen to be borderline insane, or are a batshit insane troll. Either way you have too much time on your hands and seem to have an almost complete lack of critical thinking skills.

I suggest a double dose of a BS degree in liberal arts at a very high quality private university (the more conservative the better) combined with counseling. From looking at your website you seem to have some serious delusions and possibly schizophrenic tendencies. If you already have a person you're seeing, fire him/her, and if you already have a degree try to get your money back to pay for a better one.

And probably some better religion, if you're into that sort of thing. It's helped lots of people in the past get a better grounding.

I may sound insulting, but it's really for your own benefit.

energyman
12-04-2008, 05:17 PM
reiserfs IS NOT reiser4

god people, don't you know anything?

reiserfs = reiserfs 3.5&3.6
reiser4 = reiser4

no 'reiserfs' for 4, and no 4 in reiserfs. Two completely different file systems.

Get your facts straight, before you look completely silly, ok?
Oh, and jade - all the points you might have are invalidated by your idiotic (yes, I said it) conspiracy theories and Jew hating.

kebabbert
12-04-2008, 05:34 PM
JADE,

You know, it is nice that you share your opinions about the Jews with us. But those are not facts, as you cannot back them up with links. So could you please stay on topic when you discuss your opinions? Back to file systems.

thacrazze
12-05-2008, 07:44 AM
Why is no JFS benched? It's the fastest FS for Linux and it has supported 64bit for many years.

Jade
12-05-2008, 04:13 PM
Why is no JFS benched? It's the fastest FS for Linux
No it is not. Reiser4 is the fastest. See:

http://linux.50webs.org/
http://linuxhelp.150m.com/resources/fs-benchmarks.htm
http://m.domaindlx.com/LinuxHelp/resources/fs-benchmarks.htm

Chewi
12-05-2008, 04:27 PM
Hahaha. What is with this guy? I thought ReiserFS was okay but geez. ReiserFS and Reiser4 are already associated with one weirdo too many.

energyman
12-05-2008, 04:34 PM
Chewi, it is associated with one 'weirdo' because of people like you who can't see behind technical merits.

reavertm
12-05-2008, 11:52 PM
I've been using reiserfs ('3') for at least 5 years and I have never had any data loss or filesystem corruption. It handled every power loss I experienced perfectly. It's rock stable and eats ext3 alive when it comes to performance.
Recovering accidentally *deleted* files is another topic, as it may be problematic, especially when there is a reiserfs file system image stored on the partition being recovered.
The other disadvantages of reiserfs (the only ones, actually) are longer mount times and somewhat higher CPU usage. But please stop spreading bull**** about its unreliability.

On the other hand, I remember quite frequent problems with ext3, which made me give up on ext* for anything except /boot.

deanjo
12-06-2008, 12:29 AM
But please stop spreading bull**** about its unreliability.
.

You're lucky; that was not the case 3 years ago when openSUSE dropped it as the default filesystem. Feel free to look around the mailing lists from back then.

energyman
12-06-2008, 12:33 AM
just look at lkml.
Every month you will see reports for fs corruption and bugs - xfs and ext3.
You hardly ever see a report for reiserfs.
And it wasn't dropped because it was unstable - it was dropped because Namesys wasn't willing to add any new features. They wanted to put reiserfs in maintenance-only mode. ext3 is much 'better', constantly adding features - and bugs.

deanjo
12-06-2008, 03:06 AM
just look at lkml.
Every month you will see reports for fs corruption and bugs - xfs and ext3.
You hardly ever see a report for reiserfs.
And it wasn't dropped because it was unstable - it was dropped because Namesys wasn't willing to add any new features. They wanted to put reiserfs in maintenance-only mode. ext3 is much 'better', constantly adding features - and bugs.

Here are the exact reasons why suse dropped it.

Hi all -

We've been using ReiserFS as our default installation file system for
the last 6-7 years now, and it's served us well in that time.
Unfortunately, there are a number of problems with it, some purely
technical, some more related to maintenance. I'll outline a few of the
larger issues and offer my solution as a conclusion.

ReiserFS has serious scalability problems. David Chinner's talk at OLS
really underscored the problem well for a single, large, high bandwidth
file system. While I realize that XFS-style scalability isn't a real
goal for most users, and isn't a target workload for reiserfs, the
scalability problems are real. ReiserFS uses the BKL for synchronization
everywhere, and since it's a system-global lock, the problem doesn't go
away when you split the file system into smaller ones. Lock contention
alone is one problem, but it's made worse by cache bouncing between
processors on larger systems.

ReiserFS has serious performance problems with extended attributes and
ACLs. (Yes, this one is my own fault, see numerous flamewars on lkml and
reiserfs-list for my opinion on this.) xattrs are backed by normal files
rooted in a hidden directory structure. This is bad for performance and
in rare cases is deadlock prone due to lock inversions between pdflush
and the xattr code. The quota code gets around this, but the fix would
result in huge amounts of wasted space with ReiserFS. With increasing
deployment of SLES as samba servers, and perhaps NFSv4 servers, the use
of extended attributes will only increase.

ReiserFS has a small and shrinking development community. Right now, the
only developers really working with ReiserFS are Chris Mason, Jan Kara
(internally), a rotating member of Hans Reiser's team, and myself. All
of us have projects we're very much more interested in than working with
ReiserFS. While Jan and I will be continuing to support ReiserFS for
SUSE, Hans is increasingly (hard to believe) pushing people to use
reiser4. Chris has moved on to Oracle and has expressed his opinions on
leaving ReiserFS behind.

ReiserFS v3 is a dead end. Hans has been pushing reiser4 for years now
and declared Reiser3 in maintenance mode. Any changes that aren't bug
fixes are met with violent resistance. Reiser4 is not an incremental
update and requires a reformat, which is unreasonable for most people.
Reiser3 lacks a number of features that other file systems either have
or are adding soon, such as extents and growth beyond current limits.
Since it's in maintenance mode, that's unlikely to change. I view
reiser4 as an interesting research file system, but that's about as far
as it goes. I've been unimpressed with its stability so far. I don't
know how advanced the recovery tools are yet, but I suspect that the
complexity of the format and the ability to essentially define the
format on-the-fly with plugins will make a useful fsck extremely difficult.

The solution for replacing an aging file system isn't to switch to a
brand new unproven file system, but rather a proven one with a clear
upgrade path. That file system is ext3.

Ext3's performance in some situations may not be on par with Reiser3,
but it scales better and Andi mentioned the other day that there is
quite a bit of research going into improving the locking and general
performance of ext3 going on right now, and since reiser3 is stagnant, I
don't doubt they'll pass them soon.

Ext3 has a much larger development community out there. Most other
distributions use ext3 as their default file system, so bugs that don't
end up getting reported to us and are fixed by other developers, we get
for free - most Reiser3 fixes originate from Chris or I.

Ext3 has a clear upgrade path. There is quite a bit of interest in the
community in improving ext3, and ext4 is already under development. Like
the upgrade path from ext2 to ext3, the path to ext4 is clearly defined.
Existing file systems can be updated easily, and new files will be able
to take advantage of the new features. Features already written and
queued up include extents, a 64-bit journal, and 64-bit file sizes.

Most of the institutional knowledge of reiserfs is bouncing around in my
head. Jan has been getting his hands dirty a little bit, but beyond
that, finding additional developers with reiserfs experience will be
extremely difficult and I'd call training additional developers a wasted
effort. Since reiserfs is in maintenance mode, the effort needed to
continue to support it in future releases should be shrinking.

To be clear, my long term goal is to use OCFS2 (or another CFS if one
shows a clear adoption advantage) for the root file system. This would
enable single-instance clustering at both the physical and the virtual
distribution level and get us ease of management and flexibility in HA
deployments. Realistically, though, desktop users are likely to continue
to use ext[34] for the foreseeable future. Until we have OCFS2 (and the
rest of the distribution) ready for such a deployment described above on
larger servers, ext3 would be a suitable choice across the board.

-Jeff

--
Jeff Mahoney
SUSE Labs

StringCheesian
12-06-2008, 03:19 AM
just look at lkml.
Every month you will see reports for fs corruption and bugs - xfs and ext3.
You hardly ever see a report for reiserfs.
While that reflects the stability of those filesystems, it's also slanted by their popularity/exposure. In other words, given 2 filesystems of equal quality, one twice as popular should average about twice as many bug reports. Simply counting bug reports would mistake obscurity for stability and broad exposure for bugginess.
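To make the point concrete, here is a tiny worked example in shell. The numbers are made up purely for illustration (nobody in this thread has real user counts), and the function name is hypothetical:

```bash
#!/usr/bin/env bash
# reports_per_10k REPORTS USERS
# Bug reports normalised by the size of the user base.
reports_per_10k() {
    awk -v r="$1" -v u="$2" 'BEGIN { printf "%.0f\n", 10000 * r / u }'
}

# Hypothetical: filesystem B has twice the users of A and twice the reports.
reports_per_10k 20 10000   # filesystem A
reports_per_10k 40 20000   # filesystem B
```

Both lines print the same rate, so the raw report counts (20 vs 40) say nothing about which filesystem is buggier.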

energyman
12-06-2008, 03:47 AM
and how many users need acl/xattrs? I know many linux users personally - and not one needs or uses either of the two. BKL is a red herring too. The BKL is used in so many places in the kernel that the BKL usage of reiserfs is just a drop in the ocean. Same for 'scalability'. Who cares that reiserfs does not deal as well as xfs with 10s of disks and several terabytes of data?

stevenaaus
12-06-2008, 04:15 AM
Sorry, but you're losing credibility. As others have said, these tests are mostly useless: CPU bound or huge files. Do we really have to point out some relevant benches such as kernel compilation, boot-time or rsync archives ?

Jade
12-06-2008, 05:28 AM
I've been using reiserfs ('3') for at least 5 years and I have never had any data loss or filesystem corruption. It handled every power loss I experienced perfectly. It's rock stable and eats ext3 alive when it comes to performance.
I've been using reiser3 for about 7 years and I have never had any data loss or filesystem corruption. It handled every power loss that occurred perfectly. It's rock stable and eats ext3 alive when it comes to performance.

There was one exception when using QEMU (most likely due to various Gentoo saboteurs getting to the code) which you can read about here:

http://linux.50webs.org/gentoo/qemu-gentoo.htm

A quote from that page:

"I would have usually formated /dev/hdb3 with the Reiser3 filesystem, however, this usually extremely reliable filesystem, has been deliberately sabotaged. This sabotage causes massive corruption of files after a short time. By comparison, I have used the Reiser3 filesystem for nearly ten years (actually about 7 or 8 years (there was no ext3 when I changed)) now and have never had any corruption, at all, on any of my many Reiser3 boxes."

yoshi314
12-06-2008, 06:19 AM
jade, i have found a _real_ conspiracy for you :

check this out:

http://repos.archlinux.org/viewvc.cgi/gcc/trunk/PKGBUILD?view=markup

this is a gcc build script for arch linux :

if ! locale -a | grep ^de_DE; then
    echo "You need the de_DE locale to build gcc."
    return 1
fi

it must be a German conspiracy to take over the world! ;-)

that aside, it still beats me what it needs de_DE for.

curaga
12-06-2008, 06:24 AM
From LFS:
"The following instructions will install the minimum set of locales necessary for the optimal coverage of tests:"
It's for the test suites.

greg
12-06-2008, 01:45 PM
ReiserFS, with its balanced trees, is extremely sensitive to CPU and memory errors. Imagine a RAM module goes bad, and you only notice it a few days later. This can be too late already and half the file system is destroyed. It's what happened to me.

Anyway, I don't see what all the fuss is about. ext3/4 are stable, perform well and have an okay feature set for most uses.

energyman
12-06-2008, 02:07 PM
ext3 is not stable, and ext4 is pre alpha code.

stevenaaus
12-06-2008, 03:26 PM
ReiserFS, with its balanced trees, is extremely sensitive to CPU and memory errors. Imagine a RAM module goes bad, and you only notice it a few days later. This can be too late already and half the file system is destroyed. It's what happened to me.

Yes... Though these issues are already known (except by people with marginal hardware who won't acknowledge the fact ;>). Reiserfs (v3) is imho easily the best linux filesystem. I've used it exclusively for over 5 years on dozens of installs and never have had a problem except for an overheating disk.

Having said that, A solution to any problem must address all issues, technical and social. Reiserfs v3 and v4 are unfortunately marginalised by
1) (above quote)
2) Hans' no-nonsense attitude to various people
3) Hans' legal problem
4) reiserfsck tree splicing bug.
5) Reiser4's non-inclusion in kernel due to (2) (3).

As for Ext3/4, I can't comment personally as my few experiences of ext3 were so sluggish i've hardly used it.... But i suspect energyman is right ( ;> ).

Jade
12-06-2008, 04:26 PM
Here is the exact reasons why suse dropped it.
Actually, the real reason why SuSE dropped it is to be found in here somewhere:

Suse was a early adopter and proponent of ReiserFsv3. They have ReiserFS developers on staff.

At least this statement of yours is true.

Suse has supported and distributed Reiser3 ever since January 2000 (in SuSE 6.3).

They show no sign of moving to support v4 in any meaningful way.

This is TOTAL CRAP. Suse supported and distributed Reiser4 for years.

They were almost the only ones supporting it.

They too depend heavily on the ability of Linux to compete with Unix, Windows, and especially Redhat.

This is TOTAL CRAP as well.

Reiser4 was supported by SuSE till they were bought out by the Jews. The Jews already owned Redhat, so there was no competition.

So you would think that if v4 offered a substantial advantage over the more mundane Linux file systems then they would jump at the chance to push their OS forward.

The (German company SuSE) did "jump at the chance," as a non-fairy tale version of history substantiates.

When the Jews bought it out, they worked hard on getting rid of KDE, Reiser3, Reiser4, mp3 support and NTFS support.

Destruction of Linux NTFS support got away from them when Szabolcs Szakacsits released his NTFS driver.

They removed mp3 support from SuSE 10. Thus I stopped using SuSE, so I don't know if it is still sabotaged in this way.

Reiser4 has been successfully shut down by sabotage of the Linux kernel code due to Andrew Morton.

They are still trying to kill Reiser3, but too many people know that for years it was the best filesystem available and it is proving hard for them to get rid of it.

There was a huge user rebellion against the move to Gnome and KDE stayed,... at least for now.

llama
12-06-2008, 06:04 PM
While that reflects the stability of those filesystems, it's also slanted by their popularity/exposure. In other words, given 2 filesystems of equal quality, one twice as popular should average about twice as many bug reports. Simply counting bug reports would mistake obscurity for stability and broad exposure for bugginess.

Esp. when you consider that probably a significant fraction of the data loss reports (for any filesystem) are partly due to flaky hardware (bad RAM, bad power supply, ...) more users = more corruption pretty much independently of the reliability of the filesystem code (on good hardware). Esp. distro default filesystems will get used by less-expert users who are probably more likely to have less-good hardware. (My theory here is that people who are computer-savvy enough to build a machine with quality parts might be more likely to choose non-default filesystems.)

edit: forgot to say: My guess is that people usually switch to a new filesystem when they set up a new machine and do a fresh install, so their experience of one filesystem is on one machine, and of another filesystem on another machine. If one of those machines sometimes had memory errors, or data errors on the way from disk to memory, that will affect their impression. So everyone out there who's had bad luck with a certain FS, think back and see if you remember trying other filesystems on the same hardware. If not, maybe it was marginal hardware causing the problems.

Sensitivity to memory errors should be a factor in filesystem choice if you run with a cheapo power supply and RAM you don't really trust. Or if you overclock. I try to use quality hardware, since having a computer that gets wrong answers completely defeats the purpose of having a dumb but fast machine. Also, I want to use svn/git versions of things and make bug reports without wondering if maybe they crashed because my hardware is flaky. If I just wanted my commercial games to run fast, I'd probably be a lot more interested in overclocking.

Other than disk failures, I've never experienced FS corruption that lost any data on my own systems. I guess I've been lucky. :) I've used ext3 quite a bit in the past, and I would still use ext3 with data=journal for a FS that I _really_ didn't want to lose anything on after a crash. I use ext3 on my laptop because it dual-boots winxp, and there aren't windoze drivers for any other good filesystems. I have a ~10 year old system (P3-450) as a router/server on my cable modem, and I ssh to it to read my mail (with mutt) from anywhere in the world. It's been running Debian with ext3, and hasn't been re-installed since March 2001. It uses ext3 everywhere, and I've never had a problem with FS corruption.

I've used reiserfs a few times, and it's been ok. It's not extent-based, so it's slow to delete large files (like ext3).

I currently use JFS as my root filesystem on my desktop. I've also used it on a few HPC cluster machines I've set up at my former work.

I think reiser4 has some very interesting ideas. Years ago, I read some of the papers on namesys.com about making the filesystem "smarter". I've always been a fan of adding interesting features to GNU/Linux, more than having it be a 100% perfect re-implementation of Unix. If reiser4 had caught on, and file managers had started using its features to search files by tags or something, I might be using a file manager instead of continuing to be happy with typing commands in an xterm-equivalent. (I can type faster than I can click, and most of the interesting programs I know how to use, like sed, grep, find, are all traditional Unix command line stuff anyway. )

I'm lazy, and I haven't compiled my own kernels for a couple years now. I use Ubuntu's kernels, and they don't include reiser4. I could compile my own, of course, but I partly want to do QA testing for Ubuntu.

It's a shame that the reiser4 project isn't getting much development funding, compared to other FSes, but it's hardly the only case of interesting/good stuff dying because Linux's corporate backers have their own plans. See http://sabi.co.uk/blog/anno06-3rd.html#060702. I've been reading Sabi's blog quite a bit, and he frequently points out how GNU/Linux is suffering from Microsoft cultural hegemony, and occasional land-grabs by kernel devs. e.g. GNU/Linux has grown a few different registry-work-alikes, and more and more things use binary databases instead of editable config files.

XFS is my filesystem of choice these days. Esp. with dual-core CPUs being standard, XFS's scalability starts to come into play. ext3 will interleave two files written at the same time on different CPUs, but XFS's delayed allocation doesn't have that problem. I don't have big-iron nearly as big as this, though:
http://oss.sgi.com/projects/xfs/papers/ols2006/ols-2006-presentation.pdf
and someone's blog about it:
http://sabi.co.uk/blog/anno06-4th.html#061022

edit: forgot to mention this:
XFS has survived lots of crashes (e.g. when testing out Mesa's i965 DRI driver...), but I very rarely have power interruptions, thanks to a UPS. So I run with write-caches enabled and no write barriers (-o nobarrier). Actually, I use LVM2, and device-mapper doesn't support barriers, so I don't explicitly say nobarrier. On a read-mostly filesystem, like the root or /usr fs, where corruption is a bigger headache (although catchable with debsums), I would enable write barriers. Note that JFS doesn't support write barriers, but ext3 does. http://sabi.co.uk/Notes/linuxFS.html#fsFeats Anyone know about any others?
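For anyone wanting to see which barrier-related options their mounts actually carry, here is a rough sketch that reads a mounts-format file such as /proc/mounts. The function name `show_barrier_opts` is made up, and one assumption matters: an option left at its default may simply not be listed, so "(not listed)" does not mean barriers are off.

```shell
# Print "mountpoint fstype barrier-option" for each ext3/ext4/xfs/reiserfs/jfs
# entry in a mounts-format file (default: /proc/mounts).
show_barrier_opts() {
    mounts=${1:-/proc/mounts}
    awk '$3 ~ /^(ext3|ext4|xfs|reiserfs|jfs)$/ {
        n = split($4, opts, ",")
        found = "(not listed)"
        # matches e.g. barrier=1, barrier=0, barrier=flush, nobarrier
        for (i = 1; i <= n; i++)
            if (opts[i] ~ /^(no)?barrier/) found = opts[i]
        print $2, $3, found
    }' "$mounts"
}
```

Calling `show_barrier_opts` with no argument inspects the running system; passing a file name lets you check a saved copy.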

BTW, I was hoping someone would reply to my brain-dump about XFS I posted previously, before it got pushed off the last page by reiser4 conspiracy theories (and blaming things on Jewish people? WTF :mad:, there's no call for that. If there are specific Jewish people you think are doing bad things, that's fine. There are lots of Jews who are not nice people, same for most groups. But talking about them as "the Jews" is not sensible, IMHO. Isn't this board moderated, or is that only for first posts from new users? I guess I'll find out, since this'll be my second.) It's probably not a conspiracy, just Linux corporations spending their money on development of what they think will be in their best interest. And unfortunately that's not reiser4.

I spent maybe a couple hours writing http://www.phoronix.com/forums/showpost.php?p=54107&postcount=46. I should probably put it somewhere instead of leaving it buried in this forum, but still, I was hoping someone would read it... :(

ferreira
12-08-2008, 02:48 PM
@llama
I find your information about XFS most insightful, I'm waiting for the first occasion to try out your tweaks :)
I had problems with XFS on earlier kernels (2.6.24, 2.6.25): I experienced a few nasty data corruptions (one almost broke my system -_-). But from 2.6.26 onward everything is well, even with nobarrier... it's rock solid now, and I'd be really happy if I can safely enhance its performance.

@Jade
Please, use any other color than red... it's totally unreadable on my CRT monitor...

Kirurgs
12-08-2008, 04:45 PM
Still, no matter what Jade actually prays for (and others shout at him for this), proper filesystem tests are a really good idea.
So, what I wanted to see there is:
ext3, ext4, jfs, xfs, reiserfs, reiser4.
And please don't skip reisers as they are fast. We need to test what's out there w/o any exception!
Tests might include all Phoronix tests and some SVN checkout deletion (like wine or smth really big), compressing a lot of small files, the same with large files, and so on.
I'm really looking forward to seeing some objective tests!

Jade
12-10-2008, 05:31 PM
So, what I wanted to see there is:
ext3, ext4, jfs, xfs, reiserfs, reiser4.
No chance of ever seeing Reiser4.

energyman
12-10-2008, 11:20 PM
And I wish, Phoronix would test fair.
Leaving everything on default is not fair. ext3 cheats.

Jade
12-12-2008, 07:31 AM
And I wish, Phoronix would test fair.
Leaving everything on default is not fair. ext3 cheats.
Why do you think Hans Reiser was so hated by some?

The stated reasons are total crap. What was the real reason?

Solitary
12-15-2008, 11:16 AM
It would be fun to see how linux filesystems compare against FAT32. Seems like FAT32 is the lowest common denominator of filesystems, as it comes with every kind of flash-memory, mp3-player, usb-hdd.

energyman
12-15-2008, 11:17 AM
fat is very, very slow.

jadeisalooney
12-15-2008, 06:19 PM
Why do you think Hans Reiser was so hated by some?

The stated reasons are total crap. What was the real reason?

Maybe because he killed his wife?

energyman
12-15-2008, 06:24 PM
you are either trying to joke and fail miserably or you are unable to use a calendar. Choose your poison.

thekittster
12-17-2008, 03:07 AM
So, from time to time, the raid system will check (tunefs could avoid this, but for safety reasons we perform the complete disk check) the data. This needs hours where you just can wait and wait ....

So, if ext4 would reduce this checking time, i would immediately change.

lvcheck (http://www.redhat.com/archives/linux-lvm/2008-April/msg00088.html) allows background checks to be performed on ext3 volumes on an LVM system (using snapshots). If the checks reveal that the filesystem is healthy, its last check date is updated; this avoids unscheduled fscks on boot. This can be done say at night during weekends (depending on your backup schedule of course :)). The script also works with XFS and JFS, but I haven't used it on these so I daren't comment on the specifics.

The only downside is that orphaned inodes are not cleared, so scheduled fscks are still useful from time to time.
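The idea can be sketched roughly as below. This is not the lvcheck script itself, just an illustration of the snapshot, check, update, remove sequence for an ext3 volume. The function name, the snapshot name and the 1G snapshot size are assumptions. By default it only prints the commands; setting `RUN=` (empty) makes it execute them, which needs root and free space in the volume group.

```shell
# Sketch of a snapshot-based fsck for an ext3 logical volume.
# RUN defaults to "echo" (dry run); set RUN= to really execute as root.
snapshot_fsck() {
    vg=$1
    lv=$2
    snap="${lv}-fsck-snap"        # hypothetical snapshot name
    run=${RUN-echo}
    $run lvcreate -s -L 1G -n "$snap" "/dev/$vg/$lv" || return 1
    # -f forces a full check, -n keeps it read-only
    if $run e2fsck -f -n "/dev/$vg/$snap"; then
        # clean: reset mount count and last-check time on the real volume
        $run tune2fs -C 0 -T now "/dev/$vg/$lv"
    else
        echo "errors found on $vg/$lv: schedule a real fsck" >&2
    fi
    $run lvremove -f "/dev/$vg/$snap"
}
```

For example, `snapshot_fsck vg0 home` prints the four commands it would run against /dev/vg0/home.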

thacrazze
01-02-2009, 04:30 AM
Why did you forget JFS? It's the fastest file system for Linux.

thacrazze
01-02-2009, 04:33 AM
And load times in games would be more interesting than the fps count.

energyman
01-02-2009, 11:57 AM
it is only 'the fastest' when you forget to turn off barriers for xfs and reiserfs. jfs does not even support this critical feature.

thacrazze
01-02-2009, 04:20 PM
it is only 'the fastest' when you forget to turn off barriers for xfs and reiserfs. jfs does not even support this critical feature.

I can't recall ever needing this feature.

energyman
01-02-2009, 04:28 PM
does your harddisk have an onboard cache? do you care about your data?
then you need barriers.
xfs and reiserfs care about data, so they turn it on by default
extX devs don't care about your data, so they turn it off by default (argued with '30% performance loss')
jfs doesn't have barrier support at all from all I could find.

RealNC
01-02-2009, 04:43 PM
does your harddisk have an onboard cache? do you care about your data?
then you need barriers.
xfs and reiserfs care about data, so they turn it on by default
extX devs don't care about your data, so they turn it off by default

It's on by default here (ext4).

energyman
01-02-2009, 04:51 PM
then they went away from stupid with ext4 - I applaud them for that decision.

thacrazze
01-02-2009, 04:51 PM
Sorry but I don't need the useless "barriers" and I'm happy with the fastest FS ever :)

JFS JFS JFS

energyman
01-02-2009, 05:00 PM
barriers aren't useless - and jfs has a long history of slowness.

deanjo
01-02-2009, 06:50 PM
Sorry but I don't need the useless "barriers" and I'm happy with the fastest FS ever :)

JFS JFS JFS

Fastest? That's like saying a slower video card is faster if you crank the details down, compared to a faster card running full eyecandy. If you disable barriers on XFS it stomps on JFS.

energyman
01-04-2009, 04:38 AM
http://www.jejik.com/articles/2008/04/benchmarking_linux_filesystems_on_software_raid_1/

see for yourself. Not only is jfs missing features, it is dead slow.

alec
01-13-2009, 05:54 AM
If I use ext4 with extents, there is no way to read it from Windows, correct?

I have a media partition in ext3, works out well (considering I have no need to write anything, just occasional read).

mahuyar
01-13-2009, 08:11 AM
If I use ext4 with extents, there is no way to read it from Windows, correct?

I have a media partition in ext3, works out well (considering I have no need to write anything, just occasional read).

I think it's too early for that. Give it some time, they'll come up with something... as usual :)... The changes between ext3 and ext4 are drastic.

alec
01-13-2009, 08:33 AM
By the time "they " come up with anything, windows will not be relevant for me.
In fact, word on street is that ext2 plugin will read ext4 just fine, unless it has extents enabled. But extents is the best thing about ext4, so migration is pointless in this case.

mahuyar
01-13-2009, 08:54 PM
By the time "they " come up with anything, windows will not be relevant for me.
In fact, word on street is that ext2 plugin will read ext4 just fine, unless it has extents enabled. But extents is the best thing about ext4, so migration is pointless in this case.

Oh cool... Which program is it? I think there're 2 right?

rv65
01-14-2009, 09:31 PM
How does one install Fedora 10 with ext4?

tytso
02-23-2009, 10:32 PM
Hi there, I just came across this discussion thread about Phoronix's "Real World Benchmarks of the EXT4 filesystem". In answer to the questions about e2fsck speeds, typical results on a filesystem which is created as a native ext4 filesystem is that it is 6-8 times faster at e2fsck speeds compared to ext3. See my blog posting at: http://thunk.org/tytso/blog/2008/08/08/fast-ext4-fsck-times/

Secondly, it should be noted that ext4 has barriers on by default (for safety's sake) while ext3 has barriers off by default (it's actually Andrew Morton who has resisted enabling barriers by default). So when ext4 beats ext3, that's despite the fact that ext3 has an "unfair" advantage over filesystems such as xfs and ext4 which enable barriers by default. You can mount ext3 with the mount option barrier=1 if you want to do a more apples-to-apples comparison.
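As a small illustration of that apples-to-apples setup, the sketch below prints the mount command a benchmark run might use per filesystem. The helper name `bench_mount_cmd` and the device and mount point in the example are placeholders. The option is spelled barrier=1 in the kernel's ext3 documentation, and the other filesystems are left at their defaults on the assumption, taken from the post, that their barriers are already on.

```shell
# Print a mount command that turns write barriers on explicitly for ext3
# and leaves the other filesystems at their defaults.
bench_mount_cmd() {
    fstype=$1
    dev=$2
    mnt=$3
    case $fstype in
        ext3) opts="barrier=1" ;;   # ext3 defaults to barriers off
        *)    opts="defaults" ;;    # assumed: barriers already on by default
    esac
    echo "mount -t $fstype -o $opts $dev $mnt"
}
```

For example, `bench_mount_cmd ext3 /dev/sdb1 /mnt/test` prints the command rather than running it, so it can be reviewed before being executed as root.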

Finally there are some very good benchmarks available at http://btrfs.boxacle.net, done by a guy who works at IBM doing performance measurements. This site's primary mission is benchmarking in support of btrfs development, but there are also some very good benchmarks that compare ext3, ext4, jfs, xfs, and development versions of btrfs. For example please see:

http://btrfs.boxacle.net/repository/single-disk/Initial-compare/Initial-Compare-Single_disk.html

and

http://btrfs.boxacle.net/repository/raid/Initial-compare/Initial-Compare-RAID0.html