Actually, there's only been one report where using trim caused a disk drive (vendor withheld to protect the guilty, and because I don't know; the distribution which reported this to me had signed an NDA with the vendor) to brick itself. That was probably a case of a firmware bug --- but the problem is that regardless of whether the bug is with the disk drive or not, if anything goes wrong when previously things had been working O.K., they blame the kernel developers.
The bigger problem is that for some SSDs, issuing a large number of TRIM requests actually trashes performance. That's because you have to flush the NCQ queue before you can issue a discard request, thanks to a brain-dead design decision by the good folks in the T10 standards committee. Hence, a discard costs almost as much as a barrier request, and for some SSDs it could actually be more expensive (because they take a long time to process a TRIM request), and so it could cause a localized decrease in performance if you happen to have an operation mix that includes file deletes alongside other read/write operations.
The current thinking is that it's better to batch discards, and every few hours, issue a FITRIM ioctl request which will cause the file system to send discards for blocks which it knows to be free. This should have less impact than issuing a discard after every single file delete, which is what currently happens if you enable the discard mount option in ext4. The FITRIM ioctl is in the latest kernels, and the userspace daemon will be coming soon. (It's posted on LKML, but I doubt any distros have packaged it yet.)
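For anyone curious what the batched approach looks like from userspace, here is a minimal sketch (this is not the daemon posted on LKML) of issuing a single FITRIM request from Python. The ioctl number is computed with the x86/ARM _IOWR encoding, which is an assumption; some architectures encode ioctl numbers differently. The helper names build_range and trim_filesystem are made up for illustration.

```python
import fcntl
import os
import struct

# struct fstrim_range from <linux/fs.h>: three unsigned 64-bit fields
# (start, len, minlen).
FSTRIM_RANGE = struct.Struct("=QQQ")

# FITRIM is _IOWR('X', 121, struct fstrim_range).
# Assumption: the common x86/ARM ioctl encoding (direction bits at shift 30).
FITRIM = (3 << 30) | (FSTRIM_RANGE.size << 16) | (ord("X") << 8) | 121


def build_range(start=0, length=2**64 - 1, minlen=0):
    """Pack a struct fstrim_range; the defaults cover the whole file system."""
    return bytearray(FSTRIM_RANGE.pack(start, length, minlen))


def trim_filesystem(mountpoint, minlen=0):
    """Ask the file system mounted at `mountpoint` to discard its free blocks.

    Returns the number of bytes the kernel reports as trimmed.
    Needs root and a file system and device that support FITRIM.
    """
    buf = build_range(minlen=minlen)
    fd = os.open(mountpoint, os.O_RDONLY | os.O_DIRECTORY)
    try:
        # The buffer is mutable, so the kernel's updated `len` field
        # (bytes actually trimmed) is written back into it.
        fcntl.ioctl(fd, FITRIM, buf, True)
    finally:
        os.close(fd)
    _start, trimmed, _minlen = FSTRIM_RANGE.unpack(buf)
    return trimmed
```

Calling something like trim_filesystem("/") as root from a periodic job every few hours would be the batched alternative to the discard mount option. It raises OSError if the request fails, for example when the file system or device doesn't support it.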
In all likelihood, enabling discard for a file system won't help the benchmark a whole lot, since the performance advantage of using TRIM is a long-term advantage; and if the file system has been fully TRIM'ed at mkfs time, it's unlikely that the benchmark will have done enough writes for SSD performance to degrade during the benchmark run. In fact, if the SSD takes time to process TRIM requests, you might actually get better performance by disabling the TRIM requests, just as you will get better short-term performance if you disable nilfs2's log cleaner. (Long-term it will hurt you badly, but often benchmarks don't test long-term results; that's my concern about benchmarks that don't pre-age the file system before beginning their benchmark run.)
Very true. It's worse because we don't have technical writers at our disposal, so we don't always have time to write detailed memos describing how best to optimize your workload. I wish we did, and that's largely on us. But if people are willing to help out on http://ext4.wiki.kernel.org, please let me know. It needs a lot of love.
I've argued similar points previously on these forums as well as in QEMU/KVM. A blazingly fast SQLite result will usually imply that sync operations are being ignored, which puts data at risk when used for other loads. In the QEMU/KVM issue I chased down, it was true that barriers were being dropped in the QEMU block layer. (That was 3 weeks of finger-pointing between projects that I don't want to relive.)
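One rough way to sanity-check the "too fast to be true" case is to time a run of small write+fsync cycles and see whether the rate is plausible for the hardware. The sketch below is only an illustration: the function names are made up, and the plausible_max ceiling is an arbitrary placeholder that you would have to calibrate for your own storage, not a known constant.

```python
import os
import tempfile
import time


def fsyncs_per_second(directory, count=200, payload=b"x" * 4096):
    """Time `count` small write+fsync cycles on a scratch file in `directory`."""
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        for _ in range(count):
            os.write(fd, payload)
            # fsync should not return until the data has reached stable storage.
            os.fsync(fd)
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
        os.unlink(path)
    return count / elapsed if elapsed > 0 else float("inf")


def looks_suspicious(rate, plausible_max=2000.0):
    """Hypothetical heuristic: flag rates above an assumed, uncalibrated ceiling."""
    return rate > plausible_max
```

A very high rate doesn't prove that syncs are being dropped (a controller with a battery-backed cache can legitimately be fast), but it is a hint that some layer in the stack deserves a closer look before you trust the numbers.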
So until the maintainers of the filesystem want to enable a performance optimization by default, you need to be _really_ careful with it. If they even suggest it might be risky, then caveat emptor.
BTW, one time when it might be OK to disable barriers is if you have a UPS that you absolutely trust, and the system is configured to shut itself down cleanly when the UPS reports that its battery is low. Oh, and it might be a good idea to put a big piece of tape (or an ungrounded wire) over the power switch....
(In case it wasn't obvious, the ungrounded wire was a BOFH-style joke; please don't do it in real life. :-)
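If you want to see whether a box is already running with any of the options discussed in this thread, a small sketch like the following can scan the mount table. It assumes the text is in /proc/mounts format and that disabled barriers show up as either nobarrier or barrier=0 (which spelling appears may depend on the kernel version); the function name and the option set are illustrative only.

```python
# Options from this thread that deserve a second look (assumed spellings).
OPTIONS_TO_REVIEW = {"nobarrier", "barrier=0", "discard"}


def mounts_to_review(mounts_text, fstypes=("ext4",)):
    """Return {mountpoint: [options]} for mounts that use any flagged option."""
    found = {}
    for line in mounts_text.splitlines():
        # /proc/mounts fields: device, mountpoint, fstype, options, dump, pass
        fields = line.split()
        if len(fields) < 4 or fields[2] not in fstypes:
            continue
        hits = sorted(OPTIONS_TO_REVIEW.intersection(fields[3].split(",")))
        if hits:
            found[fields[1]] = hits
    return found
```

Feeding it the contents of /proc/mounts gives you a quick list of file systems to double-check against the caveats above.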



