Some Users Have Been Hitting EXT4 File-System Corruption On Linux 4.19
Adding to the headaches around Linux 4.19 stable is an EXT4 file-system corruption issue that has yet to be sorted out.
Linux 4.19 was already a bit rough due to the hastily back-ported STIBP code that sharply dropped performance, only to be reverted in a later point release. Separately, and still an active issue with Linux 4.19, multiple users in varying configurations are reporting EXT4 file-system corruption problems.
A still-open bug report about EXT4 file-system corruption goes back to the middle of November. Activity has ticked up this week, with veteran Linux kernel developer Guenter Roeck also chiming in to report EXT4 corruption under Linux 4.19 stable that was not occurring under Linux 4.18.
In fact, Guenter is hitting the problem on at least two systems. Some other users can reportedly reproduce the issue reliably on every boot.
There was initially some belief that it could be due to the multi-queue block (BLK MQ) code in Linux 4.19, but that appears to have been ruled out. Unfortunately, EXT4 file-system maintainer Ted Ts'o has been unable to reproduce this corruption issue on his own hardware.
In that bug report, he commented today that he doesn't think the corruption issue was introduced in the EXT4 code between 4.18 and 4.19, so he is asking affected users to test a Linux 4.19 kernel that patches in the EXT4 file-system code from 4.18. It seems Ted's hypothesis right now is that this EXT4 file-system corruption issue is coming from outside of the EXT4 driver code.
I haven't noticed any EXT4 corruption issues on my many systems testing Linux 4.19 (or 4.20 Git), so I don't have much more to add at this point. Just take this as a word of caution if you are thinking of switching over to Linux 4.19 shortly. Hopefully this issue won't be around much longer, given that more eyes are now looking at (and experiencing) the problem.