A serious bug capable of causing data corruption in the Ext4 journalling file system has made it into the Linux 3.4, 3.5 and 3.6 stable kernel series, and a patch is currently being worked on.
Ext4 maintainer Theodore Ts'o identified the source code commit causing the problem. The commit was written for the Linux 3.6.2 kernel but was back-ported to the older 3.4 and 3.5 kernels as well.
According to Ts'o, the bug is triggered by rebooting a system twice in quick succession.
"A full conventional distro install probably wouldn't have triggered a bug... although someone who habitually reboots their laptop instead of using suspend/resume or hibernate, or someone who is trying to bisect the kernel looking for some other bug could easily trip over this," he noted.
One of the bug reporters on LKML described the serious effects of the bug.
"The bug did really quite a lot of damage to my /home fs in only a few minutes of uptime, given how few files I wrote to it," the bug reporter said.
"What it could have done to a more conventional distro install with everything including /home on one filesystem, I shudder to think."
Linux developer site LWN.net writes that Ext4 users should avoid kernel versions 3.4.14, 3.4.15, 3.5.7, 3.6.2, and 3.6.3 as these contain the code which can cause the data corruption.
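For readers who want to compare a machine against that list, the following is a minimal sketch in Python. It assumes the kernel release string begins with a dotted version number such as "3.6.2-generic"; the helper names parse_release and is_affected are made up for this illustration, and a distribution kernel may carry or omit the offending back-port regardless of its version number, so the result is only a first hint.

```python
import platform
import re

# Kernel versions named in the LWN.net warning quoted above.
AFFECTED = {(3, 4, 14), (3, 4, 15), (3, 5, 7), (3, 6, 2), (3, 6, 3)}


def parse_release(release):
    """Return (major, minor, patch) from a release string, or None.

    Assumes the string starts with a dotted version, e.g. "3.6.2-generic".
    A missing patch level is treated as 0.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        return None
    major, minor, patch = match.groups()
    return (int(major), int(minor), int(patch or 0))


def is_affected(release):
    """True if the release string names one of the listed versions."""
    return parse_release(release) in AFFECTED


if __name__ == "__main__":
    running = platform.release()
    status = "is on the list" if is_affected(running) else "is not on the list"
    print(running, status)
```

The check compares exact version tuples rather than ranges, because the warning names individual point releases and not everything in between.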
Working through the source tree commits, Ts'o notes that the bug could be 10 years old, going back to kernel 2.4.14 and the Ext3 file system.
Update 28/10: Eric Sandeen of Red Hat told LWN that he has sorted out the problem.
"It appears that the corruption problem everyone was worried about was confined to users who had the non-default journal_checksum option turned on, thus resulting in an unplayable log."
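Whether a given system uses that option can be checked roughly as follows. This Python sketch assumes the kernel reports journal_checksum among the mount options listed in /proc/mounts, which may not hold for every kernel version; the function name ext4_mounts_with_journal_checksum is invented for the example.

```python
def ext4_mounts_with_journal_checksum(mounts_text):
    """Return mount points of ext4 file systems mounted with journal_checksum.

    mounts_text is expected in /proc/mounts format:
    device mountpoint fstype options dump pass
    """
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mountpoint, fstype, options = fields[1], fields[2], fields[3]
        if fstype == "ext4" and "journal_checksum" in options.split(","):
            found.append(mountpoint)
    return found


if __name__ == "__main__":
    try:
        with open("/proc/mounts") as handle:
            text = handle.read()
    except OSError:
        text = ""  # not on Linux, or /proc is unavailable
    for mountpoint in ext4_mounts_with_journal_checksum(text):
        print("journal_checksum enabled on", mountpoint)
```

An empty result means only that no mounted Ext4 file system shows the option in this listing; it is not a guarantee that the file system is unaffected.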