I have been using Nilfs2 ever since I installed Fedora 12, right after it came out. The graphs from Nilfs2: A Filesystem to Make SSDs Scream convinced me I had to have it, because my ThinkPad X301 actually does have an SSD. Getting a Fedora 12 install with a root Nilfs2 filesystem was a bit of a challenge, though. If I recall correctly, I did it by installing on ext3 and then rsyncing everything over, which is a hassle to say the least.
However, I was never sure if I had set up the alignment properly, and this article by Ted Ts'o made me even less sure. Additionally, it made sense to start backing up my system, now that I was getting an error message on every boot:
mount.nilfs2: WARNING! - The NILFS on-disk format may change at any time.
mount.nilfs2: WARNING! - Do not place critical data on a NILFS filesystem.
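The alignment worry, at least, is something you can sanity-check. Here is a minimal sketch: the 128 KiB erase-block size is an assumption (real drives vary, and vendors rarely document it), and the start sectors are just the two historical fdisk defaults.

```shell
#!/bin/sh
# Sketch: check whether a partition's start sector lands on an assumed
# 128 KiB erase-block boundary. The erase-block size is a guess; adjust
# ERASE_BLOCK if you know your drive's real value.
ERASE_BLOCK=131072   # bytes (assumed)
SECTOR=512           # bytes per logical sector

aligned() {
  # $1 = partition start sector, e.g. from /sys/block/sda/sda1/start
  [ $(( $1 * SECTOR % ERASE_BLOCK )) -eq 0 ]
}

aligned 63   && echo "sector 63: aligned"   || echo "sector 63: misaligned"
aligned 2048 && echo "sector 2048: aligned" || echo "sector 2048: misaligned"
```

The old fdisk default of starting the first partition at sector 63 fails this check; the newer 1 MiB default (sector 2048) passes it for any power-of-two erase block up to 1 MiB.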
So I thought I’d try out Jamie Zawinski’s backup strategy since it’s relatively simple and made a lot of sense. I would need a spare drive for the install anyway, and I wasn’t sure that the drive that shipped with my ThinkPad had TRIM, which is supposed to be essential for the lifetime of the drive. Rather than order a mystery drive from Lenovo, I settled on the X18-M. The guys over at NotebookReview helped with that decision.
I did want to make sure I was using the right filesystem for the job. After all, it's nearly impossible to bitch slap Fedora the right way to get the thing to boot Nilfs2. And Evan Hoffman seems to think there is no benefit whatsoever, although his test, "ghetto" by his own admission, is quite flawed. The obvious flaw is not using sync(); the test basically measures how long it takes to dirty the page cache. The second flaw, pointed out in the comments, is that the block size is too small. That doesn't make sense to me, since 4k is the filesystem block size and the storage layer should coalesce the I/O operations at the block layer, but it doesn't hurt to test.
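The sync flaw is easy to demonstrate with dd itself. A quick sketch (the /tmp path is arbitrary): without any sync flag you are timing RAM, not the drive.

```shell
# Without a sync flag, dd mostly times how fast pages get dirtied in
# the page cache. conv=fsync flushes once at the end, which is closer
# to real throughput; oflag=sync syncs after every block, the worst
# case for the filesystem.
dd if=/dev/zero of=/tmp/zeros.dat bs=4k count=1024              # page cache speed
dd if=/dev/zero of=/tmp/zeros.dat bs=4k count=1024 conv=fsync   # one flush at end
dd if=/dev/zero of=/tmp/zeros.dat bs=4k count=1024 oflag=sync   # sync every block
rm -f /tmp/zeros.dat
```

I went with oflag=sync below, since that is the harshest test of the filesystem's synchronous write path.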
Ext4:
$ dd if=/dev/zero of=./zeros.dat oflag=sync bs=4k count=1024
1024+0 records in
1024+0 records out
4194304 bytes (4.2 MB) copied, 2.64042 s, 1.6 MB/s
$ dd if=/dev/zero of=./zeros.dat oflag=sync bs=1024k count=4
4+0 records in
4+0 records out
4194304 bytes (4.2 MB) copied, 0.199603 s, 21.0 MB/s
$ dd if=/dev/zero of=./zeros.dat oflag=sync bs=1024k count=256
256+0 records in
256+0 records out
268435456 bytes (268 MB) copied, 11.0575 s, 24.3 MB/s
Nilfs2:
$ dd if=/dev/zero of=./zeros.dat oflag=sync bs=4 count=1024
1024+0 records in
1024+0 records out
4096 bytes (4.1 kB) copied, 4.20935 s, 1.0 kB/s
$ dd if=/dev/zero of=./zeros.dat oflag=sync bs=1024k count=4
4+0 records in
4+0 records out
4194304 bytes (4.2 MB) copied, 0.257392 s, 16.3 MB/s
$ dd if=/dev/zero of=./zeros.dat oflag=sync bs=1024k count=256
256+0 records in
256+0 records out
268435456 bytes (268 MB) copied, 16.2356 s, 16.5 MB/s
Wow, okay. So he’s right: Nilfs2 is slower on this test. Even though this is a simple benchmark, the disparity probably shows up in other benchmarks, like the “extract kernel” test, and in real-world use as well. And block size does make quite a difference: both filesystems performed better with a 1 MB block size, and there Ext4 beat Nilfs2 by a wider margin.
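To see where the per-write sync overhead stops dominating, you can sweep block sizes at a fixed total. A rough sketch (the 8 MiB total and the /tmp path are arbitrary choices, not what I ran above):

```shell
#!/bin/sh
# Sketch: write the same 8 MiB total at several block sizes with
# oflag=sync, so every write() is followed by a sync. Smaller blocks
# mean more syncs for the same data, hence lower throughput.
TOTAL=$((8 * 1024 * 1024))
for BS in 4096 65536 1048576; do
  COUNT=$(( TOTAL / BS ))
  printf 'bs=%-8s ' "$BS"
  dd if=/dev/zero of=/tmp/zeros.dat oflag=sync bs="$BS" count="$COUNT" 2>&1 | tail -n1
done
rm -f /tmp/zeros.dat
```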
So where does that leave us in terms of the original Linux Magazine article? They point to two (old at this point) studies that show that Nilfs2 is pretty good. Chris Samuel’s comprehensive test showed that Nilfs2 was best in class for sequential delete, even on rotational media, and Dongjun Shin from Samsung had some graphs that showed Nilfs2 performance to be off the charts for solid state media. So let’s try to recreate his results:
[Figure: Postmark results comparing the two filesystems on SSD]
Underwhelming, to say the least. So how is it possible for two people to do the exact same thing and get wildly different results? I have a few theories:
Even Linux Magazine admits that the tests the entire article is based on are quite old. The author relied on previous research for everything and didn’t bother doing any testing of his own. It’s possible that in the intervening time either Nilfs2 got really slow for some reason, or Ext4 woke up and started kicking ass on SSDs once filesystem developers realized how popular that use case was becoming.
The second theory is that SSD vendors are getting really good at emulating rotational media. The Intel drives are known to be quite good in this regard, and there is an open question as to whether people should even bother setting up filesystems differently for SSDs. It’s possible that the guys over at Intel benchmarked against traditional filesystems like ext4 (or NTFS), and that the resulting specific and hyper-tuned optimization came at a cost to log-based filesystems, which theoretically should be better. Maybe Nilfs2 is still better on dumb drives.
I plan on testing both of these theories soon after installing Fedora 13, by trying the Toshiba drive in my laptop on the newer Fedora 13 kernel. I’m going to miss continuous snapshotting, but for now it looks like I’ll be using ext4. I’ll catch you on the flip side.