Archive for August, 2010

Nilfs2: Fedora 13 Follow Up

August 31, 2010

As you might imagine, I have been running with F13 for a while now,
but I was too lazy to update the blog. In any case, I did run the
numbers on the Toshiba. They aren’t pretty:

Nilfs2:

$ dd if=/dev/zero of=./zeros.dat oflag=sync bs=4k count=1024
1024+0 records in
1024+0 records out
4194304 bytes (4.2 MB) copied, 3.6206 s, 1.2 MB/s
$ dd if=/dev/zero of=./zeros.dat oflag=sync bs=1024k count=4
4+0 records in
4+0 records out
4194304 bytes (4.2 MB) copied, 0.22621 s, 18.5 MB/s
$ dd if=/dev/zero of=./zeros.dat oflag=sync bs=1024k count=256
256+0 records in
256+0 records out
268435456 bytes (268 MB) copied, 15.2122 s, 17.6 MB/s

Ext4:

$ dd if=/dev/zero of=./zeros.dat oflag=sync bs=4k count=1024
1024+0 records in
1024+0 records out
4194304 bytes (4.2 MB) copied, 10.2423 s, 410 kB/s
$ dd if=/dev/zero of=./zeros.dat oflag=sync bs=1024k count=4
4+0 records in
4+0 records out
4194304 bytes (4.2 MB) copied, 0.204465 s, 20.5 MB/s
$ dd if=/dev/zero of=./zeros.dat oflag=sync bs=1024k count=256
256+0 records in
256+0 records out
268435456 bytes (268 MB) copied, 13.1899 s, 20.4 MB/s

I have no idea how to explain Ext4’s awful performance on small
blocks. However, the Intel/Ext4 combo won in just about every
category, so I’m keeping the Toshiba as the backup drive for now.
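
If you want to double-check dd's rounding, here is a rough sketch that recomputes the rate from the byte count and elapsed time in a summary line. The `dd_rate` helper is a made-up name, not part of dd, and it assumes the GNU "N bytes (...) copied, S s, R" layout seen in the output above.

```shell
#!/bin/sh
# dd_rate (hypothetical helper): read dd summary lines on stdin and print
# bytes / seconds in MB/s, using dd's decimal units (1 MB = 1000000 bytes).
dd_rate() {
    # $1 is the byte count; the elapsed seconds are the 4th field from the end.
    awk '/ copied, / { printf "%.2f MB/s\n", $1 / $(NF-3) / 1000000 }'
}

# The two 4k runs on the Toshiba, pasted from above:
echo '4194304 bytes (4.2 MB) copied, 3.6206 s, 1.2 MB/s' | dd_rate    # Nilfs2: prints 1.16 MB/s
echo '4194304 bytes (4.2 MB) copied, 10.2423 s, 410 kB/s' | dd_rate   # Ext4: prints 0.41 MB/s
```

Since it reads stdin, something like `dd ... 2>&1 | dd_rate` should also work, because dd writes its statistics to stderr.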

Interestingly, Nilfs2 performs better on the Toshiba than it does on
Intel. If we combine this with the observation that small block sizes
do relatively poorly on the Toshiba drive, then we have a pretty
convincing case that the Intel drive is doing some magic under the
covers to make it awesome on traditional filesystems, even at the
expense of filesystems specifically designed for SSDs.

In retrospect, I regret not trying out the other obvious filesystem,
btrfs. However, that was never really in the running anyway, since I
can’t boot to a btrfs filesystem, even with Grub 2. Would have been
interesting to chart though.

Thus concludes my analysis of Nilfs2 for SSD. I encourage everyone with
an SSD to test, and get your results out there. Hopefully as Grub 2
matures and more distributions besides Ubuntu use it, and as more SSDs
make it to the market, we can see what is really the fastest
filesystem out there.


Nilfs2: A File System to Make SSDs Quiet

August 15, 2010

I have been using Nilfs2 ever since I installed Fedora 12, which happened right after it came out. The graphs from “Nilfs2: A Filesystem to Make SSDs Scream” convinced me I had to have it, because my ThinkPad X301 actually does have an SSD. Managing to get a Fedora 12 install with a root Nilfs2 filesystem was a bit of a challenge, though. If I recall correctly, I did it by installing on ext3 and then rsyncing everything over, which is a hassle to say the least.

However, I was never sure if I had set up the alignment properly, and an article by Ted Ts’o made me even less sure. Additionally, it made sense to start backing up my system, now that I was getting a warning on every boot:

mount.nilfs2: WARNING! - The NILFS on-disk format may change at any time.
mount.nilfs2: WARNING! - Do not place critical data on a NILFS filesystem.

So I thought I’d try out Jamie Zawinski’s backup strategy, since it’s relatively simple and makes a lot of sense. I would need a spare drive for the install anyway, and I wasn’t sure that the drive that shipped with my ThinkPad had TRIM, which is supposed to be essential for the lifetime of the drive. Rather than order a mystery drive from Lenovo, I settled on the Intel X18-M. The guys over at NotebookReview helped with that decision.

I did want to make sure I was using the right filesystem for the job. After all, it’s nearly impossible to bitch slap Fedora the right way to get the thing to boot Nilfs2. And Evan Hoffman seems to think there is no benefit whatsoever, although his test, “ghetto” by his own admission, is quite flawed. The obvious flaw is not using sync(). Without it, the test basically measures how long it takes to dirty the page cache. The second flaw, pointed out in the comment section, is that the block size is too small. That doesn’t make sense to me, since 4k is the filesystem block size and the I/O operations should get coalesced at the block layer, but it doesn’t hurt to test.
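
To see why the missing sync matters, here is a minimal sketch that writes the same 4 MB three ways: buffered, flushed once at the end, and synced on every write. The `write_zeros` helper and the scratch file name are made up for illustration. It assumes GNU dd, where `conv=fsync` and `oflag=sync` are available, and the numbers will depend entirely on your drive and filesystem.

```shell
#!/bin/sh
# write_zeros (hypothetical helper): write 4 MiB of zeros in 4k blocks to the
# given file and print dd's summary line. Any extra dd operands are passed through.
write_zeros() {
    out=$1; shift
    # dd prints its statistics on stderr; keep only the last (summary) line.
    dd if=/dev/zero of="$out" bs=4k count=1024 "$@" 2>&1 | tail -n 1
}

# The scratch file lives in the current directory so that the filesystem
# under test is the one you are standing in.
scratch=$(mktemp ./zeros.XXXXXX)

echo "buffered:   $(write_zeros "$scratch")"             # mostly times the page cache
echo "conv=fsync: $(write_zeros "$scratch" conv=fsync)"  # one flush before dd exits
echo "oflag=sync: $(write_zeros "$scratch" oflag=sync)"  # every 4k write is synchronous

rm -f "$scratch"
```

The buffered run is the one that should look suspiciously fast, since nothing forces the data to the drive before dd reports its time.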

Ext4:

$ dd if=/dev/zero of=./zeros.dat oflag=sync bs=4k count=1024
1024+0 records in
1024+0 records out
4194304 bytes (4.2 MB) copied, 2.64042 s, 1.6 MB/s
$ dd if=/dev/zero of=./zeros.dat oflag=sync bs=1024k count=4
4+0 records in
4+0 records out
4194304 bytes (4.2 MB) copied, 0.199603 s, 21.0 MB/s
$ dd if=/dev/zero of=./zeros.dat oflag=sync bs=1024k count=256
256+0 records in
256+0 records out
268435456 bytes (268 MB) copied, 11.0575 s, 24.3 MB/s

Nilfs2:

$ dd if=/dev/zero of=./zeros.dat oflag=sync bs=4 count=1024 
1024+0 records in
1024+0 records out
4096 bytes (4.1 kB) copied, 4.20935 s, 1.0 kB/s
$ dd if=/dev/zero of=./zeros.dat oflag=sync bs=1024k count=4
4+0 records in
4+0 records out
4194304 bytes (4.2 MB) copied, 0.257392 s, 16.3 MB/s
$ dd if=/dev/zero of=./zeros.dat oflag=sync bs=1024k count=256
256+0 records in
256+0 records out
268435456 bytes (268 MB) copied, 16.2356 s, 16.5 MB/s

Wow, okay. So he’s right, Nilfs2 is slower for this test. Even though this is kind of a simple benchmark, this disparity probably shows up in other benchmarks like the “extract kernel” benchmark, and in real world use as well. And block size does make quite a difference. Both filesystems performed better with a 1 Meg block size, and here Ext4 beat Nilfs by a higher margin.

So where does that leave us in terms of the original Linux Magazine article? They point to two (old at this point) studies that show that Nilfs2 is pretty good. Chris Samuel’s comprehensive test showed that Nilfs2 was best in class for sequential delete, even for rotational media, and Dongjun Shin from Samsung had some graphs that showed Nilfs2 performance to be off the charts for solid state media. So let’s try to recreate his results:

[Figure, titled “nilfs2 sucks”: Postmark results comparing two filesystems on SSD]

Underwhelming, to say the least. So how is it possible for two people to do the exact same thing and get wildly different results? I have a few theories:

Even Linux Magazine admits that the tests they’re basing the entire article on are quite old. The editor relied on previous research for everything, and didn’t even bother doing his own testing. It’s possible that in that time either Nilfs2 got really slow for some reason, or that Ext4 woke up and started kicking ass on SSD, once filesystem developers realized the popularity of that use case.

The second thing is that SSD vendors are getting really good at emulating rotational media. The Intel drives are known to be quite good in this regard, and there is an open question as to whether people should even bother setting up filesystems differently for SSDs. It’s possible that the guys over at Intel benchmarked for traditional filesystems like ext4 (or NTFS), and that the resulting specific and hyper-tuned optimization came at a cost to log-based filesystems, which theoretically should be better. Maybe Nilfs2 is still better on dumb drives.

I plan on testing both these theories soon after installing Fedora 13, by testing the Toshiba drive in my laptop on the newer Fedora 13 kernel. I’m going to miss continuous snapshotting, but for now it looks like I’ll be using ext4. I’ll catch you on the flip side.

Move from Blogger

August 14, 2010

Hi all! I recently moved from Blogger, because Blogger now has some AJAX shit that steals control key characters. I don’t know about you, but I need my Emacs key bindings in Firefox or I flip. And because, you know, open source. And because everyone else is doing it.