Achieving Better Compression with lrzip and rzip

I recently upgraded to Mozilla Thunderbird 3.0. That got me thinking that now might be a good time to clean up my local mail folders. All of my mail from the past few years is stored on my IMAP server. I still have a few gigabytes of old mail from my old POP3 days stored in Thunderbird's Local Folders.

I decided that now might be a good time to do some spring cleaning and not carry around my old POP3 mail anymore. I figured it is also a good time to store this current copy of my old mail with my long-term backups.

My Old Friend rzip

I have been using rzip for quite a few years. Its job is to find and encode large chunks of duplicated data over very large distances, up to 900 MB. Once that is complete, it runs the resulting data through bzip2. For large datasets, I've found it to be much faster than bzip2, and it usually results in an archive that is about 30% smaller than just using bzip2.

Unfortunately, rzip can’t operate on pipes. All of my automated backup scripts run along pipelines, usually from tar to bzip2 to gpg. They never touch the disk unencrypted, which probably isn't always helpful.
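A minimal sketch of that kind of pipeline, not my actual backup script. The function name archive_stream, the recipient address, and the output path are placeholders chosen for illustration.

```shell
# Stream a directory as a bzip2-compressed tarball on stdout.
# Nothing is written to disk along the way.
archive_stream() {
    tar -cf - -C "$1" . | bzip2 -9
}

# Typical use: encrypt the stream before it ever reaches the disk.
# The recipient and the output path are placeholders.
#   archive_stream "$HOME/.thunderbird" \
#       | gpg --encrypt --recipient you@example.com \
#       > /backups/thunderbird.tar.bz2.gpg
```

Because rzip needs a real file to work on, it cannot take the place of bzip2 in the middle of a pipeline like this one.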

My New Friend lrzip

I recently discovered Con Kolivas' lrzip. lrzip takes rzip a couple steps further. It lets you choose a compressor other than bzip2 for the second stage of compression. It can also be used in a pipe.

Unfortunately, when used in a pipe it generates a large temp file. This can be a problem if you are trying to generate a large archive and don't have a lot of free disk space, or if you don't want unencrypted data being written to the disk.

Some Benchmarks

I compressed the tarball of my .thunderbird directory every way I could think of that made sense. The default settings for lrzip kept erroring out on me at about 30%. I had to use the -w switch to reduce the window size from 20. I chose 12, which should be about 30% higher than rzip's window.

              Size    Minutes    Ratio
              (MB)
uncompressed  5761         na    1.0:1
lrzip zpaq    1207        265    4.7:1
lrzip lzma    1262         60    4.5:1
lrzip bzip2   1401         27    4.1:1
rzip          1362         20    4.2:1
lzma          1441         97    4.0:1
bzip2         1748         38    3.3:1

Both rzip and lrzip with bzip2 achieved a smaller file size in less time than plain bzip2. lrzip with zpaq is over 13 times slower than rzip for a savings of 155 MB, or about 11%.
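The comparison between rzip and lrzip with zpaq can be recomputed from the table. This is just a sketch of the arithmetic; the variable names are made up for the example and the figures are copied from the rows above.

```shell
# Figures copied from the benchmark table (sizes in MB, times in minutes).
rzip_mb=1362
rzip_min=20
zpaq_mb=1207
zpaq_min=265

# Space saved by lrzip with zpaq compared to rzip.
saved_mb=$((rzip_mb - zpaq_mb))

# awk handles the fractional results that shell arithmetic cannot.
slowdown=$(awk -v a="$zpaq_min" -v b="$rzip_min" 'BEGIN { printf "%.2f", a / b }')
saved_pct=$(awk -v s="$saved_mb" -v r="$rzip_mb" 'BEGIN { printf "%.1f", s * 100 / r }')

echo "zpaq saves ${saved_mb} MB (${saved_pct}% of the rzip archive) and takes ${slowdown}x as long"
```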

Why Would Anyone Wait for zpaq to Finish?

Most of the time it isn't worth the wait. I'm a huge fan of smaller backups. Backups become significantly more expensive every time a single backup has to span a second (or third, or fourth…) piece of media. It's another floppy, CD, DVD, Blu-ray, tape, hard drive, or flash drive to have to manually swap around and keep safe.

I like flash drives for my personal backups. I have too many CDs and DVDs that are unreadable. I've accidentally run an old compact flash drive through the laundry and it still worked. I'm sure all flash drives won't survive that, but they do tend to be very durable.

Unfortunately, lrzip with zpaq did not get the file size down enough for the archive to fit on the backup flash drive that I keep around the house. Another 100 MB or so would have done the trick and would save me quite a bit of effort.

Which One Should You Use?

For most archives I would probably just choose bzip2. It does a very good job, and a decompressor is always very readily available.

For almost every very large archive, I will definitely be sticking with rzip. It is faster and more space-efficient than bzip2. It is also easier to find than lrzip; my Ubuntu machine has an rzip package available in apt.

I will be sure to keep lrzip with zpaq in mind, though. Sometimes an extra couple hundred MB will save the time, effort, and cost of a second piece of media. The other downside to zpaq is that decompression is also very slow.

A Quick and Dirty wiper.sh Fix For Intel X25-M

Update: I very much doubt that anyone should be using this anymore. The discard mount option has been working properly for years. In fact, you probably shouldn't use hdparm to manually TRIM your drive at all. You should be using fstrim instead. It is available in the software repositories of most recent Linux distributions.

The wiper.sh script that ships with hdparm 9.27 does not work well with the Intel X25 drives. The call to hdparm will fail if it is passed in more than 512 ranges of sectors. I made a quick and dirty modification to the wiper.sh script so that it makes multiple calls to hdparm with 500 ranges each. I've only been running it for a few days, but it seems to be working just fine so far.
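The batching idea can be sketched in plain shell. This is not the code from wiper-dangerous.sh (that version leans on Perl); run_in_batches is a hypothetical helper written for illustration. It assumes one sector range per line on stdin and runs the given command once per batch, with the batch on that command's stdin.

```shell
# Run a command once for every batch of at most $1 input lines.
# Remaining arguments are the command; each batch arrives on its stdin.
run_in_batches() {
    batch_size=$1
    shift
    batch=""
    count=0
    while IFS= read -r range; do
        # Append the range plus a newline to the current batch.
        batch="${batch}${range}
"
        count=$((count + 1))
        if [ "$count" -eq "$batch_size" ]; then
            printf '%s' "$batch" | "$@"
            batch=""
            count=0
        fi
    done
    # Flush whatever is left over as a final, smaller batch.
    if [ "$count" -gt 0 ]; then
        printf '%s' "$batch" | "$@"
    fi
}

# Harmless demonstration: count the lines handed to each call.
#   seq 1 1200 | run_in_batches 500 wc -l
```

In the demonstration, wc -l stands in for the real hdparm call, so nothing is trimmed. Keeping each batch at 500 leaves some headroom under the 512-range limit.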

I also added a --yes command line switch so that I could more easily call it from cron.

wonko@zaphod:~$ sudo ./wiper-dangerous.sh --commit --yes /

wiper-dangerous.sh: Linux SATA SSD TRIM utility, version 2.5-dangerous, by Mark Lord.
Preparing for online TRIM of free space on /dev/sda2 (ext4 mounted read-write at /).
Creating temporary file (3091782 KB).. 
Syncing disks.. 
Beginning TRIM operations..

/dev/sda:
trimming 6183568 sectors from 159 ranges
succeeded
Removing temporary file..
Syncing disks.. 
Done.

The script also now has a dependency on Perl. Feel free to download a copy of my wiper-dangerous.sh, but I make no promise that it won't eat your data!

My New Favorite Kind of Tape: Cloth Tape

I've been carrying a small roll of duct tape in both my wallet and laptop bag for years. I've been taking advantage of the fact that duct tape is the same width as a business card. You just roll some fresh duct tape around an old business card ten or twenty times and you'll get a small roll of duct tape that fits right in your wallet.

A year or two ago I found a roll of cloth tape in an ancient first aid kit that we had lying around the house. The first thing I noticed was how similar it is to duct tape. It is fabric reinforced like duct tape, but it is more flexible and less sticky.

It does a great job holding cables together, and it doesn't seem to leave behind the nasty mess duct tape leaves behind when you remove it a few months later. Cloth tape also works great for labeling things; it takes Sharpie ink very well.

The width of cloth tape does not line up as well with a business card as duct tape does. I've come up with a better idea, though. I now wrap three kinds of tape around the length of a business card: duct tape, cloth tape, and electrical tape. The three added together are slightly too big for a business card, so there is a tiny bit of overhang.

Three types of tape in my wallet

Using QR Codes for Hardcopy Backups of Private Keys

For years I have been keeping a small font printout of my important SSH and GPG private keys hidden and locked away for safe-keeping. I have lost enough floppies, CDs, and DVDs to bit rot, so I do not have much trust in them for the long-term storage of something this important.

The hard copies are nice, but I sure don't want to have to manually type in an error-free copy of my private keys. That is why I now print two copies of each of my keys. One copy is text, the other copy is a QR Code.

QR Codes are popping up all over the place lately. Some of the websites hosting Android software that I use have QR Code images on them, so you can just snap a picture of the code with your smart phone and be taken straight to the website. This one will bring you to this blog:

QR code pointing to patshead.com

An alphanumeric QR Code can contain up to 4,296 characters. This was plenty for my SSH and GPG private keys.
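Before printing, it may be worth checking that a key file fits in a single code. The sketch below makes a few assumptions: ASCII-armored keys contain lowercase letters, so an encoder has to fall back from alphanumeric mode to byte mode, where the largest QR Code holds 2,953 bytes at the lowest error correction level. The helper name fits_in_one_qr and the file names are placeholders, and the commented commands assume the qrencode and zbarimg tools are installed.

```shell
# Largest byte-mode payload of a single QR Code (version 40, level L).
QR_MAX_BYTES=2953

# Succeed if the file named in $1 is small enough for one QR Code.
fits_in_one_qr() {
    # The arithmetic expansion strips any padding that wc adds.
    size=$(( $(wc -c < "$1") ))
    [ "$size" -le "$QR_MAX_BYTES" ]
}

# Possible usage, with placeholder file names:
#   fits_in_one_qr secret-key.asc && qrencode -l L -o secret-key.png < secret-key.asc
#   zbarimg --raw -q secret-key.png | diff - secret-key.asc
```

Decoding the image again and comparing it with the original file is a cheap way to confirm the code is readable before the paper goes into storage.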

Where Should I Store My Backup Keys?

That would depend on your level of paranoia. You could store them in your safe deposit box at your bank, in a safe at home, under your mattress, or you could bury them in your back yard.

Why Bother Storing a Hard Copy of your Keys?

I am mostly worried about losing my GPG private key. I have gigabytes upon gigabytes of backups and important data encrypted with my key. Completely losing that key would turn all that data into useless bits.

Losing my SSH key(s) wouldn't have nearly as much impact. There would just be a few hosts that I would have trouble logging into. The extra effort to print the SSH key along with the GPG key is minimal, though.

Bonnie++ Benchmarks on the Intel X25-M v1.4 Firmware Update

I updated the firmware on my X25-M today. It was a completely pain-free procedure. Just burn a CD, boot from the CD, say yes a few times, and reboot.

I added a new set of Bonnie++ benchmarks to my first set of Intel X25-M benchmarks.

Unfortunately, the new benchmarks aren't a good comparison. I have been using the drive for a month now and I probably have 100 GB worth of rewrites on the drive.

There were two obvious improvements. Sequential input is up 30%, and CPU usage during random seeks is down 12%.

I can't wait to try running another benchmark after I get some TRIM support in my kernel.

Android, K-9 Mail, and IMAP IDLE (Push Email)

The single most disappointing application that came installed on my T-Mobile G1 was the default email application. The very first thing I noticed was the horrendous IMAP support. I have a very well organized mail account. I might have a half dozen or so folders at the top of my folder hierarchy. The Android email application just flattens all of my 100+ subfolders out into a huge, difficult-to-navigate list.

K-9 Mail

My search for a better mail app led me to K-9 Mail. K-9 is a fork of the stock Android email application with quite a few improvements.

K-9 still flattens out my folder list but lets me choose which folders I would like to display and sync. Fortunately, I really only want quick access to five or six folders while I'm on the go.

Push Email

The beta release of K-9 supports the IMAP IDLE feature. This allows it to stay connected to the IMAP server so that it can instantly be notified when a new message arrives. This is much better for me than waiting up to 5 minutes to see a new message arrive.

Improved Message List

K-9's message list view is a huge improvement over the stock mail client. The view is much more condensed and looks like it fits about twice as many messages on the screen at the same time.

So far there is only one ChatterEmail feature that I miss. It had a summary view that displayed emails from multiple mailboxes on one screen in chronological order. Each message was color coded so that you could very easily tell which account each message belonged to. I saw a few mentions of ChatterEmail in the K-9 issue tracker, so I know I'm not the only one who misses this feature.

My New Android Phone and Cyanogenmod 4.1.999

I finally broke down and replaced my long in the tooth Palm Treo 650 with a T-Mobile G1 (a.k.a. HTC Dream). The only major complaint I had with the Treo was stability. It used to like to reboot itself about once each day…

So far I'm mostly working on replicating all the functionality that my Treo had. Almost everything on the G1 meets or exceeds the capabilities of the software I had on my Treo. Some software has both improvements and regressions compared to the Treo, but overall I'm pretty happy so far.

Cyanogen 4.1.999

I only ran the stock firmware long enough to install the Cyanogen firmware, so I can't talk about all the specific details that are different from the stock firmware.

There are some interesting improvements behind the scenes, though. The Cyanogen firmware uses compcache and Con Kolivas’s BFS process scheduler. I'm already a heavy user of compcache on my laptop, and I've been testing the BFS scheduler as well.

I've really only had one issue so far with the 4.1.999 firmware. Every once in a while all the icons disappear from the launcher and the phone has to be rebooted to get them back. It seems to be a known issue with the experimental firmware. I doubled the size of the compcache swap space and I didn't see the problem for a few days. I can only assume that the race condition is less likely when the phone is more responsive.

Neat Things I'd Like To Do With Android

I would like to get rsync and cron running on it. I think it would be very nice to have automated wireless backups that are always likely to be in my pocket. I'm hoping to be able to do it without having to rely on having Debian on my SD card.

OpenVPN might be interesting, but I don't currently have a real need for it. A few years ago this would have been a top priority for me, though.

Saving Space With fusecompress

Update: Fusecompress has been completely unmaintained for quite a long time.

I have been using a fuse compressed file system for a very long time now. Space used to be a bit tight on my old laptop’s 60 GB hard drive, and the space savings haven’t been hurting at all. I have about 4 GB of documentation stored in text, html, and PDF files that I like to carry with me. I also like having it indexed by Gnome Tracker, which means it can't just be sitting in an archive.

I was previously using fuse-zip. It did its job very well and I really liked the fact that it was just mounting plain old standard zip files. It supported writes, but it only committed changes when the file system was unmounted. That was very inconvenient and it took quite a while to recompress 1-2 GB of data when I was only adding a few MB worth of files.

I have recently switched to fusecompress. The only thing I dislike is that it does not use a standard archive behind the scenes. I'm using bzip2 as the compression method and I had to tell fusecompress that it was, in fact, OK to attempt to compress PDF files.

Since fusecompress compresses individual files, I am not achieving quite as much compression as I was with fuse-zip, but it is quite close. I don't mind trading a standard archive format for immediate commits of my writes.

Fix for Karmic Koala 64-bit Flash Plugin

I upgraded my laptop to the 64-bit Ubuntu Karmic Koala beta yesterday. For me, Flash was very broken…

The 32-bit Plugin

Ubuntu defaults to installing the 32-bit Adobe Flash plugin and runs it under nspluginwrapper. When I run the plugin this way it will not accept mouse clicks properly. This seems to be a known issue. The bug reports make it sound like disabling compiz fixes this… I don't run compiz, I run the Sawfish window manager. I tested Metacity and it works just fine.

The 64-bit Plugin

Next I tried disabling the 32-bit plugin so that Firefox would use the 64-bit plugin that I already had installed in my .mozilla directory. Mouse clicks worked… Unfortunately, it was crashing constantly. I couldn't even load Gmail without crashing Firefox.

Is it Firefox's Fault?

I have both Firefox 3.0 and 3.5 installed. Both were just as crash happy. I was hoping it would be as simple as blaming Firefox 3.5…

Firefox in a Jaunty chroot

I had a 64-bit Jaunty chroot environment already sitting on my hard drive. It only took a few commands and a little waiting before I had Firefox and the 64-bit Flash plugin from Adobe installed in the chroot. It ran perfectly.

I compared the output of ldd libflashplayer.so from inside and outside the chroot. One extra library was showing up in the chroot, libresolv.so.2.
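A comparison like that can be sketched as follows. It assumes the ldd output from each environment was first saved to a file; the helper names lib_names and libs_only_in_second, and the file names in the usage comment, are made up for this example.

```shell
# Print just the library names (first column) from ldd output on stdin.
lib_names() {
    awk '{ print $1 }' | sort -u
}

# Print libraries listed in the second saved ldd output but not the first.
# $1 and $2 are files holding the output of ldd libflashplayer.so.
libs_only_in_second() {
    first=$(mktemp) && second=$(mktemp) || return 1
    lib_names < "$1" > "$first"
    lib_names < "$2" > "$second"
    # comm -13 keeps only the lines unique to the second sorted file.
    comm -13 "$first" "$second"
    rm -f "$first" "$second"
}

# Possible usage, with placeholder file names:
#   ldd libflashplayer.so > host.txt      # outside the chroot
#   ldd libflashplayer.so > chroot.txt    # inside the chroot
#   libs_only_in_second host.txt chroot.txt
```

Comparing only the first column avoids false differences from the load addresses, which change from run to run.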

How to Fix It

Just install the libadns1 package:

sudo apt-get install libadns1

That fixed it for me. As always, your mileage may vary!

Screencast: yasnippet for emacs Now Supports Snippet in a Snippet

I am pretty excited that yasnippet can now expand a snippet from inside another snippet. I've slowly been tweaking my own snippets to be able to better take advantage of this new feature. I've already gotten most of my Moose snippets working well. I am very happy that I was able to get them to indent pretty well without any manual intervention.

I’ve worked pretty hard to overload my tab key. I am using it for hippie-expand, yas/expand, and indenting. It probably does what I want over 95% of the time. I do run into two problems with this, though.

I can't call hippie expand from a snippet. I suppose I could have an extra hippie-expand binding, but I don't really like that idea. I may break down and set one up, though, because I haven't thought of a better solution.

Sometimes my tab key calls hippie-expand when I want to indent. This is the one that drives me nuts because I can't predict when it is going to happen. I tried adding an extra indent bind, but I just can't train myself to use it. I've used the tab key for indenting for so many years that the habit has just become too hard to break.

I have my enter key bound to newline-and-indent but I still have to manually hit my indent key sometimes. I've thought about having the enter key also run an indent on the current line. I haven't done it because I don't like the idea of accidentally changing the indentation of existing lines. I don't need to pull indentation changes into version control for no good reason.

How I Configured My Tab Key

I cheated. I tried every trick I could find to bind yas/expand, hippie-expand, and indenting to the same key and none of them were working. I ended up putting yas/hippie-try-expand at the front of my hippie-expand-try-functions-list and indent-for-tab-command at the end.

The only side effect this seems to have is that I get a No expansion found message on a successful indent. That doesn't bother me at all, though.