GMail backup to IMAP server

I'm a big fan of GMail. It works, it's free, it's easy to use. I use my GMail account for a bunch of things, but I also have my personal email account for things I really care about. Why the separation? Well, I don't trust Google enough with all my email. Don't get me wrong, I don't think that Google will do evil things, but in the end they are a business and things may change at some point. Additionally, I just don't think anyone is going to care about my email as much as I will.

So I set out to come up with a good way to back up GMail for my own use. In a nutshell, I came up with a way that uses Cyrus IMAP in conjunction with imapsync and fslint, and offers a backup of GMail without storing duplicate emails. Yep, single instance storage.

In the end I get a backup of all my email in something I can easily access with any IMAP-capable email client and, to boot, I don't have to store emails twice just because I applied more than one label.

Without further delay, here is my setup.

Software requirements

First, the basics. I'm running this on Ubuntu 9.10 (Karmic), so I start by installing the software.

sudo apt-get install cyrus-imapd-2.2
sudo apt-get install cyrus-admin-2.2
sudo apt-get install slapd
sudo apt-get install imapsync
sudo apt-get install fslint
sudo apt-get install phpldapadmin
sudo apt-get install sasl2-bin

Setup LDAP

This is optional, since you could also just use Cyrus with a sasldb password database (managed with saslpasswd2). I personally prefer LDAP, since I use it in other places.

If you already have LDAP set up, you can skip the following steps. Just make sure you have a user which will act as Cyrus administrator. If you don't, read on.

Getting OpenLDAP running on Ubuntu 9.10 (Karmic) takes a little work. I pulled together some of the instructions I used in cut and paste form.

Once LDAP is set up, the next step is to create a Cyrus admin user and a user for whoever wants to back up a GMail account. I favor phpldapadmin for that.

  1. Log in to phpldapadmin by pointing your browser to http://localhost/phpldapadmin/ (or whatever your host is) and entering the LDAP credentials you set up
  2. If you followed the instructions, you should see dc=home,dc=com after logging into phpldapadmin. Click the '+' to the left and it should expand, revealing uid=admin, which is the admin user
  3. Beneath it click on the Create new entry here
  4. Pick the Organizational Unit
  5. On the next screen just enter people
  6. Click Create Object
  7. On the left you should now see your new ou=people, expand it by clicking the '+'
  8. Click on Create new entry here
  9. This time select Simple Security Object
  10. Enter cyrus as the user and enter a password
  11. Click Create Object
  12. Now click on the same Create new entry here again
  13. Once again select Simple Security Object
  14. Enter a user name and password for the user planning to backup GMail
  15. Click Create Object
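If you'd rather script the two user entries than click through them, the steps above can be sketched as an LDIF generator. This is only a rough sketch: the object classes (account and simpleSecurityObject) are my assumption about what a Simple Security Object amounts to, make_user_ldif is a made-up helper name, and the passwords are placeholders. It also assumes ou=people already exists under dc=home,dc=com.

```shell
#!/bin/sh
# Hypothetical helper: print an LDIF entry resembling a "Simple Security Object".
# BASE_DN matches the dc=home,dc=com tree mentioned above; adjust it to your setup.
BASE_DN="dc=home,dc=com"

make_user_ldif() {
    # $1 = uid, $2 = password (written as given; use a hashed value for real use)
    printf 'dn: uid=%s,ou=people,%s\n' "$1" "$BASE_DN"
    printf 'objectClass: account\n'                 # assumed object class
    printf 'objectClass: simpleSecurityObject\n'    # assumed object class
    printf 'uid: %s\n' "$1"
    printf 'userPassword: %s\n\n' "$2"
}

# One entry for the Cyrus administrator, one for the backup user (placeholders).
make_user_ldif cyrus 'CHANGE_ME'
make_user_ldif IMAP_USER 'CHANGE_ME_TOO'
```

The output could then be piped into something like ldapadd -x -D "YOUR_ADMIN_DN" -W, where YOUR_ADMIN_DN stands for whatever admin DN your directory uses.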

Now you have the pieces in place for authentication to take place.

Setup Cyrus

Now it's time to set up Cyrus. Make sure you have the following three configuration settings in /etc/imapd.conf:

admins: cyrus
sasl_mech_list: PLAIN
sasl_pwcheck_method: saslauthd

The options should already be there, just with the wrong values.

Now restart cyrus by running /etc/init.d/cyrus2.2 restart
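Since a typo in any of these three settings tends to show up later as a confusing login failure, a quick check can save some head scratching. Here is a minimal sketch; check_setting and check_imapd_conf are hypothetical helper names of mine, and it assumes each option sits on one line as key: value.

```shell
#!/bin/sh
# Hypothetical check: confirm a config file holds the expected "key: value" line.
check_setting() {
    # $1 = file, $2 = key, $3 = expected value
    # If the key appears more than once, the last occurrence is compared.
    actual=$(sed -n "s/^$2:[[:space:]]*//p" "$1" | tail -n 1)
    if [ "$actual" = "$3" ]; then
        echo "ok: $2"
    else
        echo "MISMATCH: $2 is '$actual', expected '$3'"
        return 1
    fi
}

check_imapd_conf() {
    # $1 = config file (defaults to the path used in this post)
    conf=${1:-/etc/imapd.conf}
    rc=0
    check_setting "$conf" admins cyrus || rc=1
    check_setting "$conf" sasl_mech_list PLAIN || rc=1
    check_setting "$conf" sasl_pwcheck_method saslauthd || rc=1
    return $rc
}
```

Running check_imapd_conf with no argument checks /etc/imapd.conf. To test the authentication side as well, testsaslauthd -u cyrus -p YOUR_PASSWORD (from sasl2-bin) should report success once saslauthd can reach your password backend.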

The backup script

The next step is to run the backup. I suggest you save the following in a file and edit it to your needs.

imapsync \
        --ssl1 \
        --host1 imap.gmail.com \
        --port1 993 \
        --user1 GMAIL_USER \
        --passfile1 ./pass1.txt \
        --split1 100 \
        --authmech1 LOGIN \
        --host2 localhost \
        --user2 IMAP_USER \
        --passfile2 ./pass2.txt \
        --prefix2 INBOX.GMailBackup. \
        --split2 100 \
        --ssl2 \
        --authmech2 LOGIN \
        --regextrans2 's/\[Gmail\]/Gmail/' \
        --delete2 \
        --syncinternaldates \
        --exclude "All Mail|Spam|Trash" \
        --allowsizemismatch \
        --useheader 'Message-Id'  --skipsize

Notice that imapsync will look for passwords for GMail and IMAP in text files. You'll also have to adjust your usernames (both for GMail and IMAP).
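Because those text files hold plain passwords, it makes sense to keep them readable by your own user only. A minimal sketch of one way to create such a file follows; write_passfile is a hypothetical helper name, not something imapsync provides.

```shell
#!/bin/sh
# Hypothetical helper: write a password to a file that only the owner can read.
write_passfile() {
    # $1 = target file, $2 = password
    (
        umask 077                  # a newly created file gets mode 0600
        printf '%s\n' "$2" > "$1"
    )
    chmod 600 "$1"                 # also tighten a file that already existed
}
```

For example, write_passfile ./pass2.txt 'your-imap-password' creates the IMAP password file. Keep in mind that a password typed on the command line may end up in your shell history, so you may prefer to create the file in an editor and just run chmod 600 on it.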

Also, if you want to put the backup somewhere other than a folder named GMailBackup, you can change that too.

Then make the script executable and run it. Depending on how much email you have, it may take a while.

Once you're happy you can create a cron job to run this automatically as needed.
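A long first sync can easily overlap with the next scheduled run, so it may be worth wrapping the script in a simple lock. The following is a minimal sketch only: run_locked is a hypothetical helper, and every path in the example crontab line is an assumption you would replace with your own.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command unless a previous run still holds the lock.
run_locked() {
    # $1 = lock directory, remaining arguments = command to run
    lock=$1; shift
    if ! mkdir "$lock" 2>/dev/null; then   # mkdir is atomic, so it works as a lock
        echo "previous run still active, skipping" >&2
        return 75
    fi
    "$@"; rc=$?
    rmdir "$lock"
    return $rc
}

# Example crontab line (assumed paths), every night at 02:30:
# 30 2 * * * /home/me/bin/run-gmail-backup.sh >> /home/me/gmail-backup.log 2>&1
```

Inside run-gmail-backup.sh you would cd to the directory holding the backup script and then call run_locked /tmp/gmail-backup.lock ./gmail-backup.sh. The cd matters because the password files are given as relative paths and cron does not start in your script's directory. If a run gets killed, the lock directory stays behind and has to be removed by hand.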

Keep it tidy

With GMail's ability to apply multiple labels to each message, there is the potential that each message would end up being stored more than once, once per label folder. That's just plain untidy. Fortunately, Cyrus stores each message as an individual file, so we can take advantage of the UNIX magic of hard links, which allow multiple file names to point at a single copy of the data on disk.
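If hard links are new to you, here is a small demonstration in a throwaway directory. It is only an illustration of the idea, with made-up file names; it does not touch the Cyrus spool and is not how fslint itself is implemented.

```shell
#!/bin/sh
# Demonstration in a temporary directory: two identical files become one.
demo=$(mktemp -d)
printf 'Subject: hello\n\nSame message, two labels.\n' > "$demo/msg-a"
cp "$demo/msg-a" "$demo/msg-b"      # second, separately stored copy

# Replace the second copy with a hard link if the contents are identical.
if cmp -s "$demo/msg-a" "$demo/msg-b"; then
    ln -f "$demo/msg-a" "$demo/msg-b"
fi

# Both names now carry the same inode number, so the data is stored once.
ls -li "$demo"
```

Deleting one of the two names leaves the other intact, which is why this is safe for a mail store: the data only goes away when the last name pointing at it is removed.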

fslint offers a command line tool to merge duplicate files. What's even better is that it will do this for all duplicate files across all users. The following command will take care of this:

sudo /usr/share/fslint/fslint/findup -m /var/spool/cyrus/mail

This also lends itself to being run via cron.

If all went well, you can now sleep knowing that if Google ever pulls the plug on free email, you have your own paranoia copy sitting around.


Great work

Great idea...
Great combination of open source tools...
...and it looks like it produces a great outcome.

I expect I'll be implementing a solution very similar to this in the near future. Thanks for the leg-up!