I'm a big fan of GMail. It works, it's free, and it's easy to use. I use my GMail account for a bunch of things, but I also have a personal email account for the things I really care about. Why the separation? Well, I don't trust Google enough to hand over all my email. Don't get me wrong, I don't think Google will do evil things, but in the end they are a business, and things may change at some point. Additionally, I just don't think anyone is going to care about my email as much as I do. So I set out to find a good way to back up GMail for my own use. In a nutshell, I came up with a setup that uses Cyrus IMAP together with imapsync and fslint to back up GMail without storing duplicate emails. Yep, single-instance storage.
In the end I get a backup of all my email in something I can easily access with any IMAP-capable email client and, to boot, I don't have to store an email twice just because I applied more than one label.
Without further delay, here is my setup.
First, the basics. I'm running this on Ubuntu 9.10 (Karmic), so I start by installing the software.
sudo apt-get install cyrus-imapd-2.2
sudo apt-get install cyrus-admin-2.2
sudo apt-get install slapd
sudo apt-get install imapsync
sudo apt-get install fslint
sudo apt-get install phpldapadmin
sudo apt-get install sasl2-bin
The LDAP pieces (slapd and phpldapadmin) are optional, since you could also just use cyrus with a SASL password database. I personally prefer LDAP, since I use it in other places.
If you already have LDAP set up, you can skip the following steps; just make sure you have a user that will act as the cyrus administrator. If you don't, read on.
Once LDAP is set up, the next step is to create a cyrus admin user and an entry for each user who wants to back up a GMail account. I favor phpldapadmin for that.
- Log in to phpldapadmin by pointing your browser to http://localhost/phpldapadmin/ (or whatever your host is) and entering the LDAP credentials you set up
- If you followed the instructions you should see dc=home,dc=com after logging into phpldapadmin. Click the '+' to the left and it should expand, revealing the uid=admin, which is the admin user
- Beneath it, click Create new entry here
- Pick the Organizational Unit
- On the next screen, just enter people
- Click Create Object
- On the left you should now see your new ou=people, expand it by clicking the '+'
- Click on Create new entry here
- This time select Simple Security Object
- Enter cyrus as the user and enter a password
- Click Create Object
- Now click on the same Create new entry here again
- Once again select Simple Security Object
- Enter a user name and password for the user planning to back up GMail
- Click Create Object
Now you have the pieces in place for authentication to take place.
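Before moving on, it can be worth checking that the new entries really accept a login. Here is a small sketch of how I would do that with ldapwhoami (from the ldap-utils package). It assumes the entries ended up as uid=NAME,ou=people,dc=home,dc=com and that slapd listens on localhost; both depend on your setup, and ldap_user_dn and check_ldap_login are just helper names I made up.

```shell
# Assumed base DN from the steps above; override it if yours differs.
LDAP_BASE="${LDAP_BASE:-dc=home,dc=com}"

# Build the DN of an entry under ou=people.
# Assumption: the entry's RDN is uid=<name>.
ldap_user_dn() {
    printf 'uid=%s,ou=people,%s\n' "$1" "$LDAP_BASE"
}

# Try a simple bind as the given user; -W prompts for the password.
# Assumption: slapd is reachable at ldap://localhost.
check_ldap_login() {
    ldapwhoami -x -H ldap://localhost -D "$(ldap_user_dn "$1")" -W
}
```

Running check_ldap_login cyrus should report the DN back if the password is accepted, and fail with an error if it is not.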
Now it's time to set up Cyrus. Do this by making sure you have the following three configuration settings in /etc/imapd.conf:
admins: cyrus
sasl_mech_list: PLAIN
sasl_pwcheck_method: saslauthd
The options should already be there, just with the wrong values.
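If you want to double check the edit, something like the following sketch compares the three settings against the values above. The helper names conf_value and check_imapd_conf are my own, and the parsing is deliberately simple: it only looks at uncommented "key: value" lines and, if a key is repeated, reports the last one.

```shell
# Print the value of a "key: value" setting from an imapd.conf-style file.
# Commented-out lines never match because the key must start the line.
# $1 = file, $2 = key
conf_value() {
    sed -n "s/^$2:[[:space:]]*//p" "$1" | sed 's/[[:space:]]*$//' | tail -n 1
}

# Succeed only if all three settings have the values listed above.
# $1 = path to the config file (defaults to /etc/imapd.conf)
check_imapd_conf() {
    conf="${1:-/etc/imapd.conf}"
    [ "$(conf_value "$conf" admins)" = "cyrus" ] &&
    [ "$(conf_value "$conf" sasl_mech_list)" = "PLAIN" ] &&
    [ "$(conf_value "$conf" sasl_pwcheck_method)" = "saslauthd" ]
}
```

For example, check_imapd_conf && echo "imapd.conf looks right" prints the message only when all three values match.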
Now restart cyrus by running
The backup script
The next step is to run the backup. I suggest you save the following in a file and edit it to your needs.
#!/bin/sh
# Copy GMail (host1) into the local Cyrus server (host2).
# pass1.txt holds the GMail password, pass2.txt the local IMAP password.
imapsync \
    --ssl1 \
    --host1 imap.gmail.com \
    --port1 993 \
    --user1 GMAIL_USER@gmail.com \
    --passfile1 ./pass1.txt \
    --split1 100 \
    --authmech1 LOGIN \
    --host2 localhost \
    --user2 IMAP_USER \
    --passfile2 ./pass2.txt \
    --prefix2 INBOX.GMailBackup. \
    --split2 100 \
    --ssl2 \
    --authmech2 LOGIN \
    --regextrans2 's/\[Gmail\]/Gmail/' \
    --delete2 \
    --syncinternaldates \
    --exclude "All Mail|Spam|Trash" \
    --allowsizemismatch \
    --useheader 'Message-Id' \
    --skipsize
Notice that imapsync will look for passwords for GMail and IMAP in text files. You'll also have to adjust your usernames (both for GMail and IMAP).
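Since those files hold plain-text passwords, it makes sense to keep them readable by your own user only. A minimal sketch, with write_passfile being a helper name I made up:

```shell
# Write a password to a file that only the current user can read.
# $1 = file name, $2 = password
write_passfile() {
    (
        umask 077                  # files created in this subshell get mode 600
        rm -f "$1"                 # drop any old copy with looser permissions
        printf '%s\n' "$2" > "$1"
    )
}
```

You would call it once per file named by --passfile1 and --passfile2, for example write_passfile ./pass2.txt 'your IMAP password'. Keep in mind that a password typed on the command line can end up in your shell history, so creating the file in an editor and running chmod 600 on it works just as well.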
Also, if you want to put the backup somewhere other than a folder named GMailBackup, you can change that too by editing the --prefix2 value.
Then make the script executable and run it. Depending on how much email you have it will take a while.
Once you're happy you can create a cron job to run this automatically as needed.
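As a rough sketch of what that cron job could look like: the helper below prints a crontab entry that runs the backup every night at 02:30. The directory, the log file and the script name backup.sh are all placeholders I picked, so adjust them to wherever you saved things. The entry changes into the directory first because the script refers to its password files with relative paths.

```shell
# Hypothetical locations; adjust to wherever the script and its
# password files actually live.
BACKUP_DIR="${BACKUP_DIR:-$HOME/gmail-backup}"
BACKUP_LOG="${BACKUP_LOG:-$HOME/gmail-backup.log}"

# Print a crontab entry that runs the backup nightly at 02:30 and
# appends everything the script prints to the log file.
backup_cron_line() {
    printf '30 2 * * * cd %s && ./backup.sh >> %s 2>&1\n' "$BACKUP_DIR" "$BACKUP_LOG"
}
```

To install it next to whatever is already in your crontab, you could run (crontab -l 2>/dev/null; backup_cron_line) | crontab -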
Keep it tidy
With GMail's ability to apply multiple labels to each message, a message can end up stored more than once, because every label shows up as its own IMAP folder. That's just plain untidy. Fortunately, cyrus stores each message as an individual file, so we can take advantage of the UNIX magic of hard links, which allow multiple file names to point at the same data on disk.
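If hard links are new to you, here is a tiny self-contained demonstration. It works in a throwaway temp directory, and the file names are made up to look like two copies of one message sitting in different folders.

```shell
# Print the inode number of a file.
inode_of() {
    ls -i "$1" | awk '{print $1}'
}

# Succeed if both names refer to the same data on disk (same inode).
# Both paths are assumed to be on the same filesystem.
same_file() {
    [ "$(inode_of "$1")" = "$(inode_of "$2")" ]
}

# Create one file, give it a second name with ln, and check the result.
hardlink_demo() {
    dir=$(mktemp -d) || return 1
    echo "same message, two labels" > "$dir/inbox-1."
    ln "$dir/inbox-1." "$dir/label-7."     # second name, no second copy
    if same_file "$dir/inbox-1." "$dir/label-7."; then
        echo "one copy on disk, two names"
    fi
    rm -rf "$dir"
}
```

Calling hardlink_demo prints "one copy on disk, two names". Deleting one of the names leaves the other intact; the data only goes away when the last name is removed.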
fslint offers a command line tool to merge duplicate files. What's even better is that it will do this for all duplicate files across all users. The following command will take care of this:
sudo /usr/share/fslint/fslint/findup -m /var/spool/cyrus/mail
This also lends itself to being run via cron.
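To get a feel for how much was merged, you can count the files that now have more than one hard link. This is only a sketch; count_linked_files is a name I made up, and it counts every file under the directory, not just messages.

```shell
# Count files under a directory that have more than one hard link,
# i.e. files that share their data with at least one other name.
# $1 = directory (defaults to the cyrus spool path used above)
count_linked_files() {
    find "${1:-/var/spool/cyrus/mail}" -type f -links +1 | wc -l | tr -d ' '
}
```

Run it with sudo-level access before and after findup, since the spool is normally not readable by regular users; the number should go up after a merge.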
If all went well, you can now sleep soundly knowing that if Google ever pulls the plug on free email, you have your own paranoia copy sitting around.