Dev, Manage, Deploy

...and things in between...


Using rdiff for small mysqldumps

I’ve been pursuing the idea of using rdiff or something similar to manage SQL dumps more effectively. Several of our customers are required to keep backups going a long way back in time.

As I like to keep things simple, a small wrapper around rdiff together with Amazon S3 for storage seems simple and fairly robust. It's still at an early stage and needs more testing, and things might need adjusting, but since it's kept simple the changes hopefully shouldn't be large.

A quick and simple test on a test database shows the following: a full SQL dump is 1.3GB, and gzipped it's around 176MB. Below is the directory after an example patch was created:

drwxr-xr-x   3 eric eric 4,0K 10 okt 23.30 .
drwxr-xr-x 175 eric eric  12K  8 okt 20.22 ..
-rw-r--r--   1 eric eric 7,6M 10 okt 23.06 full_20111010_230629.signature
-rw-r--r--   1 eric eric 1,3G 10 okt 23.06 full_20111010_230629.sql
-rw-r--r--   1 eric eric   11 10 okt 23.30 patch_20111010_233033_to_full_20111010_230629.rdiff
-rwxr-xr-x   1 eric eric 2,9K 10 okt 22.53 rdiff_mysqldump
-rw-r--r--   1 eric eric   21 10 okt 23.06 .rdiff_mysqldump_lastbase
-rw-r--r--   1 eric eric    2 10 okt 23.30 .rdiff_patches_since_full
-rw-r--r--   1 eric eric 1,1K 10 okt 23.11 README

The rdiff patch file came out at 11 bytes on this one. That's not much, but it's not really a fair comparison either, since it covers no more than 2 hours of changes. I'll redo it against a later backup to see the differences better.
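For reference, here is a minimal sketch of the three rdiff steps this kind of setup relies on: create a signature of the full dump, create a delta (patch) from that signature plus a newer dump, and later apply the patch to the full dump to restore. The file names mirror the listing above. The function names are hypothetical and are not taken from rdiff_mysqldump, and the code only builds the commands rather than running them, so treat it as an illustration and not as the wrapper's actual logic.

```python
import shlex
from datetime import datetime

STAMP = "%Y%m%d_%H%M%S"  # timestamp layout seen in the directory listing


def full_names(when):
    """Return (dump, signature) file names for a full dump taken at `when`."""
    base = "full_" + when.strftime(STAMP)
    return base + ".sql", base + ".signature"


def patch_name(when, full_dump):
    """Name a patch after its own timestamp and the full dump it applies to."""
    base = full_dump[:-len(".sql")]
    return "patch_%s_to_%s.rdiff" % (when.strftime(STAMP), base)


def signature_cmd(full_dump, signature):
    # rdiff signature BASIS SIGNATURE
    return ["rdiff", "signature", full_dump, signature]


def delta_cmd(signature, new_dump, patch):
    # rdiff delta SIGNATURE NEWFILE DELTA
    # Only the signature is needed here, not the full dump itself.
    return ["rdiff", "delta", signature, new_dump, patch]


def restore_cmd(full_dump, patch, restored):
    # rdiff patch BASIS DELTA NEWFILE
    return ["rdiff", "patch", full_dump, patch, restored]


if __name__ == "__main__":
    full, sig = full_names(datetime(2011, 10, 10, 23, 6, 29))
    patch = patch_name(datetime(2011, 10, 10, 23, 30, 33), full)
    # "new_dump.sql" and "restored.sql" are placeholder names for this demo.
    for cmd in (signature_cmd(full, sig),
                delta_cmd(sig, "new_dump.sql", patch),
                restore_cmd(full, patch, "restored.sql")):
        print(" ".join(shlex.quote(part) for part in cmd))
```

Since the delta step reads only the signature (7,6M in the listing) and the new dump, the 1,3G full dump does not have to stay on the local disk between runs; it is only needed again at restore time.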

Get the source on GitHub: https://github.com/perssontm/rdiff_mysqldump

Good quick examples of rdiff can be found at http://beerpla.net/2008/05/12/a-better-diff-or-what-to-do-when-gnu-diff-runs-out-of-memory-diff-memory-exhausted/ 

I also came across this post, which attempts to shorten the lock time when dumping SQLite databases, something made possible by rdiff's signature files. Quite interesting: http://ts1-en.blogspot.com/2009/06/backup-sqlite-database-with-rdiff.html