Syncing large file libraries over the Internet

With the demise of FolderShare, I’ve found it rather difficult to keep my music library synchronized between home and the office. While there is software you can buy like SuperSync, you can pretty easily achieve a more general solution using free software, with the added benefit of universal OS support (e.g. Windows, Mac OS X, and Linux).

First, for connectivity between private LANs across the Internet, Hamachi works very well (and is free for non-commercial use). It’s supported on Windows and OS X, and in beta for Linux and Windows Mobile. Just install the client on each machine, and you’ll have a virtual IP address that you can use to securely tunnel over the Internet to other machines in your mesh using any application. I created a LogMeIn account to use Managed mode which (obviously) makes management easier, but Unmanaged mode works as well.

Next, you can use rsync to efficiently copy and synchronize files through the Hamachi tunnel. It’s usually already installed on Linux and OS X, and can be easily installed on Windows using the Cygwin installer.

While Hamachi makes it easy to browse Windows shares (assuming you’re running Windows) via the tunnel, or even rsync them using a UNC path, the SMB protocol is glacially slow. Therefore, you’ll want to set up the rsync daemon on the machine containing your authoritative version. (While you can get two-way synchronization by running rsync in each direction, it is much less risky, and less mentally taxing, to treat one copy as authoritative and access it using a read-only rsync share.) To configure the daemon, you’ll need to create a file called /etc/rsyncd.conf that specifies global and per-library options. On Windows, to share a music library in M:\Music, the file would look like this:

use chroot = false
strict modes = false

[Music]
path = /cygdrive/M/Music
read only = true
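
If you have more than one library to share, you could generate the file instead of typing it. Here’s a minimal sketch, assuming a POSIX shell such as the one Cygwin provides. The names to_cygdrive and print_rsyncd_conf are hypothetical helpers of my own, not part of rsync or Cygwin, and on a real Cygwin install cygpath -u is the proper tool for the path conversion.

```shell
# Convert a Windows path such as M:\Music to Cygwin's /cygdrive form.
# (Hypothetical helper; it only handles simple drive-letter paths.)
to_cygdrive() {
    printf '%s\n' "$1" | sed -e 's|\\|/|g' -e 's|^\([A-Za-z]\):|/cygdrive/\1|'
}

# Print a minimal read-only rsyncd.conf for a single module.
# $1 = module name, $2 = Windows path of the library
print_rsyncd_conf() {
    printf 'use chroot = false\nstrict modes = false\n\n'
    printf '[%s]\npath = %s\nread only = true\n' "$1" "$(to_cygdrive "$2")"
}

# Usage (review the output before overwriting an existing file):
#   print_rsyncd_conf Music 'M:\Music' > /etc/rsyncd.conf
```

Printing to standard output rather than writing the file directly means you can eyeball the result first.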

Then you can fire up the rsync daemon simply by running rsync --daemon from a shell, command prompt, Run dialog (Windows+R), etc. rsync won’t output anything, but you can verify in your process/task manager that it’s running. If you want the rsync daemon to run automatically as a Windows service, that can be done using cygrunsrv, but you’ll have to refer elsewhere for instructions.
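
A quicker check than hunting through the task manager is to ask the daemon what it exports: an rsync:// URL with no module name makes the daemon list its modules (unless a module is hidden with list = false). The sketch below wraps that in a function; list_modules is a name I made up, and the address in the usage comment is a placeholder, not a real Hamachi IP.

```shell
# List the modules exported by an rsync daemon.
# $1 = host (optional); defaults to the local machine.
list_modules() {
    rsync "rsync://${1:-localhost}/"
}

# Usage:
#   list_modules            # on the server; the Music module should be listed
#   list_modules 25.0.0.1   # from a client; placeholder Hamachi address
```

If the client-side call hangs or is refused, the likely suspects are the Hamachi tunnel or a firewall blocking the daemon’s port (873 by default).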

Finally, to actually sync the library to a client, you’ll need to enter a slightly gnarly rsync command on the client (the difficulty of remembering it is what inspired this blog post):

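rsync -rtOh --chmod=ugo=rwX --ignore-existing --delete --progress rsync://<server IP from Hamachi client>/Music/ /cygdrive/c/Music

This command will copy everything in the Music module (defined above as M:\Music) to C:\Music, deleting anything in C:\Music that doesn’t exist in M:\Music. Note that the source is the module directory itself (trailing slash, no wildcard), because --delete only applies to directories that rsync transfers whole; a source like Music/* would leave stale files at the top level of C:\Music. Refer to the rsync man page for details on these options and others, but here’s a brief explanation: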

  • -r, --recursive: recurse into directories
  • -t, --times: preserve modification times (to aid future synchronization)
  • -O, --omit-dir-times: omit directories from --times
  • -h, --human-readable: output numbers in a human-readable format (we’re humans, right?)
  • --chmod=ugo=rwX: give new files the destination-default permissions (otherwise they’ll end up with screwy permissions)
  • --ignore-existing: skip updating files that already exist on the destination (unless you want things like MP3 tag changes to trigger lots of delta transfers)
  • --delete: delete extraneous files from the receiving side (to avoid accumulating duplicates when files are renamed on the sender)
  • --progress: print information showing the progress of the transfer (since it will probably take a long time to sync your whole media library)

Note: For your first attempt, you’ll probably want to also include the -n option, which executes a dry-run, so you don’t accidentally end up deleting files you didn’t intend to.
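
Since the whole point is not having to remember the command, one option is to wrap it in a small function that does a dry run unless told otherwise. This is only a sketch under a few assumptions: sync_music and its --go argument are names I invented, the module is called Music as above, and the source is given as the module directory with a trailing slash rather than a wildcard, because rsync’s --delete only acts on directories it transfers whole.

```shell
# Sync the Music module from an rsync daemon into a local directory.
# $1 = server address, $2 = destination directory,
# $3 = "--go" to really apply changes (hypothetical switch, not an rsync flag).
# Without "--go", -n is added so rsync only reports what it would do.
sync_music() {
    host=$1 dest=$2 mode=${3:-}
    set -- -rtOh --chmod=ugo=rwX --ignore-existing --delete --progress
    if [ "$mode" != "--go" ]; then
        set -- -n "$@"
    fi
    rsync "$@" "rsync://$host/Music/" "$dest"
}

# Usage (the address is a placeholder for your server's Hamachi IP):
#   sync_music 25.0.0.1 /cygdrive/c/Music        # dry run
#   sync_music 25.0.0.1 /cygdrive/c/Music --go   # for real
```

Making the destructive run the one that needs an extra argument means a half-remembered invocation can’t delete anything.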
