Using rsync to find the size of changed data between two directories

OK, so I searched Google but couldn’t find the magic combination anywhere. Hopefully, this post will help you!

The setup: I wanted to compare the contents of two directories which had previously been synchronized via rsync without actually synchronizing them. The main goal was to find out the total size of the data which would need to be transferred so I could estimate how long the actual rsync run would take. To do this, you’d think the following would work, based on the rsync man pages:

rsync -avvni sourcedir/ destdir/

Broken down that is:

  • -a archive meta-option
  • -vv extra verbosity
  • -n dry run
  • -i itemize changes

The output, however, lists “total size” as the total size of all the files — NOT just the size of the changed files which would be synchronized. So I did some research using the rsync man page and some testing with several options combinations and came up with the following solution:

rsync -an --stats sourcedir/ destdir/

Here’s a mock sample output from running that command:

Number of files: 2
Number of files transferred: 1
Total file size: 4096 bytes
Total transferred file size: 2048 bytes
Literal data: 0 bytes
Matched data: 0 bytes
File list size: 82
File list generation time: 0.013 seconds
File list transfer time: 0.000 seconds
Total bytes sent: 110
Total bytes received: 32
sent 110 bytes  received 32 bytes  284.00 bytes/sec
total size is 4096  speedup is 1.23

The particular stats you’ll need to parse are the following:

  • Total file size: (given in bytes)
  • Total transferred file size: (also in bytes, this is the changed data to be transfered)

You can ignore Total bytes sent and Total bytes received as they only refer to the actual data transferred by the rsync process. In a dry run (-n option) this amounts to only the communication data exchanged by the rsync processes.

Also of interest are the Number of files and Number of files transferred statistics. It is also worth noting that the trailing slashes on the directories are important. If you leave them out, what you are actually testing is the copying of sourcedir to destdir/sourcedir which is probably not what you want to do if you are trying to compare their contents.

If this post was helpful to you, please spread the word and share it with others!