{"id":20,"date":"2010-10-19T14:33:17","date_gmt":"2010-10-19T14:33:17","guid":{"rendered":"https:\/\/kristau.net\/blog\/?p=20"},"modified":"2010-10-19T14:33:17","modified_gmt":"2010-10-19T14:33:17","slug":"using-rsync-to-find-the-size-of-changed-data-between-two-directories","status":"publish","type":"post","link":"https:\/\/kristau.net\/blog\/20\/","title":{"rendered":"Using rsync to find the size of changed data between two directories"},"content":{"rendered":"<p>OK, so I searched Google but couldn\u2019t find the magic combination anywhere. Hopefully, this post will help you!<\/p>\n<p>The setup: I wanted to compare the contents of two directories which had previously been synchronized via rsync without actually synchronizing them. The main goal was to find out the total size of the data which would need to be transferred so I could estimate how long the actual rsync run would take. To do this, you\u2019d think the following would work, based on the rsync man pages:<\/p>\n<pre>rsync -avvni sourcedir\/ destdir\/\n<\/pre>\n<p>Broken down that is:<\/p>\n<ul>\n<li>-a archive meta-option<\/li>\n<li>-vv extra verbosity<\/li>\n<li>-n dry run<\/li>\n<li>-i itemize changes<\/li>\n<\/ul>\n<p>The output, however, lists \u201ctotal size\u201d as the total size of all the files \u2014 <strong><span class=\"caps\">NOT<\/span> just the size of the changed files which would be synchronized<\/strong>. 
So I did some research using the <a href=\"http:\/\/www.samba.org\/ftp\/rsync\/rsync.html\">rsync man page<\/a>, tested several option combinations, and came up with the following solution:<\/p>\n<pre>rsync -an --stats sourcedir\/ destdir\/\n<\/pre>\n<p>Here\u2019s a mock sample output from running that command:<\/p>\n<pre>Number of files: 2\nNumber of files transferred: 1\nTotal file size: 4096 bytes\nTotal transferred file size: 2048 bytes\nLiteral data: 0 bytes\nMatched data: 0 bytes\nFile list size: 82\nFile list generation time: 0.013 seconds\nFile list transfer time: 0.000 seconds\nTotal bytes sent: 110\nTotal bytes received: 32\n<\/pre>\n<pre>sent 110 bytes  received 32 bytes  284.00 bytes\/sec\ntotal size is 4096  speedup is 1.23\n<\/pre>\n<p>The particular stats you\u2019ll need to parse are the following:<\/p>\n<ul>\n<li>Total file size: (given in bytes)<\/li>\n<li>Total transferred file size: (also in bytes, this is the size of the changed data to be transferred)<\/li>\n<\/ul>\n<p>You can ignore <em>Total bytes sent<\/em> and <em>Total bytes received<\/em> as they refer only to the data actually exchanged by the rsync processes. In a dry run (-n option) this amounts to only the communication data exchanged by the rsync processes, not file contents.<\/p>\n<p>Also of interest are the <em>Number of files<\/em> and <em>Number of files transferred<\/em> statistics. It is also worth noting that the trailing slash on the source directory is important. If you leave it out, what you are actually testing is the copying of <em>sourcedir<\/em> to <em>destdir\/sourcedir<\/em>, which is probably not what you want if you are trying to compare their contents.<\/p>\n<p>If this post was helpful to you, please spread the word and share it with others!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>OK, so I searched Google but couldn\u2019t find the magic combination anywhere. Hopefully, this post will help you! 
The setup: I wanted to compare the contents of two directories which had previously been synchronized via rsync without actually synchronizing them. The main goal was to find out the total size of the data which would [&hellip;]<\/p>\n","protected":false},"author":3,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[12,7],"tags":[97,165,195],"class_list":["post-20","post","type-post","status-publish","format-standard","hentry","category-linux","category-technology","tag-hints-tips-tricks","tag-rsync","tag-technology-2"],"_links":{"self":[{"href":"https:\/\/kristau.net\/blog\/wp-json\/wp\/v2\/posts\/20","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/kristau.net\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/kristau.net\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/kristau.net\/blog\/wp-json\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/kristau.net\/blog\/wp-json\/wp\/v2\/comments?post=20"}],"version-history":[{"count":0,"href":"https:\/\/kristau.net\/blog\/wp-json\/wp\/v2\/posts\/20\/revisions"}],"wp:attachment":[{"href":"https:\/\/kristau.net\/blog\/wp-json\/wp\/v2\/media?parent=20"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/kristau.net\/blog\/wp-json\/wp\/v2\/categories?post=20"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/kristau.net\/blog\/wp-json\/wp\/v2\/tags?post=20"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}