From bcpierce at cis.upenn.edu Fri Dec 9 14:48:49 2005
From: bcpierce at cis.upenn.edu (bcpierce@cis.upenn.edu)
Date: Fri, 9 Dec 2005 14:48:49 -0500
Subject: [Unison-hackers] [unison-svn] r110 - trunk/src
Message-ID: <200512091948.jB9Jmnwb021625@canfield.cis.upenn.edu>

Author: bcpierce
Date: 2005-12-09 14:48:46 -0500 (Fri, 09 Dec 2005)
New Revision: 110

Modified:
   trunk/src/RECENTNEWS
   trunk/src/copy.ml
   trunk/src/globals.ml
   trunk/src/globals.mli
   trunk/src/mkProjectInfo.ml
   trunk/src/transport.ml
   trunk/src/uitext.ml
   trunk/src/update.ml
   trunk/src/uutil.ml
   trunk/src/uutil.mli
Log:
* Added some status messages recording when the xferbycopying hint is
  actually used
* Added version number to log header
* The confirmation of "big deletes" is now controlled by a flag,
  -confirmbigdeletes. Default is true, which gives the same behavior as
  previously.

From bcpierce at cis.upenn.edu Fri Dec 9 14:50:10 2005
From: bcpierce at cis.upenn.edu (bcpierce@cis.upenn.edu)
Date: Fri, 9 Dec 2005 14:50:10 -0500
Subject: [Unison-hackers] [unison-svn] r111 - trunk/src
Message-ID: <200512091950.jB9JoAk0021649@canfield.cis.upenn.edu>

Author: bcpierce
Date: 2005-12-09 14:50:09 -0500 (Fri, 09 Dec 2005)
New Revision: 111

Modified:
   trunk/src/RECENTNEWS
   trunk/src/mkProjectInfo.ml
Log:
* Bump minor version number

From geoffw at cis.upenn.edu Mon Dec 12 08:17:29 2005
From: geoffw at cis.upenn.edu (Geoffrey Alan Washburn)
Date: Mon, 12 Dec 2005 08:17:29 -0500
Subject: [Unison-hackers] unison more memory efficient than rsync?
Message-ID:

I've been using rsync for uni-directional backups for a while, but
really it has been rather suboptimal because it seems to hold quite a
bit of data in memory when processing a large disk with lots of files.
Has anyone done any comparisons with how unison performs on
uni-directional synchronization memory-wise? I'm not too concerned
about speed, but rsync chews up so much memory that it causes my iBook
to thrash until finished.

Of course, unison doesn't handle extended attributes yet, but I'm
working on that.

From bcpierce at cis.upenn.edu Mon Dec 12 12:06:18 2005
From: bcpierce at cis.upenn.edu (Benjamin Pierce)
Date: Mon, 12 Dec 2005 12:06:18 -0500
Subject: [Unison-hackers] unison more memory efficient than rsync?
In-Reply-To:
References:
Message-ID:

I regularly synchronize 120GB disks with, among other things, lots
and lots of mail files. Update detection is a bit slower than I'd
like because of all the inodes to be scanned, but not too bad.
Others have reported success with much larger replicas.

Unison's overall memory usage scales more or less linearly with the
number of files -- but with a fairly small constant factor. Its peak
memory usage scales linearly with the size of the largest individual
file. There have been some reports of problems with really gigantic
(multi-gig) individual files.

I don't know whether the current rsync implementation calculates
block fingerprints only within each file (as Unison does), or across
a whole filesystem. If it's just per file, then the memory usage may
be comparable, since the core algorithms are very similar.

Regards,

    - Benjamin

On Dec 12, 2005, at 8:17 AM, Geoffrey Alan Washburn wrote:

> I've been using rsync for uni-directional backups for a while, but
> really it has been rather suboptimal because it seems to hold quite a
> bit of data in memory when processing a large disk with lots of files.
> Has anyone done any comparisons with how unison performs on
> uni-directional synchronization memory-wise? I'm not too concerned
> about speed, but rsync chews up so much memory that it causes my iBook
> to thrash until finished.
>
> Of course, unison doesn't handle extended attributes yet, but I'm
> working on that.
>
> _______________________________________________
> Unison-hackers mailing list
> Unison-hackers at lists.seas.upenn.edu
> http://lists.seas.upenn.edu/mailman/listinfo/unison-hackers

From geoffw at cis.upenn.edu Mon Dec 12 13:01:34 2005
From: geoffw at cis.upenn.edu (Geoffrey Alan Washburn)
Date: Mon, 12 Dec 2005 13:01:34 -0500
Subject: [Unison-hackers] unison more memory efficient than rsync?
In-Reply-To:
References:
Message-ID: <439DBAFE.9060203@cis.upenn.edu>

Benjamin Pierce wrote:
> I regularly synchronize 120GB disks with, among other things, lots
> and lots of mail files. Update detection is a bit slower than I'd
> like because of all the inodes to be scanned, but not too bad.
> Others have reported success with much larger replicas.
>
> Unison's overall memory usage scales more or less linearly with the
> number of files -- but with a fairly small constant factor. Its peak
> memory usage scales linearly with the size of the largest individual
> file. There have been some reports of problems with really gigantic
> (multi-gig) individual files.
> I don't know whether the current rsync implementation calculates
> block fingerprints only within each file (as Unison does), or across
> a whole filesystem. If it's just per file, then the memory usage may
> be comparable, since the core algorithms are very similar.

Well, there is the "rsync transfer algorithm" and there is the rsync
tool, and I'm pretty sure the problem is in the tool and not the core
algorithm. When I use the rsync tool to do a backup, it scans the entire
disk first and then goes about deleting/transferring. At that point the
rsync process has begun to consume a few hundred megabytes. As far as
I've been able to tell there is no command-line switch to make it not
scan the entire disk first (for whatever it is doing).

So I guess maybe a better question to ask, is when synchronizing those
120GB disks, have you observed how much memory unison uses at a maximum?

The only real problem at the moment is that Apple's version of rsync
understands extended attributes, and Unison doesn't, but I'm hoping to
just sit down soon and finish writing my OCaml extended attributes
library. At that point I'm hoping someone with more experience with the
Unison internals might be able to use the library.

From bcpierce at cis.upenn.edu Mon Dec 12 13:30:23 2005
From: bcpierce at cis.upenn.edu (Benjamin Pierce)
Date: Mon, 12 Dec 2005 13:30:23 -0500
Subject: [Unison-hackers] unison more memory efficient than rsync?
In-Reply-To: <439DBAFE.9060203@cis.upenn.edu>
References: <439DBAFE.9060203@cis.upenn.edu>
Message-ID: <439DC1BF.8070705@cis.upenn.edu>

> So I guess maybe a better question to ask, is when synchronizing those
> 120GB disks, have you observed how much memory unison uses at a maximum?

No. What's the easiest way to do this on OSX? (But actually I'm not
the person to ask, necessarily, since my syncs tend to be fairly modest
in the amount of changed data and my largest individual files are not
all that huge. Perhaps some power users can also post some numbers.)

    - Benjamin

From geoffw at cis.upenn.edu Mon Dec 12 14:12:46 2005
From: geoffw at cis.upenn.edu (Geoffrey Alan Washburn)
Date: Mon, 12 Dec 2005 14:12:46 -0500
Subject: [Unison-hackers] unison more memory efficient than rsync?
In-Reply-To: <439DC1BF.8070705@cis.upenn.edu>
References: <439DBAFE.9060203@cis.upenn.edu> <439DC1BF.8070705@cis.upenn.edu>
Message-ID: <439DCBAE.6030607@cis.upenn.edu>

Benjamin Pierce wrote:
>> So I guess maybe a better question to ask, is when synchronizing those
>> 120GB disks, have you observed how much memory unison uses at a maximum?
>
> No. What's the easiest way to do this on OSX?

I think the program Activity Monitor.app will at least tell you how
much memory it is using at a given moment, as will "top" at the
command-line. There are probably tools to "graph" memory usage of a
specific process, but I'm not sure whether Activity Monitor.app will do
this (I don't have a Mac handy at the moment) and I don't know what the
appropriate command-line tools for this might be.

> (But actually I'm not
> the person to ask, necessarily, since my syncs tend to be fairly modest
> in the amount of changed data and my largest individual files are not
> all that huge. Perhaps some power users can also post some numbers.)

That's the thing, I'd like to think my needs are not particularly
aggressive or unusual. Generally it isn't that much data that is
changing when I do a backup, probably less than ten or twenty megabytes
(that's because I don't bother to exclude things like my browser cache),
unless perhaps I just downloaded a set of photos from my camera.

From wmertens.spm at advalvas.be Tue Dec 13 12:29:16 2005
From: wmertens.spm at advalvas.be (Wout Mertens)
Date: Tue, 13 Dec 2005 18:29:16 +0100
Subject: [Unison-hackers] mac patches
In-Reply-To:
References:
Message-ID: <129B06BA-E2D4-494D-A5D8-14B8ABE95449@advalvas.be>

The way I usually hack around this situation is by using the
SSH_ASKPASS environment variable. If you have that and DISPLAY set, ssh
will run the program in SSH_ASKPASS to request the password.

This way, you have programmatic control over the password requesting,
instead of having to handle every tiny variation of the password/phrase
requests.

You can cache the password for subsequent connects, or even
automatically install a public key on the remote server with a separate
ssh connection. I have example scripts for these situations.

Cheers,

Wout.

On 18 Sep 2005, at 16:16, Ben Willmore wrote:

> Hi,
>
> Here are 3 patches against current SVN unison that fix the following
> problems with the mac UI.
>
> 1. tiger-ssh-prompt-patch.20050916
> To work with a variety of systems, terminal.ml needs to recognize
> _both_ 'Password:' and 'Password: '. This patch provides an
> appropriate regex. [This was wrongly fixed between 2.13.16 and
> current SVN.]
>
> 2. uimac-passphrase-prompt-nobin.patch.20050916
> Extends password UI so it will request an ssh passphrase if necessary,
> allowing use of passphrase-protected ssh keys.
> Note this only patches
> the ascii files -- for it to work, you need to also alter the .nib
> file: Make a new outlet for MyController, called passwordPrompt, of
> type NSTextField. Connect this to the text 'Please enter your
> password' in the Password window.
>
> 3. uimac-password-sheet.patch.20050916
> Fixes bug where providing a profile name on the command line confuses
> the password window (it is detached from the main window), and leads
> to a crash on a second sync.
>
> One last thing -- the UI opens halfway off-screen for me in the
> current SVN version.
>
> Cheers
>
> Ben

From wmertens.spm at advalvas.be Tue Dec 13 12:46:26 2005
From: wmertens.spm at advalvas.be (Wout Mertens)
Date: Tue, 13 Dec 2005 18:46:26 +0100
Subject: [Unison-hackers] unison more memory efficient than rsync?
In-Reply-To: <439DBAFE.9060203@cis.upenn.edu>
References: <439DBAFE.9060203@cis.upenn.edu>
Message-ID: <80A182D9-BA10-495C-A037-545DA50C4520@advalvas.be>

On 12 Dec 2005, at 19:01, Geoffrey Alan Washburn wrote:

> Well, there is the "rsync transfer algorithm" and there is the rsync
> tool, and I'm pretty sure the problem is in the tool and not the core
> algorithm. When I use the rsync tool to do a backup, it scans the entire
> disk first and then goes about deleting/transferring. At
> that point the rsync process has begun to consume a few hundred
> megabytes. As far as I've been able to tell there is no command-line
> switch to make it not scan the entire disk first (for whatever it is
> doing).

You can always do the rsync a directory at a time. E.g. if you had

# rsync -a / target:/backup

instead do

# find / -mindepth 1 -maxdepth 1 -exec rsync -aR {} target:/backup \;

or even

# find / \! -type d -mindepth 1 -maxdepth 1 -exec rsync -aR {} target:/backup \;
# find / -mindepth 2 -maxdepth 2 -exec rsync -aR {} target:/backup \;

Wout.

From geoffw at cis.upenn.edu Tue Dec 13 13:13:25 2005
From: geoffw at cis.upenn.edu (Geoffrey Alan Washburn)
Date: Tue, 13 Dec 2005 13:13:25 -0500
Subject: [Unison-hackers] unison more memory efficient than rsync?
In-Reply-To: <80A182D9-BA10-495C-A037-545DA50C4520@advalvas.be>
References: <439DBAFE.9060203@cis.upenn.edu> <80A182D9-BA10-495C-A037-545DA50C4520@advalvas.be>
Message-ID: <439F0F45.9040904@cis.upenn.edu>

Wout Mertens wrote:
> You can always do the rsync a directory at a time. E.g. if you had
>
> # rsync -a / target:/backup
>
> instead do
>
> # find / -mindepth 1 -maxdepth 1 -exec rsync -aR {} target:/backup \;
>
> or even
>
> # find / \! -type d -mindepth 1 -maxdepth 1 -exec rsync -aR {} target:/backup \;
> # find / -mindepth 2 -maxdepth 2 -exec rsync -aR {} target:/backup \;

Ah, not a bad idea. I'll have to experiment with that a little. Thanks!
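
A minimal sketch of the directory-at-a-time idea from the thread, wrapped
in a shell function. The function name sync_per_entry and the SYNC_CMD
override are made up for illustration and are not rsync features;
target:/backup is the placeholder destination used in the messages above.
Whether this actually lowers rsync's peak memory use has not been
measured here.

```shell
#!/bin/sh
# Sketch: run one rsync per top-level entry of a source tree, so that
# each rsync run only has to scan a single subtree.

sync_per_entry() {
    src=$1     # source directory, e.g. /
    dest=$2    # destination, e.g. target:/backup (placeholder host)
    # SYNC_CMD is a hypothetical override for dry runs. It is left
    # unquoted on purpose so that "echo rsync" splits into two words.
    find "$src" -mindepth 1 -maxdepth 1 \
        -exec ${SYNC_CMD:-rsync} -aR {} "$dest" \;
}
```

Setting SYNC_CMD="echo rsync" before the call, as in
SYNC_CMD="echo rsync" sync_per_entry / target:/backup, only prints the
rsync commands that would be run, which is a cheap way to check the
expansion before touching the backup.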
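
For the earlier question in this thread about observing peak memory use
on OS X from the command line, one option is to poll ps from the shell.
The sketch below assumes a ps that accepts -o rss= and -p, as the OS X
and Linux versions do; peak_rss is a hypothetical helper name, not a
standard tool. Polling can miss short spikes between samples, so the
result is a lower bound rather than an exact peak.

```shell
#!/bin/sh
# Sketch: sample a process's resident set size (RSS) until it exits and
# print the largest value seen, in kilobytes.

peak_rss() {
    pid=$1
    interval=${2:-1}    # seconds between samples
    peak=0
    # kill -0 sends no signal; it only tests whether the process exists.
    while kill -0 "$pid" 2>/dev/null; do
        # "rss=" asks ps for the RSS column with no header line.
        rss=$(ps -o rss= -p "$pid" 2>/dev/null | tr -d ' ')
        if [ -n "$rss" ] && [ "$rss" -gt "$peak" ]; then
            peak=$rss
        fi
        sleep "$interval"
    done
    echo "$peak"
}
```

One way to use it would be to start the synchronization in one terminal,
look up its process ID with top, and run peak_rss PID 5 in a second
terminal to sample every five seconds until the process exits.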