[Unison-hackers] Sneaker Net or Incremental Backup

Duane McKinney duane.mckinney at gmail.com
Tue Jan 13 07:08:09 EST 2009


$ cat sneakernet.sh
#!/bin/sh

# Usage
# $1 = the local root
# $2 = the remote path that unison will pass to rsync
# $3 = the path used for the sneakernet,
#      ie the path to the folder on a usb drive moved between locations
# $4 = Source of the copy as sent by unison
# $5 = Destination of the copy as sent by unison


# Debugging stuff
OUTFILE=/dev/null

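SOURCEFILE=$4
# Map the local path onto the sneakernet drive by swapping the root prefix
DESTFILE=`echo "$4" | sed "s|$1|$3|"`
# Strip the file name, keeping everything up to the last slash
DESTDIR=`echo "$DESTFILE" | sed 's:\(/.*/\).*:\1:'`

echo "$SOURCEFILE"
echo "$DESTFILE"
echo "$DESTDIR"

# -s is true only if the file exists and is not empty
if [ -s "$SOURCEFILE" ]
then
         #echo Copying from local
         mkdir -p "$DESTDIR" >> $OUTFILE
         cp -af "$SOURCEFILE" "$DESTFILE" >> $OUTFILE
else
         # Not a local file: swap the remote root for the sneakernet path
         SOURCEFILE2=`echo "$4" | sed "s|$2|$3|"`
         DESTFILE2=$5
         DESTDIR2=`echo "$DESTFILE2" | sed 's:\(/.*/\).*:\1:'`

         #echo $SOURCEFILE2
         #echo $DESTFILE2

         if [ -s "$SOURCEFILE2" ]
         then
                 mkdir -p "$DESTDIR2" >> $OUTFILE
                 cp -af "$SOURCEFILE2" "$DESTFILE2" >> $OUTFILE
         fi
fi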
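
The sed substitution in the script treats the root as a regular expression and replaces its first match anywhere in the path. A minimal sketch of a stricter mapping, using POSIX parameter expansion instead of sed, might look like the following. The function name map_path and the example paths are only for illustration and are not part of the script above.

```shell
#!/bin/sh
# Hypothetical helper, not part of sneakernet.sh:
# map a path that lives under one root onto another root.
#   $1 = root to strip, $2 = root to put in its place, $3 = path to map
map_path() {
    mp_root=$1
    mp_target=$2
    mp_path=$3
    case "$mp_path" in
        # Only rewrite paths that really start with the root
        "$mp_root"/*) printf '%s\n' "$mp_target${mp_path#"$mp_root"}" ;;
        # Anything else is passed through unchanged
        *) printf '%s\n' "$mp_path" ;;
    esac
}
```

With this, map_path /home/Duane/test /home/Duane/usbdrive "$4" would stand in for the first sed call, and the root would be matched literally rather than as a pattern.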


Benjamin Pierce wrote:
> Hi Duane,
> 
> Could you re-post this with the shell script appended in-line instead  
> of attached?  And/or perhaps just put this explanation in the FAQ wiki
> 
>           http://alliance.seas.upenn.edu/~bcpierce/wiki/index.php
> 
> Thanks!
> 
>      - Benjamin
> 
> 
> On Jan 10, 2009, at 5:05 PM, Duane McKinney wrote:
> 
>> This message has been automatically modified by CETS's antivirus/antispam
>> filter. If you have questions about this, please send mail to cets.
>>
>> An attachment named sneakernet.sh was removed from this document as it
>> constituted a security hazard. Send mail to cets if you have  
>> questions about this policy.
>>
>>> That's a nice hack -- I had no idea the new copyprog functionality
>>> could be used that way.
>> That is what I love about linux/unix.  Lots of small programs that  
>> can be combined to do great things.
>>
>>
>> Didn't have quite as much free time as I thought, but here are the
>> results of my initial testing.  It mostly works, and for now works
>> well enough for me.  I can now cron my syncs and not have to worry
>> that some large file is going to kill our pipe for a few days :)
>> I'll report back again in a month or so, or sooner if I run into an
>> issue and have to change something.
>>
>> I have attached the shell script and the profile I used for testing.  
>> The important parts are:
>> copyprog      =   sneakernet.sh /home/Duane/test duane@192.168.113.2:/home/duane/test /home/Duane/usbdrive
>> copyprogrest  =   sneakernet.sh
>>
>> This replaces what would normally be rsync with the attached shell  
>> script.
>>
>> I have only done limited testing so far, but I don't see anything  
>> that could be harmful.
>>
>> To be useful you will need two profiles: a normal one and a sneakernet
>> one.  In your normal profile you should have
>> copyprog      =   rsync --inplace --compress --max-size=xxx
>> copyprogrest  =   rsync --partial --inplace --compress --max-size=xxx
>>
>> Where xxx is the size you have determined is too much to transfer
>> via network. For me it is 1073741824 bytes (1 GiB).
>>
>> The sneakernet profile should be identical except for the copyprog args:
>> copyprog      =   sneakernet.sh LocalRoot RemoteRoot SneakerNetPath
>> copyprogrest  =   sneakernet.sh
>> LocalRoot is the path on the local machine you want to sync.
>> RemoteRoot is not written in unison's root syntax; it is written the way
>> unison passes it to rsync.
>> SneakerNetPath is the path to a directory on some removable media.
>>
>> So now you will have two commands to get everything synced:
>> unison myprofile
>> unison myprofilesneakernet
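
Since the point is to run these from cron, here is a minimal sketch of a wrapper for the two commands above. It assumes unison's -batch flag to suppress prompts. The function name sync_all and the UNISON override variable are made up for this example. It deliberately keeps going after a non-zero exit, because unison reports skipped or failed copies that way and the sneakernet pass is expected to trigger that.

```shell
#!/bin/sh
# Hypothetical wrapper, not part of the original setup.
# UNISON may be overridden, for example to test without running a real sync.
UNISON=${UNISON:-unison}

# Run every named profile in order, non-interactively.
# Returns the last non-zero exit status seen, or 0 if all succeeded.
sync_all() {
    sa_status=0
    for sa_profile in "$@"; do
        "$UNISON" "$sa_profile" -batch || sa_status=$?
    done
    return "$sa_status"
}
```

A cron job could then source this file and call: sync_all myprofile myprofilesneakernet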
>>
>> Issues:
>> On copying from the modified machine, unison complains that the copy  
>> did not succeed and moves on, so only one file for each directory  
>> will be copied to the usb drive in a directory that is out of sync.   
>> This is not a big deal for me, but it may be for others.  If it  
>> turns out to be a problem I'll probably dig through the source and  
>> come up with a real solution rather than this hack/workaround.
>>
>> I have not done any testing with copyprogrest.  Since we are using  
>> local disks at all times, I don't know if this will be needed.
>>
>> bcpierce at seas.upenn.edu wrote:
>>> That's a nice hack -- I had no idea the new copyprog functionality   
>>> could be used that way.  Will be interested to hear whether it  
>>> works...
>>>     - Benjamin
>>> Quoting Duane McKinney <duane.mckinney at gmail.com>:
>>>> Or from Version 2.31.4
>>>> copyprog xxx
>>>>     A string giving the name of an external program that can be used to
>>>> copy large files efficiently (plus command-line switches telling it to
>>>> copy files in-place). The default setting invokes rsync with appropriate
>>>> options -- most users should not need to change it.
>>>>
>>>> copyprogrest xxx
>>>>     A variant of copyprog that names an external program that should be
>>>> used to continue the transfer of a large file that has already been
>>>> partially transferred. Typically, copyprogrest will just be copyprog
>>>> with one extra option (e.g., --partial, for rsync). The default setting
>>>> invokes rsync with appropriate options -- most users should not need to
>>>> change it.
>>>>
>>>> copythreshold n
>>>>     A number indicating above what filesize (in kilobytes) Unison
>>>> should use the external copying utility specified by copyprog.
>>>> Specifying 0 will cause all copies to use the external program; a
>>>> negative number will prevent any files from using it. The default is -1.
>>>> See the Making Unison Faster on Large Files section for more information.
>>>>
>>>> I guess I should have checked out the Beta Documentation first.  I'll try
>>>> messing around with using a shell script as the copyprog.  I would
>>>> assume that it would work just fine.  My plan is to write a shell script
>>>> which ignores the destination unison tells it, and instead copies it to
>>>> a usb drive.
>>>>
>>>> I'll probably report back on this in a week or so.  Currently I am
>>>> running 2.27, so I have some compiling and testing to do.
>>>>
>>>> Duane McKinney wrote:
>>>>> What about a flag that tells it, instead of executing the copies, to just
>>>>> print out a list of the files that would be copied?  A proper name is
>>>>> escaping me at the moment, but it is a common option on programs.
>>>>>
>>>>> Benjamin Pierce wrote:
>>>>>> Both of these features would be easy to implement using information
>>>>>> that Unison already has available.  If you want to give it a try, I
>>>>>> can tell you where to start.  (I'd be reluctant, though, to add this
>>>>>> code to the main sources -- Unison arguably has too many flags and
>>>>>> switches already!)
>>>>>>
>>>>>> Best,
>>>>>>
>>>>>>     - Benjamin
>>>>>>
>>>>>> On Dec 27, 2008, at 10:27 AM, Duane McKinney wrote:
>>>>>>
>>>>>>> I searched, but came up empty.  Can this be done?
>>>>>>> I sync two offices using unison.
>>>>>>> 1) (Optional) I would like to be able to set a preference that says,
>>>>>>> don't try to sync a file if it would require X bytes transferred
>>>>>>> 2) Get a list of changed files from a root.  That way I can copy the
>>>>>>> files to a removable drive and sync them when I get to the other office.
>>>>>>>
>>>>>>> Here is my reasoning.  Most of the time, root changes are very
>>>>>>> small, a few MB.  But let's say that I download a new CentOS release
>>>>>>> and place it on the file server.  This may be a poor example, because
>>>>>>> I could retrieve it over the internet again, but just bear with me.
>>>>>>> Instead of trying to sync this file over the network, I would like
>>>>>>> unison to skip it.  Then I would like to be able to run a job, say
>>>>>>> once a week, that would take the differences in the root and send
>>>>>>> them to a USB drive.  I can then carry this drive to location 2 and
>>>>>>> update the root there.  Then, from my understanding, the network sync
>>>>>>> would detect that both roots are identical.
>>>>>>>
>>>>>>> That way, our bandwidth isn't being eaten up for hours/days trying
>>>>>>> to perform a sync that will most likely fail because it will take so long.
>>>>>>>
>>>>>>> Also, this would be more useful than synchronizing the whole root to
>>>>>>> a USB drive, because the total of the data that I am synchronizing is
>>>>>>> more than 1 TB.  I would not mind moving a few GB (<100) via USB.  I
>>>>>>> am trying to avoid needing either a lot of time or a bunch of USB
>>>>>>> drives (one for each root).
>>>>>>>
>>>>>>> I have not yet looked at the source, but I would assume that most of
>>>>>>> the items required for this feature are already in place.  Is it
>>>>>>> feasible?  Is there already a way?
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Unison-hackers mailing list
>>>>>>> Unison-hackers at lists.seas.upenn.edu
>>>>>>> http://lists.seas.upenn.edu/mailman/listinfo/unison-hackers


