[Unison-hackers] Orderly shutdown of Unison

Sun Sep 13 13:40:36 EDT 2020

Cross-posting this from the users list, as I'm hoping to get some developer
guidance in order to put together a PR:

I've built a Pacemaker cluster with two nodes in an active/passive
configuration, with the active node running the Unison service.

The service runs continuously with 'watch' enabled, in order to sync files
to another region.

Both the Unison user's home directory and the synced files directory are
on shared storage.

During a failover from one node to another, the following process is used:

1. Stop Unison service on old node.
2. Unmount shared storage on old node.
3. Mount shared storage on new node.
4. Start Unison service on new node.
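
One thing worth ruling out is step 2 beginning before the Unison process
from step 1 has actually exited. Below is a minimal sketch in Python of a
"stop and wait" helper. The name stop_and_wait and the 30-second timeout
are hypothetical, and it assumes the caller holds a subprocess.Popen handle
for the process, which may not match how a real service manager tracks it.

```python
import signal
import subprocess
import sys


def stop_and_wait(proc, timeout=30.0):
    """Send SIGTERM to proc and block until it exits.

    Returns True if the process exited within `timeout` seconds and
    False otherwise. It deliberately does not escalate to SIGKILL: a
    forced kill is the kind of abrupt stop that could leave files
    half-written, so the caller decides what to do on a timeout.
    """
    proc.send_signal(signal.SIGTERM)
    try:
        proc.wait(timeout=timeout)
        return True
    except subprocess.TimeoutExpired:
        return False


if __name__ == "__main__":
    # Stand-in for the long-running service: a child that just sleeps.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    if stop_and_wait(child, timeout=10.0):
        print("process exited; safe to continue with the unmount step")
    else:
        print("process still running; do not unmount yet")
```

This only guarantees ordering between steps 1 and 2. Whether the process
leaves its files in a consistent state when it receives SIGTERM is a
separate question.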

This *mostly* works, but I sometimes get errors when starting Unison on the
new node, usually related to corrupted archive files ('end of file
exception').

I suspect this is due to the way the Unison service is stopped on the old
node, which leaves the archive files in a corrupted state.

Given that the Unison process is continuously running with 'watch', is
there some recommended way to stop it such that the archives on both the
sync source and destination are left in a clean state?

Over on the users list, Benjamin suggested that some code could be added
to Unison to catch a signal that would then shut it down in an orderly way.

I don't mind taking a crack at that, but I have zero experience in the
Unison code base, so it would be helpful if someone could maybe give me
some high-level tips about approach and where to start digging in the code?

Chad