[Unison-hackers] implementing UTF-16 filesystems support

Lucas B. Cohen mli6 at free.fr
Mon Dec 10 03:07:02 EST 2007


Hello hackers,

I've read pretty much every thread on unison-users that contains the keyword
'unicode' or 'utf'. I would really like to get Unison to operate between
Unix and NT machines, and I am willing to spend the necessary time to
achieve this.

I've begun learning the rudiments of Objective Caml, and I'm able to
understand about 70% of the statements (phrases?) in Unison's sources, which
I've familiarized myself with.

At this point I'm not quite sure what exactly I would be trying to do. In
July 2004, Jérôme Vouillon mentioned that Unison "should eventually switch
to the Unicode API (or maybe allow to choose between the two APIs)".
But wouldn't getting Unison to access NT filesystems through the Windows
UTF-16 API make matters even worse? A file name read from a UTF-8 encoded
Unix filesystem would no longer match its UTF-16 counterpart byte for byte,
not even for plain ASCII characters, which take one byte in UTF-8 but two
bytes in UTF-16.
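
To make the concern concrete, here is a minimal Python sketch of the mismatch and of one possible remedy. It is not Unison code: the helper name utf8_name_to_utf16le and the sample file name are made up for illustration, and it assumes the Unix side stores names as UTF-8 and the NT side returns UTF-16 little-endian.

```python
# Hypothetical helper, not part of Unison: re-encode a file name read as
# UTF-8 bytes (assumed Unix side) into UTF-16LE bytes (assumed NT side).
def utf8_name_to_utf16le(raw: bytes) -> bytes:
    return raw.decode("utf-8").encode("utf-16-le")

# The same (made-up) file name as each side would hand it over.
unix_name = "Jérôme.txt".encode("utf-8")
nt_name = "Jérôme.txt".encode("utf-16-le")

# Even a plain ASCII letter differs at the byte level.
ascii_utf8 = "a".encode("utf-8")        # one byte: 0x61
ascii_utf16 = "a".encode("utf-16-le")   # two bytes: 0x61 0x00

# Comparing raw bytes fails; comparing after conversion succeeds.
raw_match = unix_name == nt_name
converted_match = utf8_name_to_utf16le(unix_name) == nt_name
```

In this sketch the names only compare equal once one side has been converted to the other's encoding, which is the kind of encoding awareness I suspect Unison would need somewhere.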

I believe Benjamin Pierce's wish was to keep Unison free from character
encoding issues and rely on the underlying libraries to handle them. But in
such a case, I don't see how Unison could avoid at least some awareness of
character encodings.

Thank you for your consideration,

Lucas B. Cohen



