[Unison-hackers] Experiences with unicode support synching Mac-Lin

Stefan Rank list-ener at strank.info
Tue Jul 14 10:58:26 EDT 2009


Hi everybody,

I'd like to provide some feedback to the developers regarding the 
preliminary Unicode support in Unison trunk.

Short version:

- It works (so far).
- It would be great if Unison automatically converted filenames created
   on a Mac (NFD) to the NFC normal form used on Linux (by convention)
   when a file is first synched (and therefore created by Unison
   on the Linux side).

Long version:

I compiled the recent trunk (-r 368) on Mac (text and macnew UI) and 
Linux (text only).
Compiling the GUI on Mac OS required small patches so that the build 
does not try to produce a backwards-compatible binary for 10.4, which 
would fail::

   "MINOSXVERSION=10.5" in src/Makefile.OCaml
   "SDKROOT = /Developer/SDKs/MacOSX10.5.sdk"
   3x in src/uimacnew/uimacnew.xcodeproj/project.pbxproj
   (only one is necessary, see recent emails on the list)

With "unicode = true" in the profile, Unison now preserves all my 
filenames, even those with the crazy characters.

I still had non-UTF-8 filenames on Linux (mostly created previously by 
Unison).
In that case unison-text aborts with an error message, while unison-gui 
fails silently; in some cases it shows empty no-synch lines. (Sorry, I 
don't remember the details now.)

To get rid of these troublemakers on Linux, the convmv script is very 
helpful: http://www.j3e.de/linux/convmv/
The actual command for my case (recursively in the current dir)::

   convmv -r -f iso-8859-15 -t utf8 --nfc .

(Choose an appropriate 'from' encoding. convmv only prints what it would 
do until you add --notest.)
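
To see which names would be affected before running convmv, a scan like the following can help. It is only a minimal sketch in Python, not part of Unison or convmv; the function names ``is_utf8`` and ``find_non_utf8_names`` are made up for this illustration. It walks a tree using byte paths, so names are never decoded implicitly, and collects those that are not valid UTF-8.

```python
import os


def is_utf8(name: bytes) -> bool:
    """Return True if the raw filename bytes decode as UTF-8."""
    try:
        name.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def find_non_utf8_names(root):
    """Return byte paths under root whose last component is not valid UTF-8.

    Hypothetical helper for illustration only.
    """
    bad = []
    # Walking a bytes path makes os.walk yield raw byte names.
    for dirpath, dirnames, filenames in os.walk(os.fsencode(root)):
        for name in dirnames + filenames:
            if not is_utf8(name):
                bad.append(os.path.join(dirpath, name))
    return bad
```

Calling ``find_non_utf8_names(".")`` returns the offending paths as bytes, which can be printed and checked against what convmv proposes in its test mode.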

Files created on the Mac and transferred to Linux are created there with 
UTF-8 NFD names. You can then easily (or accidentally) create a second 
file whose name looks identical but is encoded as UTF-8 NFC; ls will 
show you two apparently identical filenames.
(For filenames with umlauts/accented characters, tab-completion can help 
to distinguish which file is which, since only the decomposed (NFD) 
version of the filename starts with the plain base character and will 
complete if you type the base character and hit tab.)
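
The difference between the two forms can be made visible with Python's standard ``unicodedata`` module. This is only an illustration; the filename ``für.txt`` is a made-up example.

```python
import unicodedata

# Hypothetical example name; \u00fc is a precomposed "u with diaeresis".
name_nfc = unicodedata.normalize("NFC", "f\u00fcr.txt")
# NFD splits it into a plain 'u' followed by combining diaeresis U+0308.
name_nfd = unicodedata.normalize("NFD", "f\u00fcr.txt")

# Both render the same, but they are different code point sequences
# and therefore different byte sequences once encoded as UTF-8.
print(name_nfc == name_nfd)          # False
print(len(name_nfc), len(name_nfd))  # 7 8
print(name_nfc.encode("utf-8"))      # b'f\xc3\xbcr.txt'
print(name_nfd.encode("utf-8"))      # b'fu\xcc\x88r.txt'
```

Since Linux filesystems generally treat filenames as plain byte strings, both names can exist side by side in the same directory.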

To convert NFD filenames on Linux to NFC, convmv comes to the rescue again::

   convmv -r -f utf8 -t utf8 --nfc --replace .

(again, add --notest to actually do it)
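
For a preview that is independent of convmv, the same check can be sketched in Python. This is an assumption-laden illustration rather than a replacement for convmv: ``nfc_target`` and ``plan_nfc_renames`` are hypothetical names, and nothing is renamed. The third element of each tuple reports whether a file with the NFC name already exists, which is the duplicate case described above.

```python
import os
import unicodedata


def nfc_target(name):
    """Return the NFC form of name, or None if name is already NFC."""
    nfc = unicodedata.normalize("NFC", name)
    return nfc if nfc != name else None


def plan_nfc_renames(root):
    """List (old_path, new_path, target_exists) for names not in NFC.

    Hypothetical dry-run helper: it only reports, it never renames.
    """
    plan = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            target = nfc_target(name)
            if target is not None:
                old = os.path.join(dirpath, name)
                new = os.path.join(dirpath, target)
                plan.append((old, new, os.path.lexists(new)))
    return plan
```

Entries with ``target_exists`` set to True deserve a manual look before any rename, because two distinct files would otherwise end up with the same name.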

Ideally, as suggested above, files could be created as NFC on Linux when 
first synched from a Mac.

cheers,
stefan


