[Unison-hackers] Memory exhaustion issue (#1068)

Greg Troxel gdt at lexort.com
Sat Nov 23 11:45:47 EST 2024


Michael von Glasow <michael at vonglasow.com> writes:

> I am experiencing memory issues after upgrading the entire OS on my file
> server, and going from Unison 2.52.1 (built by myself) to 2.53.3-2build2
> (from the Ubuntu repos).
>
> The server is a Raspberry Pi 3 with 1 GByte of physical memory, as well
> as 3 GB swap space on a USB disk. It previously ran Raspbian stretch
> (32-bit), now Ubuntu 24.04 (noble, 64-bit).

Quick reactions:

  - You didn't mention which OCaml version your unison was built
    with.  5.x still doesn't seem 100% baked.
  - That's a lot of changes at once, but they should mostly be
    amenable to bisection, with varying effort.

> I was previously able to transfer 16 Gbyte files (possibly even multiple
> ones) using Unison over SSH with no issues. I can also copy files of
> that size using SSH or CIFS, with data transfer rates in the range of
> 5–7.5 Mbyte/s (over a 100 Mbit Ethernet connection).

Are you able to create a repro script?  Can you provoke the issue with
one local and one ssh root, with only one 16 GB file?  Does it happen on
first sync, or on modifications?
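
A minimal sketch of such a repro script, in Python 3.  The file name
big.bin and the two root arguments are placeholders, and the remote
root is assumed to exist already.  The file is all zeros, which may or
may not behave like your real data.  -batch is only there so unison
asks no questions.

```python
import os
import subprocess
import sys

CHUNK = 1 << 20  # write 1 MiB at a time


def make_zero_file(path, size_bytes):
    """Write size_bytes of zeros to path (real data, not a sparse file)."""
    zeros = bytes(CHUNK)
    remaining = size_bytes
    with open(path, "wb") as f:
        while remaining > 0:
            n = min(CHUNK, remaining)
            f.write(zeros[:n])
            remaining -= n


def unison_command(local_root, remote_root):
    """Command line for a non-interactive sync of exactly two roots."""
    return ["unison", local_root, remote_root, "-batch"]


if __name__ == "__main__":
    local_root = sys.argv[1]    # placeholder, e.g. /tmp/unison-repro
    remote_root = sys.argv[2]   # placeholder, e.g. ssh://pi//tmp/unison-repro
    os.makedirs(local_root, exist_ok=True)
    make_zero_file(os.path.join(local_root, "big.bin"), 16 * 1024**3)
    sys.exit(subprocess.call(unison_command(local_root, remote_root)))
```

Running it once exercises the first-sync case; changing part of
big.bin and running unison again exercises the modification case.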

> Now, when transferring a 16 GByte file, the first 1% (as indicated by
> the GUI on the client side, also 2.53.3) transfers as expected, at about
> 5 MByte/s. However, I notice memory usage increasing, with Unison
> occupying just above 60% of physical memory.

60% of 1 GB does not seem that vast in absolute terms, but of course
programs should not use memory unnecessarily.

> Having reached about 1%, total real memory usage on the system goes to
> just above 80%, while virtual memory has climbed to 20–25%. At that
> point, the system becomes unusable as it is just busy swapping.

Have you tried binary search on file size?  Does a 1 GB file cause
problems?
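
The search itself can be sketched as below (Python 3).  It assumes the
problem is monotonic in file size, i.e. anything at or above some
threshold fails and anything below is fine, which is only a guess.
fails() is a hypothetical callback that you would write yourself:
create a file of that size, sync it, and report whether the machine
started thrashing.

```python
def bracket_failing_size(good, bad, fails, resolution):
    """Narrow down the smallest failing file size by bisection.

    good: a size known to work; bad: a size known to fail.
    fails(size) -> bool is supplied by the caller.
    Returns (good, bad) with bad - good <= resolution.
    """
    while bad - good > resolution:
        mid = (good + bad) // 2
        if fails(mid):
            bad = mid    # threshold is at or below mid
        else:
            good = mid   # threshold is above mid
    return good, bad


if __name__ == "__main__":
    MiB = 1 << 20
    # Stand-in predicate for illustration only: pretend that anything
    # from 3000 MiB upward triggers the problem.
    pretend_fails = lambda size: size >= 3000 * MiB
    good, bad = bracket_failing_size(0, 16384 * MiB, pretend_fails, 64 * MiB)
    print("works at %d MiB, fails at %d MiB" % (good // MiB, bad // MiB))
```

Going from 16 GB down to a 64 MB window takes 8 test syncs.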

> Please let me know what kind of information you would need from me so
> you can investigate this issue further.

What I have is more hints for you to look into it yourself :-), but:

  - I wonder if unison is trying to be too clever in avoiding
    transferring the whole file for a partial update, and is
    computing/retaining too much checksum state (said very vaguely).

  - You have changed from a 32-bit armv7 world to a 64-bit aarch64, so
    long and void * are now twice as big.  It also means that the ocaml
    code generation is using a different back end, which merely means
    "random other things we aren't thinking of could be different".

  - Both your old version and your new version are old.  Please build
    2.53.7 under ocaml 4.14.1 and see how that goes.  I have a vague
    memory of memory usage fixes from 2.53.3 to 2.53.7.  It's not in
    NEWS, but is not really NEWS material.

  - If 2.53.7 is troubled also, you may wish to build 2.52.1 and see if
    that is ok, and watch the memory usage (perhaps a script to ps and
    log for graphing).  And then start to bisect.
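
A rough sketch of such a ps-and-log script, in Python 3.  It assumes a
ps that accepts "-o rss= -o vsz= -p PID" and reports both values in
kilobytes, which is the case on Linux and the BSDs as far as I know.
The CSV layout and the 5 second interval are arbitrary choices.

```python
import subprocess
import sys
import time


def parse_ps_line(line):
    """Parse 'RSS VSZ' (both in kilobytes) as printed by ps."""
    rss, vsz = line.split()
    return int(rss), int(vsz)


def sample(pid):
    """Return (rss_kb, vsz_kb) for pid, or None once the process is gone."""
    result = subprocess.run(
        ["ps", "-o", "rss=", "-o", "vsz=", "-p", str(pid)],
        capture_output=True, text=True)
    out = result.stdout.strip()
    return parse_ps_line(out) if out else None


def log_memory(pid, out, interval=5.0):
    """Write 'elapsed_seconds,rss_kb,vsz_kb' rows until pid exits."""
    start = time.time()
    while True:
        s = sample(pid)
        if s is None:
            break
        out.write("%d,%d,%d\n" % (time.time() - start, s[0], s[1]))
        out.flush()
        time.sleep(interval)


if __name__ == "__main__":
    # Usage: python3 memlog.py PID > unison-mem.csv
    log_memory(int(sys.argv[1]), sys.stdout)
```

Run it against the unison process on the server for each version you
test, and the resulting files can be plotted side by side.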

I tried syncing a new 16GB file of zeros, from NetBSD 10 amd64 (32GB
RAM) to NetBSD 10 aarch64 (RPI4, 8GB RAM).

At about 25%, on the client (amd64):

24709 gdt       85    0    39M   16M select/5    0:39 31.45% 31.45% unison

and on the server (aarch64):

  PID USERNAME PRI NICE   SIZE   RES STATE       TIME   WCPU    CPU COMMAND
 7912 gdt       78    0    46M   32M select/3    1:17 46.83% 46.83% unison


I don't want to claim these are rock bottom and we couldn't make
improvements, but they don't come across as buggy.

