[Unison-hackers] Memory exhaustion issue (#1068)
Greg Troxel
gdt at lexort.com
Sat Nov 23 11:45:47 EST 2024
Michael von Glasow <michael at vonglasow.com> writes:
> I am experiencing memory issues after upgrading the entire OS on my file
> server, and going from Unison 2.52.1 (built by myself) to 2.53.3-2build2
> (from the Ubuntu repos).
>
> The server is a Raspberry Pi 3 with 1 GByte of physical memory, as well
> as 3 GB swap space on a USB disk. It previously ran Raspbian stretch
> (32-bit), now Ubuntu 24.04 (noble, 64-bit).
Quick reactions:
- You didn't mention which OCaml version you are using. 5.x still
doesn't seem fully baked.
- That's a lot of changes at once, but most of them should be amenable
to bisection, with varying effort.
> I was previously able to transfer 16 Gbyte files (possibly even multiple
> ones) using Unison over SSH with no issues. I can also copy files of
> that size using SSH or CIFS, with data transfer rates in the range of
> 5–7.5 Mbyte/s (over a 100 Mbit Ethernet connection).
Are you able to create a repro script? Can you provoke the issue with
one local and one ssh root, with only one 16 GB file? Does it happen on
first sync, or on modifications?
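
As a starting point for a repro, here is a minimal sketch that creates a test file of a chosen size without holding it in memory. It writes zeros, which is what I used for my own test below; real data may behave differently. The function name and the chunk size are my own choices for illustration, not anything from Unison.

```python
import os


def make_zero_file(path, size_bytes, chunk=1024 * 1024):
    """Write size_bytes of zeros to path in fixed-size chunks.

    Returns the resulting file size so the caller can sanity-check it.
    """
    block = bytes(chunk)  # one reusable buffer of zeros
    remaining = size_bytes
    with open(path, "wb") as f:
        while remaining > 0:
            n = min(chunk, remaining)
            f.write(block[:n])
            remaining -= n
    return os.path.getsize(path)
```

For example, make_zero_file("/tmp/repro/local/big.bin", 16 * 1024**3) would create a 16 GB file, after which something like `unison /tmp/repro/local ssh://pi//tmp/repro/remote -batch` exercises one local and one ssh root. The host name and paths there are placeholders.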
> Now, when transferring a 16 GByte file, the first 1% (as indicated by
> the GUI on the client side, also 2.53.3) transfers as expected, at about
> 5 MByte/s. However, I notice memory usage increasing, with Unison
> occupying just above 60% of physical memory.
60% of 1 GB does not seem that vast in absolute terms, but of course
programs should not use memory unnecessarily.
> Having reached about 1%, total real memory usage on the system goes to
> just above 80%, while virtual memory has climbed to 20–25%. At that
> point, the system becomes unusable as it is just busy swapping.
Have you tried binary search on file size? Does a 1 GB file cause
problems?
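
The binary search could be sketched roughly as follows. It assumes that failure is monotonic in file size (if size N fails, every larger size fails too), which may not hold in practice. The `fails` callback is hypothetical: it would create a file of the given size, run a sync, and report whether memory use got out of hand.

```python
def smallest_failing_size(fails, lo, hi):
    """Return the smallest size in (lo, hi] for which fails(size) is true.

    Assumes fails(lo) is False, fails(hi) is True, and that failure is
    monotonic in size. Sizes are integers in whatever unit the caller picks.
    """
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if fails(mid):
            hi = mid  # mid already fails, so the threshold is at or below mid
        else:
            lo = mid  # mid is fine, so the threshold is above mid
    return hi
```

With sizes in MB, lo=0 and hi=16384 (16 GB), this needs about 14 test syncs to pin the threshold down to 1 MB, although stopping much earlier is probably good enough.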
> Please let me know what kind of information you would need from me so
> you can investigate this issue further.
What I have is more a set of hints for you to look into :-), but
- I wonder if unison is trying to be too clever in avoiding
transferring the whole file for a partial update, and is
computing/retaining too much checksum state (said very vaguely).
- You have changed from a 32-bit armv7 world to a 64-bit aarch64, so
long and void * are now twice as big. It also means that the ocaml
code generation is using a different back end, which merely means
"random other things we aren't thinking of could be different".
- Both your old version and your new version are old. Please build
2.53.7 under ocaml 4.14.1 and see how that goes. I have a vague
memory of memory usage fixes from 2.53.3 to 2.53.7. It's not in
NEWS, but is not really NEWS material.
- If 2.53.7 is troubled also, you may wish to build 2.52.1 and see if
that is ok, and watch the memory usage (perhaps a script to ps and
log for graphing). And then start to bisect.
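
For the "script to ps and log" part, something along these lines should do; treat it as a sketch, not a tested tool. It assumes a `ps` that accepts `-A` and repeated `-o keyword=` options (procps on Linux and the BSD ps both should), and that `rss` and `vsz` are reported in kilobytes. The function names, the CSV layout, and the default interval are my own choices.

```python
import csv
import subprocess
import time


def parse_ps(output, name="unison"):
    """Return (pid, rss_kb, vsz_kb) tuples for processes named `name`.

    Expects lines of the form "pid rss vsz command" with no header.
    """
    rows = []
    for line in output.splitlines():
        parts = line.split(None, 3)
        if len(parts) < 4:
            continue
        pid, rss, vsz, comm = parts
        # Some ps implementations print a full path in the command column.
        if comm.strip().split("/")[-1] == name:
            rows.append((int(pid), int(rss), int(vsz)))
    return rows


def log_memory(csv_path, name="unison", interval=5.0, samples=None):
    """Append (unix_time, pid, rss_kb, vsz_kb) rows to csv_path.

    Takes one sample every `interval` seconds; runs until interrupted
    when samples is None.
    """
    taken = 0
    with open(csv_path, "a", newline="") as f:
        writer = csv.writer(f)
        while samples is None or taken < samples:
            out = subprocess.run(
                ["ps", "-A", "-o", "pid=", "-o", "rss=",
                 "-o", "vsz=", "-o", "comm="],
                capture_output=True, text=True, check=True,
            ).stdout
            now = int(time.time())
            for pid, rss, vsz in parse_ps(out, name):
                writer.writerow([now, pid, rss, vsz])
            f.flush()  # keep the log usable if the machine starts thrashing
            taken += 1
            if samples is None or taken < samples:
                time.sleep(interval)
```

Start log_memory("unison-mem.csv") on the server before the sync and stop it with Ctrl-C afterwards; the CSV can then be plotted with whatever you like. Running it for both 2.52.1 and 2.53.7 gives two curves to compare.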
I tried syncing a new 16GB file of zeros, from NetBSD 10 amd64 (32GB
RAM) to NetBSD 10 aarch64 (RPI4, 8GB RAM).
At roughly 25% of the transfer, on the client (amd64):
  PID USERNAME PRI NICE SIZE  RES STATE     TIME   WCPU    CPU COMMAND
24709 gdt       85    0  39M  16M select/5  0:39 31.45% 31.45% unison
and on the server (aarch64):
  PID USERNAME PRI NICE SIZE  RES STATE     TIME   WCPU    CPU COMMAND
 7912 gdt       78    0  46M  32M select/3  1:17 46.83% 46.83% unison
I don't want to claim these are rock bottom and we couldn't make
improvements, but they don't come across as buggy.