[Unison-hackers] Memory exhaustion issue (#1068)

Michael von Glasow michael at vonglasow.com
Sat Nov 23 15:04:50 EST 2024


On 23/11/2024 19:33, Greg Troxel wrote:
> You might also see if you can reproduce not using a GUI, and without a
> "server" running persistently.   As in produce a shell script anyone can
> run, with just substituting in a remote hostname.

Switching profiles should be sufficient – when I did that, it created a
new unison process on the server, while the old one gradually freed up
its memory (but kept running).
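
(A hypothetical way to observe this from the client, untested: run
something like  ssh <host> 'pgrep -a unison'  before and after switching
profiles; it should show both the old and the new process.)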

For now, here are the results of my test series. I performed the syncs
using the GUI over an SSH connection, keeping the GUI running between
sync runs.

I created test files with:

dd if=/dev/urandom of=/path/to/testfile bs=1M count=SIZE_IN_MBYTE status=progress

Each run comprised the following steps:

* Create test file

* Scan (only change being the test file)

* Sync, while monitoring server resources with top (in an SSH session)
and Webmin stats

* Delete file

* Sync again

* Repeat with next test file

Test file sizes were 160M, 1600M, 3200M, 6400M, 12800M and 16000M (in
that order).
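
For a GUI-free reproduction along the lines Greg suggested, the whole
series could be scripted roughly as follows. This is an untested sketch:
it assumes a profile named "test" whose local root is /path/to
(substitute your own remote host in the profile), and it uses -batch
mode, in which scan and sync happen in one run rather than as separate
steps.

#!/bin/sh
# Untested sketch of the test series above, using the command-line
# client instead of the GUI. Assumes a profile named "test" with
# local root /path/to.
set -e
for size in 160 1600 3200 6400 12800 16000; do
    dd if=/dev/urandom of=/path/to/testfile bs=1M count=$size status=progress
    unison test -batch   # sync the new file
    rm /path/to/testfile
    unison test -batch   # sync the deletion
done

Meanwhile, on the server, something like  top -b -d 5  in an SSH session
can log the unison process's memory over time.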

Unison on the server stayed running and was reused for each sync. Memory
usage increased near-monotonically.

After the 3200M file, memory usage was at 4.1%. After syncing 6400M, it
climbed to 5.7% (+1.6%). After syncing 12800M, it went to 9.1% (+3.4%).
After syncing 16000M, it went to 14.6% (+5.5%). When reusing the
connection, increases happened only when transferring a new file, never
during rescans, deletions or post-sync scans on the server.
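
(Against the 899.9 MiB of RAM reported by top below – assuming %MEM is
relative to total physical memory – those increments work out to roughly
14, 31 and 50 MiB respectively.)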

> It may be that your issue is profile size and the big file was just the
> last straw, not the bug.

Looks like profile size is a factor indeed. After scanning the big set
of files, Unison uses 50% of available memory (during scan it oscillated
between just below 45% and just above 60%). The rest of the system took
up around 30%.

However, if I wait for about 2h between scan and sync, Unison on the
server frees up most of its memory. Once I start the sync, it jumps back
to 60% by the time the sync reaches 1%.

So there seem to be two drivers for memory usage: total fileset/archive
size, and size of the individual file. They interact in such a way that
a large file from a small fileset, or a small file from a large fileset,
may sync OK, but a large file from a large fileset will exhaust memory.

Right now, the system is frozen again, with top reporting:

%Cpu(s):  0.5 us, 31.1 sy,  0.0 ni,  0.0 id, 68.4 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :    899.9 total,     47.7 free,    820.9 used,    132.7 buff/cache
MiB Swap:   2861.0 total,   1907.9 free,    953.1 used.     79.0 avail Mem

With 68.4% of CPU time spent in iowait and nearly 1 GB of swap in use,
the box is thrashing. The archive file is 50M in size, the whole set of
files is somewhere around 350G. That is somewhat large, but 50% of 1 GB
would be 500M, ten times the size of the archive file – quite a lot IMO.
And for a tool that can get that memory-hungry, it might be worthwhile
to look into ways to reduce memory usage.

The ticket instructs users to read the wiki for advice on memory usage,
but none of the articles there strikes me as memory-related. Which
article are you referring to? Or which settings are recommended?

Looking at the docs, what comes to mind is:

- copyprog, copythreshold (use external program <copyprog> for copying
files larger than <copythreshold> kB)

- maxsizethreshold (prevent transfer of files bigger than
<maxsizethreshold> kB)
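
If I read the manual correctly, the profile syntax would be roughly as
below – an untested sketch with made-up threshold values; both
thresholds are in kilobytes, and the rsync flags are just the documented
default:

copyprog = rsync --inplace --compress
copythreshold = 1048576      # hand files above ~1 GB to the external program
maxsizethreshold = 16777216  # refuse to transfer files above ~16 GB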

Would these help (and what should I set them to)? Or are there other
options I should look at?

Or is there a way to tell Unison to stop being smart and just copy the
damn thing (which is presumably less memory-hungry) if a file is larger
than a certain size?


