[OpenWrt-Devel] overcommit memory/ratio

Joshua Judson Rosen jrosen at harvestai.com
Wed Sep 24 10:42:34 EDT 2014

On 2014-09-20 03:05, Nikos Mavrogiannopoulos wrote:
> On Fri, 2014-09-19 at 18:39 -0700, David Lang wrote:
>>> Well being used to something bad, doesn't mean things cannot get better.
>>> Routers (to which I have some experience at), rarely have processes
>>> running that wouldn't matter if they are randomly killed; on a desktop
>>> system you immediately notice an issue, and you can reboot, a router is
>>> typically running unattended. Being locked out of such a system because
>>> another process had a memory leak, can be an issue.
>> Turning off overcommit so that a program that wants to spawn a child will end up
>> requiring double its memory (for the time between the fork and the exec) is
>> likely to cause programs to fail when they don't need to.
> I'd be surprised if fork and exec worked that way.

Why? That is exactly how fork+exec works. How else could it possibly work?

> After a fork the two processes share the same physical pages (see the notes on fork()
> manpage),

Until one of the processes writes to them (e.g.: by assigning to a variable, or by
a function-call or a function-return pushing/popping/modifying a stack-frame); at
that point the system has to actually be able to provide the extra memory.

Copy-on-write is basically a `performance hack'. There are systems that
"don't have an inexpensive fork()"; these notes in the fork(2) man page
are just there to ward off fears that Linux might be one of them:

	       Under Linux, fork() is implemented using copy-on-write  pages,  so  the
	       only  penalty  that it incurs is the time and memory required to dupli‐
	       cate the parent's page tables, and to create a  unique  task  structure
	       for the child.

People who aren't familiar with the copy-on-write semantics of fork()
tend to be afraid of creating processes because they think that it's "expensive".

>  and overcommit applies to physical ram, not virtual.

Overcommit is the _difference_ between what you're calling "physical ram"
and the total allocations that have been `committed to'.
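To make the accounting concrete: the kernel exposes exactly these two numbers in /proc/meminfo. A small sketch (Linux-only; `commit_stats` is just an illustrative name) that reads them:

```python
# Read the kernel's overcommit accounting from /proc/meminfo (Linux-only).
# Committed_AS is the total address space "committed to" by all processes;
# CommitLimit is the most the kernel will allow when strict accounting is
# enabled (vm.overcommit_memory=2).
def commit_stats():
    stats = {}
    with open("/proc/meminfo") as f:
        for line in f:
            key, _, rest = line.partition(":")
            if key in ("CommitLimit", "Committed_AS"):
                stats[key] = int(rest.split()[0])  # values are in kB
    return stats

print(commit_stats())
```

With the default heuristic overcommit, Committed_AS can legitimately exceed CommitLimit; with strict accounting it cannot, which is exactly why the fork of a large process can fail.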

When you fork, you are in fact committing to support 2x the memory-footprint
of the process being forked; the system doesn't know at the time of fork()
whether the child is going to exec or whether the child (or the parent!)
is going to write to all of the pages and force the copy-on-write pages
to turn into actual copies. vfork(), on the other hand, sets a child up with
a shared memoryspace that _is_ expected to be immediately replaced by an exec....
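The window being discussed is easy to see in the classic fork+exec pattern. A minimal sketch (assuming `true` is on PATH, as it is on any normal Linux system): between the fork() and the exec, the child is a full copy-on-write duplicate of the parent, and under strict accounting the kernel must be willing to commit 2x the parent's footprint for that interval, even though the exec is about to throw it all away:

```python
import os

# Classic fork+exec. Between os.fork() and os.execvp(), the child shares
# the parent's pages copy-on-write, but the kernel has *committed* to a
# second full copy. The exec then replaces the child's address space,
# releasing that commitment.
pid = os.fork()
if pid == 0:
    # Child: replace ourselves with /usr/bin/true ("true" assumed on PATH).
    os.execvp("true", ["true"])
else:
    # Parent: reap the child.
    _, status = os.waitpid(pid, 0)
    print("child exited with status", os.WEXITSTATUS(status))
```

vfork() (or posix_spawn(), which can use it) avoids the double commitment precisely because the child is not given its own committed copy of the address space.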

I've seen exactly the failure that David describes; it's pretty easy
to reproduce: just disable overcommit, create a process that allocates > 50%
of free memory, and then have it fork. You can do it on your desktop by,
for example, running a Python REPL and typing in "import os; l = ['x']*HUGENUMBER; os.fork()";
or by creating a sufficiently large number of threads (since each one
gets something like 2 MB for stackspace by default) and then forking
(you may want to "swapoff" first, so that it fails _quickly_).
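The same recipe as a hedged, self-contained sketch: SIZE_MB is kept deliberately small here so the script runs anywhere; to actually reproduce the failure, disable overcommit (vm.overcommit_memory=2) and raise SIZE_MB past roughly half of free RAM, and the fork() will fail with ENOMEM even though not a single page ever gets copied:

```python
import os
import sys

# Allocate a large committed region, then fork. Under strict overcommit
# accounting the kernel must be able to commit a *second* copy of `blob`
# for the child, so fork() can fail here despite copy-on-write.
SIZE_MB = 16  # raise past ~50% of free RAM (with overcommit off) to see it fail
blob = bytearray(SIZE_MB * 1024 * 1024)

try:
    pid = os.fork()
except OSError as e:
    print("fork failed:", e)
    sys.exit(1)

if pid == 0:
    os._exit(0)  # child touches nothing, so the COW pages are never copied
os.waitpid(pid, 0)
print("fork succeeded with", SIZE_MB, "MB committed")
```

Running "swapoff -a" first (as the original suggestion says) shrinks CommitLimit, so the failure shows up quickly instead of after a long crawl through swap.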

"'tis an ill wind that blows no minds."