[PATCH] uloop: wait only on known child processes
Elbert Mai
code at elbertmai.com
Fri Feb 21 04:12:27 PST 2025
Upon catching SIGCHLD, the event loop calls waitpid(-1, ...) and passes
the exit code to the callback associated with the returned PID. However,
waitpid(-1, ...) returns the exit code for any terminated child process,
not just those added to uloop via uloop_process_add().
Once uloop has waited on a child process, subsequent wait calls on that
process will fail with ECHILD. As a result, functions that fork and wait
on their own processes may suffer sporadic and confusing failures when
used together with uloop in an application.
This commit modifies uloop_handle_processes() so it invokes waitpid()
for each PID in the uloop process list, instead of indisciminately
waiting for all child processes. As a side benefit, the function now
runs in O(n) instead of O(n^2). The process list no longer needs to be
sorted, so uloop_process_add() is also simplified.
Signed-off-by: Elbert Mai <code at elbertmai.com>
---
This patch came about as a result of some experimentation with ucode.
Consider this ucode example script:
import * as fs from "fs";
import * as uloop from "uloop";
uloop.init();
let proc1 = fs.popen("false");
let delay = uloop.timer(1000, () => {
let proc2 = fs.popen("false");
printf("proc1: %s\n", proc1.close() ?? "null");
printf("proc2: %s\n", proc2.close() ?? "null");
uloop.end();
});
uloop.run();
uloop.done();
The output I expected, after the 1 second delay:
proc1: 1
proc2: 1
The output I actually get, as of 24.10:
proc1: null
proc2: 1
This happens because under the hood, uloop calls waitpid() with a
PID of -1, which reaps the proc1 process before the timer callback
can obtain its exit code.
I realize that ucode has uloop.process(), but it doesn't allow me to
read/write the child's standard output/input. And even if it does, a
ucode library can fork/exec/wait a child process behind the scenes,
independently of uloop. This can lead to utterly confusing failures.
Initial tests with this patch are promising. But since uloop is used
by many core OpenWrt components, I want to hear from the developers
if this is a sane idea or not.
- Elbert
uloop.c | 32 ++++++--------------------------
1 file changed, 6 insertions(+), 26 deletions(-)
diff --git a/uloop.c b/uloop.c
index da6f690..eedc4aa 100644
--- a/uloop.c
+++ b/uloop.c
@@ -390,20 +390,10 @@ int64_t uloop_timeout_remaining64(struct uloop_timeout *timeout)
int uloop_process_add(struct uloop_process *p)
{
- struct uloop_process *tmp;
- struct list_head *h = &processes;
-
if (p->pending)
return -1;
- list_for_each_entry(tmp, &processes, list) {
- if (tmp->pid > p->pid) {
- h = &tmp->list;
- break;
- }
- }
-
- list_add_tail(&p->list, h);
+ list_add_tail(&p->list, &processes);
p->pending = true;
return 0;
@@ -428,26 +418,16 @@ static void uloop_handle_processes(void)
do_sigchld = false;
- while (1) {
- pid = waitpid(-1, &ret, WNOHANG);
- if (pid < 0 && errno == EINTR)
- continue;
-
- if (pid <= 0)
- return;
-
- list_for_each_entry_safe(p, tmp, &processes, list) {
- if (p->pid < pid)
- continue;
-
- if (p->pid > pid)
- break;
+ list_for_each_entry_safe(p, tmp, &processes, list) {
+ do {
+ pid = waitpid(p->pid, &ret, WNOHANG);
+ } while (pid < 0 && errno == EINTR);
+ if (pid > 0) {
uloop_process_delete(p);
p->cb(p, ret);
}
}
-
}
int uloop_interval_set(struct uloop_interval *timer, unsigned int msecs)
--
2.48.1
More information about the openwrt-devel
mailing list