[FS#3943] ujail breaks after some uptime

OpenWrt Bugs openwrt-bugs at lists.openwrt.org
Fri Oct 29 16:36:46 PDT 2021



The following task has been changed.  The changes are listed below.  For full information about what has changed, visit the URL and click the History tab.

FS#3943 - ujail breaks after some uptime
User who did this: Stijn Tintel (stintel)
Task details edited:
-------
When a device with procd-ujail installed has been running for a while (I hit it today at 28 days of uptime), restarting dnsmasq leaves dnsmasq itself not running: only the ujail process is present. No errors appear on stdout/stderr during the restart, nor in syslog.


root@ar0:~# /etc/init.d/dnsmasq restart
udhcpc: started, v1.33.1
udhcpc: sending discover
udhcpc: no lease, failing
udhcpc: started, v1.33.1
udhcpc: sending discover
udhcpc: no lease, failing
udhcpc: started, v1.33.1
udhcpc: sending discover
udhcpc: no lease, failing
udhcpc: started, v1.33.1
udhcpc: sending discover
udhcpc: no lease, failing



Tue Jul 20 15:17:15 2021 user.notice dnsmasq: DNS rebinding protection is active, will discard upstream RFC1918 responses!
Tue Jul 20 15:17:15 2021 user.notice dnsmasq: Allowing 127.0.0.0/8 responses
Tue Jul 20 15:17:15 2021 user.notice dnsmasq: Allowing RFC1918 responses for domain plex.direct



root@ar0:~# ps aux | grep dnsmasq
root     21289  0.0  0.0   2088   872 ?        S    15:17   0:00 /sbin/ujail -n dnsmasq -u -l -r /dev/null -r /dev/urandom -r /etc/TZ -r /etc/dnsmasq.conf -r /etc/ethers -r /etc/group -r /etc/hosts -r /etc/passwd -r /sbin/hotplug-call -r /tftpboot -r /tmp/dnsmasq.d -r /tmp/etc/dnsmasq.conf.main -r /tmp/hosts/dhcp.main -r /usr/lib/dnsmasq/dhcp-script.sh -r /usr/share/dnsmasq/dhcpbogushostname.conf -r /usr/share/dnsmasq/rfc6761.conf -r /usr/share/dnsmasq/trust-anchors.conf -w /var/lib/dhcp.leases -w /var/run/dnsmasq/ -- /usr/sbin/dnsmasq -C /tmp/etc/dnsmasq.conf.main -k -x /var/run/dnsmasq/dnsmasq.main.pid
root     21455  0.0  0.0   1132   468 pts/1    S+   15:19   0:00 grep dnsmasq
root@ar0:~# ss -anput | grep dnsmasq
root@ar0:~#


Commenting out the lines in the init script that start with procd_add_jail and then restarting the service works around the problem. The problem also does not occur when dnsmasq is started during boot.
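
The workaround can be scripted. This is only a minimal sketch: disable_jail is a hypothetical helper name, and the backup location under /tmp is an assumption, not something the init script or procd provides.

```shell
# Comment out every line of an init script that starts with procd_add_jail.
# The pattern also matches longer names such as procd_add_jail_mount.
# disable_jail is a hypothetical helper, not part of OpenWrt.
# Usage: disable_jail INIT_SCRIPT [BACKUP_PATH]
disable_jail() {
    script=$1
    # Assumption: keep the backup outside /etc/init.d so it is not
    # mistaken for a service.
    backup=${2:-/tmp/$(basename "$script").orig}

    cp "$script" "$backup" || return 1
    # Rewrite the script in place from the backup; redirecting into the
    # existing file keeps its mode bits (the executable flag).
    sed 's/^\([[:space:]]*\)\(procd_add_jail\)/\1#\2/' "$backup" > "$script"
}
```

On the affected device this would be followed by the restart from the report, for example: disable_jail /etc/init.d/dnsmasq && /etc/init.d/dnsmasq restart. Copying the backup over the init script again brings the jail lines back.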

I've seen this problem before and mentioned it a few times on IRC. The first time was in October 2020, before 21.02 was branched, so it's very likely that this problem exists there as well.

I didn't reboot the system where I'm currently experiencing this; I commented out the procd_add_jail lines instead. Uncommenting those lines brings the problem back, so further investigation is possible.

This seems to be a general problem with ujail, as even a simple echo refuses to start:


# ujail -d1 -n blah -r /tmp -- /bin/echo test
jail: adding mount /tmp /tmp bind(1) ro(1) err(0)
jail: Using namespaces(0x28020000), capabilities(0), seccomp(0)
jail: adding mount /bin/echo /bin/echo bind(1) ro(1) err(1)
jail: adding mount /lib/ld-musl-x86_64.so.1 /lib/ld-musl-x86_64.so.1 bind(1) ro(1) err(1)
jail: adding library /lib/libgcc_s.so.1 (libgcc_s.so.1)
jail: adding library /lib/libc.so (libc.so)


The process hangs here until killed with kill -9.
Running in strace, the process hangs on epoll_pwait.
Backtrace with gdbserver:


#0  epoll_pwait (fd=3, ev=ev@entry=0x7ffff7f802c0 , cnt=cnt@entry=10, to=1834444156, sigs=sigs@entry=0x0) at ./arch/x86_64/syscall_arch.h:61
#1  0x00007ffff7fa7ada in epoll_wait (fd=, ev=ev@entry=0x7ffff7f802c0 , cnt=cnt@entry=10, to=)
    at src/linux/epoll.c:36
#2  0x00007ffff7f7805f in uloop_fetch_events (timeout=)
    at /home/stijn/Development/OpenWrt/openwrt/build_dir/target-x86_64_musl/libubox-2021-08-19-d716ac4b/uloop-epoll.c:73
#3  uloop_run_events (timeout=)
    at /home/stijn/Development/OpenWrt/openwrt/build_dir/target-x86_64_musl/libubox-2021-08-19-d716ac4b/uloop.c:170
#4  uloop_run_timeout (timeout=-1) at /home/stijn/Development/OpenWrt/openwrt/build_dir/target-x86_64_musl/libubox-2021-08-19-d716ac4b/uloop.c:555
#5  0x000055555555b915 in ?? ()
#6  0x00007ffff7f69bb8 in ?? ()
#7  0xffffffff01203ff2 in ?? ()
#8  0x00007ffff7ffd880 in ?? () from /home/stijn/Development/OpenWrt/openwrt/scripts/../staging_dir/target-x86_64_musl/root-x86/lib/ld-musl-x86_64.so.1
#9  0x00007ffff7fe30a0 in do_init_fini (queue=) at ldso/dynlink.c:1545
#10 0x00007ffff7ff80e0 in ?? () from /home/stijn/Development/OpenWrt/openwrt/scripts/../staging_dir/target-x86_64_musl/root-x86/lib/ld-musl-x86_64.so.1
#11 0x00007fffffffecb8 in ?? ()
#12 0x00007ffff7fa5d3c in libc_start_main_stage2 (main=0x6b77814f93720b1f, argc=1431674893, argv=0x55555555a00d) at src/env/__libc_start_main.c:94
#13 0x000055555555ba5b in ?? ()
#14 0x0000000000000008 in ?? ()
#15 0x00007fffffffeeca in ?? ()
#16 0x00007fffffffeed6 in ?? ()
#17 0x00007fffffffeed9 in ?? ()
#18 0x00007fffffffeede in ?? ()
#19 0x00007fffffffeee1 in ?? ()
#20 0x00007fffffffeee6 in ?? ()
#21 0x00007fffffffeee9 in ?? ()
#22 0x00007fffffffeef3 in ?? ()
#23 0x0000000000000000 in ?? ()
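
To check whether a given system is affected without leaving a stuck process behind, the echo test above can be run with a deadline. This is a rough sketch in plain shell: run_with_deadline is a hypothetical helper name and the 5-second limit in the usage line is an arbitrary assumption. It uses kill -9 because that is what was needed to stop the hung process here.

```shell
# Run a command in the background and kill it if it has not exited
# after a given number of seconds.
# run_with_deadline is a hypothetical helper, not part of procd.
# Usage: run_with_deadline SECONDS COMMAND [ARGS...]
# Prints "finished" (returns 0) or "hung" (returns 1).
run_with_deadline() {
    deadline=$1
    shift

    "$@" &
    pid=$!
    elapsed=0

    # Poll once per second until the process is gone or the deadline passes.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$elapsed" -ge "$deadline" ]; then
            kill -9 "$pid" 2>/dev/null
            wait "$pid" 2>/dev/null
            echo "hung"
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done

    # Collect the child; its own exit status is deliberately ignored here.
    wait "$pid" 2>/dev/null || :
    echo "finished"
    return 0
}
```

For example: run_with_deadline 5 ujail -d1 -n blah -r /tmp -- /bin/echo test. On a system showing the problem this should report "hung"; on a freshly booted one it should report "finished".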

-------

More information can be found at the following URL:
https://bugs.openwrt.org/index.php?do=details&task_id=3943
