[OpenWrt-Devel] Sysupgrade and Failed to kill all processes

Wed May 13 18:06:35 EDT 2020

Inline...

> On May 12, 2020, at 11:17 PM, Michael Jones <mike at meshplusplus.com> wrote:
> 
> I've been investigating a problem with sysupgrade failing with the error message "Failed to kill all processes", and then hanging indefinitely.
> 
> This happens maybe once every 10-20 sysupgrades, and it's kind of a pain.
> 
> So far I've determined this workflow that the sysupgrade command follows. Note, I'm not aiming for 100% accuracy, but just broad strokes.
> 
> 
> 1) /sbin/sysupgrade locates the file to upgrade from on the filesystem, or if the second option to sysupgrade starts with http://, it downloads the firmware file using wget.
> 2) /sbin/sysupgrade does some minor validation of various things, and grabs whatever config files it thinks the end user wants to be restored and packs them up into some kind of tarball.
> 3) sysupgrade sends a message, via ubus, to procd, to initiate the upgrade.
> 4) Procd does some stuff which I haven't finished completely understanding just yet, but it looks like firmware verification to make sure we don't upgrade to a bad firmware file.
> 5) It *does not* appear that procd will proactively terminate services until everything (or almost everything) is shut down. Seems like something that should be added to increase reliability.
> 6) procd replaces itself (execvp systemcall) with the program /sbin/upgraded. This means that procd is *no longer running*, PID 1 is now /sbin/upgraded. So service management is not possible at this point.
> 7) /sbin/upgraded now acts as PID1. It executes the shell script /lib/upgrade/stage2 with parameters.
> 8) The shell script loops on all processes, and sends them the TERM signal, and then the KILL signal. See email subject for problems with this.
> 9) the shell script creates a new ram filesystem, mounts it, then copies over a very small set of binaries into it.
> 10) The shell script changes root into the new ram filesystem
> 11) Inside the ramfilesystem, the shell script writes the upgraded firmware and saved configuration to disk
> 12) Reboot.
> 
> 
> Now that the very rough summary is out of the way, I have 4 questions.

How the entire upgrade process works would be a good topic to document on the wiki, if it isn’t there already.

> 1) I notice that the shell script /lib/upgrade/stage2 is doing a tight loop with kill -9 to terminate processes. However, it's only looping a maximum of 10 times, and it's going as fast as the shell can loop.
> 
> What's to stop this loop from quickly going through every process almost immediately 10 times, before a process that would be about to terminate terminates? The process in question may be handling some kind of IO, so the kernel wouldn't immediately terminate it.

How long are you thinking this I/O will take to complete?

> Shouldn't there be some very brief sleep at the end of each loop iteration to ensure that the processes that are going to practically terminate have done so?
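
For illustration, here is a minimal sketch of what a retry loop with a pause between passes could look like. It is not the code in /lib/upgrade/stage2; the function name kill_and_wait, its arguments, and the one-second pause are all assumptions made up for this example.

```shell
#!/bin/sh
# Hypothetical helper, NOT the code shipped in /lib/upgrade/stage2.
# Usage: kill_and_wait <max_attempts> <pid>...
# Returns 0 once none of the PIDs exist any more, 1 if some survive.
kill_and_wait() {
	kw_attempts=$1
	shift
	while [ "$kw_attempts" -gt 0 ]; do
		kw_remaining=""
		for kw_pid in "$@"; do
			# kill -0 sends no signal; it only tests that the PID exists.
			# Note that an unreaped zombie still counts as existing.
			if kill -0 "$kw_pid" 2>/dev/null; then
				kill -KILL "$kw_pid" 2>/dev/null
				kw_remaining="$kw_remaining $kw_pid"
			fi
		done
		[ -z "$kw_remaining" ] && return 0
		# Give the kernel time to tear the processes down before the
		# next pass, instead of burning all attempts in a few milliseconds.
		sleep 1
		kw_attempts=$((kw_attempts - 1))
	done
	# Final check after the last pause.
	for kw_pid in "$@"; do
		kill -0 "$kw_pid" 2>/dev/null && return 1
	done
	return 0
}
```

With the pause in place, the number of attempts becomes a rough timeout in seconds rather than a count of back-to-back passes.
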
> 
> 2) Why is the behavior on failure to terminate processes to just give up? That leaves devices hanging without any network connectivity. 

(1) It shouldn’t be happening very often.  Hopefully.
(2) If the box is in an indeterminate state then it’s not always clear that there’s a safe path forward, and sometimes this is something that a human needs to ascertain.
(3) You might also want to collect data about the failure so you can fix it and stop it from happening again.  Proceeding would efface all of that.

> A reboot with some logging on disk would allow for remote sysupgrades to have some kind of recoverability.

What if the failure left the box in a partially compromised state?  Would you want your firewall to “fail open”?  I wouldn’t.

> 3) Is looping over sigkill a reliable way to terminate all processes?

The man page for signal(2) says:

       The signals SIGKILL and SIGSTOP cannot be caught or ignored.

but yeah, if you’re in the kernel when the signal arrives, and you get stuck in there, then your process won’t go away and it becomes a moot point.
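
To see whether that is what is happening on a box that hangs, one option is to look for processes in uninterruptible sleep (state D). The sketch below reads the State: line from /proc/<pid>/status; the function name list_by_state and the optional directory argument are assumptions for this example, and it presumes awk is available in the image.

```shell
#!/bin/sh
# Hypothetical diagnostic helper.
# Usage: list_by_state <state-letter> [procdir]
# Prints "<pid> <name>" for every process whose State: field matches,
# e.g. "list_by_state D" for tasks stuck in uninterruptible sleep.
list_by_state() {
	ls_want=$1
	ls_procdir=${2:-/proc}
	for ls_status in "$ls_procdir"/[0-9]*/status; do
		# Skip an unmatched glob and processes that have already exited.
		[ -r "$ls_status" ] || continue
		awk -v want="$ls_want" '
			/^Name:/  { name = $2 }
			/^State:/ { state = $2 }
			/^Pid:/   { pid = $2 }
			END { if (state == want) print pid, name }
		' "$ls_status" 2>/dev/null
	done
}
```

Anything this prints for state D will not react to SIGKILL until whatever it is waiting on inside the kernel completes.
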

> I was under the impression that the only reliable way to ensure all processes terminate is to use cgroups, and put the processes to terminate in the freezer group and then kill them off after they've been frozen. Otherwise you have basically a race condition between the termination of processes and the creation of children. E.g. a fork-bomb could prevent all processes from being terminated.

That assumes you have a kernel with CGROUPS compiled in.

philipp@ubuntu16:~/lede$ grep -i cgroups .config
# CONFIG_KERNEL_CGROUPS is not set
philipp@ubuntu16:~/lede$
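
For kernels that do have the cgroup v1 freezer controller, the freeze-then-kill approach from the quoted paragraph might be sketched as follows. The function name freeze_and_kill, the ten-second limit, and the idea that the processes have already been placed in one freezer cgroup are assumptions for this example; nothing like this exists in the upgrade scripts as far as this thread establishes.

```shell
#!/bin/sh
# Hypothetical sketch using the cgroup v1 freezer interface files
# freezer.state and cgroup.procs.
# Usage: freeze_and_kill <cgroup-dir>
freeze_and_kill() {
	fk_cg=$1
	# No freezer controller (or no permission): nothing we can do here.
	[ -w "$fk_cg/freezer.state" ] || return 1

	echo FROZEN > "$fk_cg/freezer.state"
	# Freezing can take a moment; the file reads FREEZING until it is done.
	fk_tries=10
	while [ "$(cat "$fk_cg/freezer.state")" != "FROZEN" ]; do
		fk_tries=$((fk_tries - 1))
		if [ "$fk_tries" -le 0 ]; then
			echo THAWED > "$fk_cg/freezer.state"
			return 1
		fi
		sleep 1
	done

	# Frozen tasks cannot fork, so this PID list cannot grow under us.
	while read -r fk_pid; do
		kill -KILL "$fk_pid" 2>/dev/null
	done < "$fk_cg/cgroup.procs"

	# The pending SIGKILLs are acted on once the tasks are thawed.
	echo THAWED > "$fk_cg/freezer.state"
}
```

The point of freezing first is that the signal is queued for every member before any of them gets to run again, which closes the fork race.
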

Also, if you have fork-bombs, why haven’t they brought down the system earlier?  And why would you have untrusted services/programs on your system in the first place?  This isn’t a general computing base with naive users picking up malware inadvertently, etc.  It’s a closed software ecosystem (in theory… how it gets mangled downstream is a different question).

> 4) Why doesn't procd, prior to execvp the /sbin/upgraded program, shutdown all the services that are running? 

I’m speculating but it could be for any number of reasons…  Keeping procd simple…  There might be ordering or dependency that requires doing the shutdown in a particular order… There might be services (like squid if socks or proxy web access is required) that might be needed by the upgrade process in some scenarios…  
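
On the ordering point, a rough sketch of stopping services in a defined order before handing over to the upgrade could look like this. It assumes that the K* entries in /etc/rc.d encode the intended shutdown order and that each script accepts a "stop" argument; the function name stop_services is made up for this example, and this is not something procd is known to do today.

```shell
#!/bin/sh
# Hypothetical sketch, not current procd or sysupgrade behavior.
# Usage: stop_services [rcdir]
# Runs every executable K* script in rcdir with "stop", in glob order.
# Returns 0 only if every script reported success.
stop_services() {
	ss_rcdir=${1:-/etc/rc.d}
	ss_failed=0
	# The shell expands the glob in sorted order, so K10foo runs
	# before K90bar.
	for ss_script in "$ss_rcdir"/K*; do
		[ -x "$ss_script" ] || continue
		"$ss_script" stop || ss_failed=$((ss_failed + 1))
	done
	[ "$ss_failed" -eq 0 ]
}
```

Any service the upgrade itself still needs would have to be excluded from such a loop, which is one reason this may be harder than it looks.
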

> Maybe I'm just not seeing where it does this, so if that's the case, then I'm happy to be corrected.
> 
> But I'm under the impression that when not using cgroups, stopping all services would allow for anything that isn't double forked to be gracefully shutdown and cleaned up after itself.

You just lost me with that last bit.

When *not* using cgroups?  I thought you just argued for using cgroups to avoid the fork-race condition above…
