Interactive boot

Fri Dec 24 15:14:19 PST 2021

Hello,

Over the past few years I've had to troubleshoot a lot of things in the
boot process, and I finally decided to add an interactive boot feature
to OpenWRT.  It will make troubleshooting and debugging easier, and it
will also provide a nice way to hack the boot process and explore how
OpenWRT works and interacts with the kernel.

I have it working all through initd, kmodloader for /etc/modules-boot.d
(preinit still occurs atomically) and part of procd's init process, but
now that I better understand the whole of the init process and related
sources, it's time for me to redesign it and start over.

My current patch set is based upon v21.02.0 and is enabled by a new
CONFIG_TARGET_DIAGNOSTICS under menuconfig --> "Image configuration".
Makefiles for libubox, procd and ubox pick up this value and pass
-DDIAGNOSTICS=1 to CMake, which adds the same to CPPFLAGS.  Thus, when it
is not enabled, there is zero or nearly zero cost in executable size
(.text, .data, .bss, etc.) or performance.  I would propose rolling
failsafe boot into this option, since they are both insecure.  The
feature is triggered by adding "debug" or "debug=1" to the kernel
command line (/proc/cmdline), and I've also added options to set the
debug level for a few programs as well:

[    5.384877] mtk_soc_eth 10100000.ethernet eth0: port 4 link up (10Mbps/Full duplex)
[    6.578771] init: Console is alive
[    6.589791] init: init_debug    2
[    6.590002] init: debug         1
[    6.597260] init: kmod_debug    2
[    6.604249] init: plugd_debug   0
[    6.611317] init: preinit_debug 0
[    6.618390] init: procd_debug   2
[    6.625462] init: nowatchdog    1
[    6.632533] init: Initing diagnostics

"nowatchdog" is mostly there for running kdb/kgdb and everything else
works (or eventually will) with watchdog enabled.  (I'll also eventually
need a way to manage the console transition for launching kgdb over the
serial port.)
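
As an illustration of how options like the ones logged above could be
picked out of the kernel command line, here is a minimal sketch.  It is
not the code from my patch set: cmdline_get_int() is a hypothetical
helper name, and the rule that a bare key (such as "debug" or
"nowatchdog") means 1 is an assumption based on "debug" and "debug=1"
being equivalent.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

static int is_sep(char c)
{
	return c == ' ' || c == '\t' || c == '\n';
}

/*
 * Hypothetical helper: look up "key" or "key=N" in a kernel command
 * line string (the contents of /proc/cmdline).  Returns def when the
 * key is absent, 1 for a bare "key", and N for "key=N".
 */
static int cmdline_get_int(const char *cmdline, const char *key, int def)
{
	const char *p = cmdline;
	size_t klen;

	assert(cmdline && key);
	klen = strlen(key);

	while (*p) {
		const char *end;
		size_t tlen;

		/* skip separators between options */
		while (is_sep(*p))
			p++;
		if (!*p)
			break;

		/* find the end of this option */
		end = p;
		while (*end && !is_sep(*end))
			end++;
		tlen = (size_t)(end - p);

		if (tlen >= klen && !strncmp(p, key, klen)) {
			if (tlen == klen)
				return 1;			/* bare "debug" */
			if (p[klen] == '=')
				return atoi(p + klen + 1);	/* "debug=2" */
		}
		p = end;
	}
	return def;
}
```

Matching whole options rather than substrings keeps "debug" from
matching inside "init_debug=2" or a longer option name.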

After hacking at this some, I thought that having a "diagd" daemon that
controls the console would be the best approach and using ubus would
probably be the optimal way to facilitate that, but we don't have ubus
until fairly late, at the STATE_UBUS stage of procd.  I could possibly just
have a non-ubus daemon with its own socket, but that seems like a waste.

I haven't dug deeply into ubus sources yet, so what does ubus need to
run aside from /tmp being mounted?

Another option is similar to what I'm doing now -- a collection of
functions in libubox that init, procd, kmodloader, udevtrigger, etc.
call to provide the functionality.  But this makes state management a mess.

I use Gentoo on my workstation and my initial thought was an interactive
process similar to that -- a linear set of services that should be
started or skipped.  But in OpenWRT we have several levels, more like
stepping through code in a debugger.  The transition from initd to procd
with PID=1 doesn't need a logical separation, so I keep these at the
same level and end up with this:

enum initd_state {
	STATE_INITD_EARLY,	/* initd early() */
	STATE_KMOD_BOOT,	/* initd: /sbin/kmodloader /etc/modules-boot.d */
	STATE_PLUGD,		/* initd: /sbin/procd -h /etc/hotplug-preinit.json */
	STATE_PREINIT,		/* initd: /bin/sh /etc/preinit */
	STATE_CHECK_SYSUPGRADE,	/* initd: check_sysupgrade() */
	STATE_PROCD,		/* execvp /sbin/procd */
	STATE_PROCD_EARLY,	/* procd: hotplug and coldplug */
	STATE_UBUS,		/* procd: start ubus and connect to it */
	STATE_INITTAB,		/* procd: inittab */
	STATE_RUNNING,
	STATE_SHUTDOWN,
	STATE_HALT,

	STATE_COUNT
};
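
To make the idea of stepping through these levels concrete, here is a
rough sketch of the helpers a prompt loop could use on top of that enum.
The function names and the name table are hypothetical, not taken from
my patches; the only thing assumed from above is the enum itself and
that prompts label a state without its STATE_ prefix (as in
"KMOD_BOOT").

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum initd_state {
	STATE_INITD_EARLY, STATE_KMOD_BOOT, STATE_PLUGD, STATE_PREINIT,
	STATE_CHECK_SYSUPGRADE, STATE_PROCD, STATE_PROCD_EARLY, STATE_UBUS,
	STATE_INITTAB, STATE_RUNNING, STATE_SHUTDOWN, STATE_HALT,
	STATE_COUNT
};

/* Labels shown in prompts, indexed by state (hypothetical table). */
static const char *const state_names[STATE_COUNT] = {
	[STATE_INITD_EARLY]      = "INITD_EARLY",
	[STATE_KMOD_BOOT]        = "KMOD_BOOT",
	[STATE_PLUGD]            = "PLUGD",
	[STATE_PREINIT]          = "PREINIT",
	[STATE_CHECK_SYSUPGRADE] = "CHECK_SYSUPGRADE",
	[STATE_PROCD]            = "PROCD",
	[STATE_PROCD_EARLY]      = "PROCD_EARLY",
	[STATE_UBUS]             = "UBUS",
	[STATE_INITTAB]          = "INITTAB",
	[STATE_RUNNING]          = "RUNNING",
	[STATE_SHUTDOWN]         = "SHUTDOWN",
	[STATE_HALT]             = "HALT",
};

static const char *initd_state_name(enum initd_state s)
{
	assert((unsigned)s < STATE_COUNT);
	return state_names[s];
}

/* Boot states advance in order; RUNNING and later stay where they are. */
static enum initd_state initd_state_next(enum initd_state s)
{
	assert((unsigned)s < STATE_COUNT);
	return s < STATE_RUNNING ? (enum initd_state)(s + 1) : s;
}

/* Returns the state for a label, or -1 if the label is unknown. */
static int initd_state_from_name(const char *name)
{
	int i;

	assert(name);
	for (i = 0; i < STATE_COUNT; i++)
		if (!strcmp(name, state_names[i]))
			return i;
	return -1;
}
```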

So when walking through STATE_KMOD_BOOT, I have prompts something like this:

[    4.320805] Run /sbin/init as init process
...
[    6.632533] init: Initing diagnostics
KMOD_BOOT [y/n], [f/F]inish, [c]ontinue, [s]hell, [m]ore
y
[  178.174792] init: Starting /sbin/kmodloader
[  178.299030] kmodloader: Interactive mode enabled.
[  186.055402] kmodloader: loading kernel modules from /etc/modules-boot.d/*
KMOD_BOOT: crypto_hash [y/n], [f]inish, [c]ontinue, [s]hell, [m]ore
m
KMOD_BOOT: crypto_hash [y/n], [f]inish, [c]ontinue, [s]hell, re[b]oot, debu[g], [h]elp, [l]ess
h
[y]es      start/run 'KMOD_BOOT: crypto_hash'
[n]o       skip it
[f]inish   finish all of KMOD_BOOT
[c]ontinue exit interactive mode and continue
[s]hell    run a shell
re[b]oot   immediately reboot without syncing (requires sysrq kernel support)
[d]ebug    start kdb (requires kernel support) (FIXME: disable watchdog)
[m]ore     show all options
[l]ess     show fewer options
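
The key handling behind a prompt like this could be sketched as follows.
This is an illustration rather than the code in my patches: the enum and
function names are hypothetical, only lowercase keys are handled, and
since the prompt above shows "debu[g]" while the help text shows
"[d]ebug", the sketch simply accepts both keys for kdb.

```c
#include <assert.h>

/* Hypothetical action codes for the interactive prompt. */
enum diag_action {
	DIAG_INVALID,
	DIAG_YES,	/* start/run this item */
	DIAG_NO,	/* skip it */
	DIAG_FINISH,	/* finish the current level without prompting */
	DIAG_CONTINUE,	/* leave interactive mode entirely */
	DIAG_SHELL,
	DIAG_REBOOT,
	DIAG_DEBUG,
	DIAG_MORE,
	DIAG_LESS,
	DIAG_HELP
};

/* Map one line of console input (e.g. "y\n") to an action. */
static enum diag_action diag_parse_line(const char *line)
{
	assert(line);

	/* ignore leading blanks so " y" works too */
	while (*line == ' ' || *line == '\t')
		line++;

	switch (*line) {
	case 'y': return DIAG_YES;
	case 'n': return DIAG_NO;
	case 'f': return DIAG_FINISH;
	case 'c': return DIAG_CONTINUE;
	case 's': return DIAG_SHELL;
	case 'b': return DIAG_REBOOT;
	case 'd':			/* "[d]ebug" in the help text */
	case 'g': return DIAG_DEBUG;	/* "debu[g]" in the prompt */
	case 'm': return DIAG_MORE;
	case 'l': return DIAG_LESS;
	case 'h': return DIAG_HELP;
	default:  return DIAG_INVALID;
	}
}

/*
 * Nonzero if the action answers the prompt; shell, more, less, help
 * and invalid input mean the same prompt is shown again afterwards.
 */
static int diag_action_is_answer(enum diag_action a)
{
	return a == DIAG_YES || a == DIAG_NO ||
	       a == DIAG_FINISH || a == DIAG_CONTINUE;
}
```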

Of course, this is pretty ugly visually right now.  I think it would be
easier to read if the levels were displayed something like this (and
probably with some ANSI colors):

kmodloader (modules-boot.d):
    usb-common:
        [y/n], [f]inish, [c]ontinue, [s]hell, [m]ore?

And for services at state STATE_INITTAB:

inittab:
    sysinit:
        dropbear:
            [y/n], [f]inish, [c]ontinue, [s]hell, [m]ore?

So with that explained, my intention is that from any level you can
finish that boot level without any further prompting or exit interactive
mode entirely.  Thus, the calling program needs to know if the 'c'
option was chosen by its child.  If I don't use a "diagd" daemon, the
other option I considered for state changes is just to keep the
diagnostic state in files under /tmp/ubox-diag/.
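
As a rough sketch of the file-based variant, a child such as kmodloader
could record that 'c' or 'f' was chosen and the caller could read it
back after the child exits.  Everything here is an assumption for
illustration: the file name "mode", the three mode values and the
function names are made up, and the directory is passed in as a
parameter so that /tmp/ubox-diag is not hard-coded.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical modes shared between init, procd, kmodloader, etc. */
enum diag_mode {
	DIAG_MODE_INTERACTIVE,	/* keep prompting */
	DIAG_MODE_FINISH_LEVEL,	/* 'f': no more prompts at this level */
	DIAG_MODE_CONTINUE	/* 'c': leave interactive mode entirely */
};

/* Write the mode to <dir>/mode.  Returns 0 on success, -1 on error. */
static int diag_mode_save(const char *dir, enum diag_mode mode)
{
	char path[256];
	FILE *f;

	assert(dir);
	if (snprintf(path, sizeof(path), "%s/mode", dir) >= (int)sizeof(path))
		return -1;	/* path would be truncated */

	f = fopen(path, "w");
	if (!f)
		return -1;
	fprintf(f, "%d\n", (int)mode);
	return fclose(f) ? -1 : 0;
}

/*
 * Read the mode back from <dir>/mode.  A missing or unparsable file
 * falls back to interactive mode, so a failed write never silently
 * skips the prompts.
 */
static enum diag_mode diag_mode_load(const char *dir)
{
	char path[256];
	int v = DIAG_MODE_INTERACTIVE;
	FILE *f;

	assert(dir);
	if (snprintf(path, sizeof(path), "%s/mode", dir) >= (int)sizeof(path))
		return DIAG_MODE_INTERACTIVE;

	f = fopen(path, "r");
	if (!f)
		return DIAG_MODE_INTERACTIVE;
	if (fscanf(f, "%d", &v) != 1 ||
	    v < DIAG_MODE_INTERACTIVE || v > DIAG_MODE_CONTINUE)
		v = DIAG_MODE_INTERACTIVE;
	fclose(f);
	return (enum diag_mode)v;
}
```

This only works once /tmp is mounted, and it does nothing about two
processes writing at the same time, which is part of why the state
management feels messy without a daemon.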

Any thoughts on any of this and especially better design ideas are
appreciated.

Also, I would like to eventually add an interactive feature to change
the debug level for any of the programs involved.

Thanks,
Daniel