bcm27xx: squashfs/f2fs sysupgrade broken because overlay is not padded/erased
luz at plan44.ch
Wed May 17 08:58:56 PDT 2023
For quite a while, I've been updating my OpenWrt 22.03 based RPi devices, configured with the squashfs/f2fs layout (CONFIG_TARGET_ROOTFS_SQUASHFS rather than CONFIG_TARGET_ROOTFS_EXT4FS), via sysupgrade with no problems.
However, during a recent update (22.03.5) where the squashfs area grew only very slightly compared to the previous image (by one 64k block), all sysupgrades failed. The devices did start booting, but could neither mount nor create the overlay, and so stalled in an unusable state.
I found that the SD card area immediately following the squashfs blocks was simply random garbage.
Apparently, in all previous upgrades, the data was garbage enough for mount_root to detect no f2fs and create a new one, so the issue did not surface.
In this particular case however, with only the first 64k block of the former f2fs overlay overwritten by the new squashfs, mount_root did detect an f2fs (maybe detecting the second copy of the f2fs superblock as the first?), but then failed to mount it:
Press the [f] key and hit [enter] to enter failsafe mode
Press the , ,  or  key and hit [enter] to select the debug level
[ 9.598492] F2FS-fs (loop0): inconsistent node block, nid:3, node footer[nid:0,ino:0,ofs:0,cpver:0,blkaddr:0]
[ 9.614435] F2FS-fs (loop0): Failed to read root inode
[ 9.630201] mount_root: failed to mount -t f2fs /dev/loop0 /tmp/overlay: Invalid argument
mkdir: can't create directory '/boot': Read-only file system
mount: mounting /dev/mmcblk0p1 on /boot failed: No such file or directory
In consequence, the devices then hung in an inaccessible state.
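To check on a workstation whether stale f2fs remains are still visible where the overlay is expected, one could look for the F2FS superblock magic (0xF2F52010, stored little-endian 1024 bytes into the filesystem). The following is only a minimal sketch: the helper name has_f2fs_magic is made up for illustration, the overlay offset has to be supplied by the caller, and whether mount_root performs exactly this check is an assumption on my part.

```shell
#!/bin/sh
# Sketch: report whether an F2FS superblock magic is present at a given
# byte offset of an image file or block device.
# The magic 0xF2F52010 is stored little-endian, i.e. as the on-disk
# bytes 10 20 f5 f2, at 1024 bytes from the start of the filesystem.
has_f2fs_magic() {
	img="$1"   # image file or block device (hypothetical, e.g. /dev/sdX2)
	off="$2"   # byte offset at which the overlay is ASSUMED to start
	magic=$(dd if="$img" bs=1 skip=$((off + 1024)) count=4 2>/dev/null |
		od -An -tx1 -v | tr -d ' \n')
	[ "$magic" = "1020f5f2" ]
}

# Usage (device name and offset are placeholders):
#   has_f2fs_magic /dev/sdX2 3276800 && echo "stale f2fs magic found"
```

The function returns success only when the four magic bytes are present, so it can be used directly in an `if` or `&&` chain.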
As soon as I took the SD card out and manually dd-ed a few hundred kB of zeroes immediately after the squashfs data, the otherwise untouched devices recovered and completed the sysupgrade (i.e. they created a new f2fs and restored /boot/sysupgrade.tgz into it).
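The manual recovery can be sketched roughly as follows. This is an illustration rather than the exact commands I ran: zero_region is a made-up helper name, the device name and numbers in the usage comment are placeholders, and offset and length are assumed to be multiples of 4096 bytes.

```shell
#!/bin/sh
# Sketch: overwrite `len` bytes starting at byte offset `off` of `dev`
# with zeroes, leaving everything else untouched.
# ASSUMPTION: off and len are multiples of 4096.
zero_region() {
	dev="$1"   # image file or block device
	off="$2"   # byte offset of the first byte to zero
	len="$3"   # number of bytes to zero
	bs=4096
	# conv=notrunc keeps the rest of a regular file intact
	dd if=/dev/zero of="$dev" bs="$bs" \
		seek=$((off / bs)) count=$((len / bs)) conv=notrunc 2>/dev/null
}

# Usage (placeholders): zero 512 KiB right after a squashfs assumed to
# end at byte 3276800 of the root partition:
#   zero_region /dev/sdX2 3276800 $((512 * 1024))
```

Double-check the target device before running anything like this, as dd will overwrite whatever it is pointed at.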
AFAIK the main difference between default_do_upgrade() and bcm27xx platform_do_upgrade() is that the former uses mtd, which takes care of integrating empty (in case of sysupgrade -n) or already populated (in case of regular sysupgrade) f2fs data at the end of squashfs, while the latter just dd-es the squashfs data with no padding.
The bcm27xx upgrade is different also because it saves the sysupgrade.tgz to the platform specific /boot FAT partition, and only restores this AFTER rebooting, in 79_move_config (via preinit_mount_root hook).
I really need help to even figure out a feasible idea to fix this, let alone implement it in a clean way. What comes to my mind so far is:
- trying to just zero out enough blocks after the squashfs data in `target/linux/bcm27xx/base-files/lib/upgrade/platform.sh` platform_do_upgrade() to make sure a subsequent mount_root will NOT attempt to mount the broken remains of an f2fs. However, as this function also handles the ext4-only case and even multi-partition images, I don't understand it well enough to see how to do that properly.
- rearranging things to separate the squashfs/f2fs case from the platform-specific ext4 case, i.e. using the standard way of creating the empty or restored overlay f2fs right after writing the new squashfs, as default_do_upgrade() does. However, bcm27xx has no mtd, only mmc, so this would require even more detailed knowledge that I don't have.
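For the first idea, a zeroing step would need to know where the new squashfs ends. A rough sketch of how that could be derived from the squashfs superblock is shown below (magic "hsqs" at offset 0, 64-bit little-endian bytes_used field at offset 40). The function names are made up, this is not code from platform.sh, and the rounding of the overlay start to the next 64 KiB boundary is my assumption based on the 64k blocks observed above.

```shell
#!/bin/sh
# Sketch: print the number of bytes used by a squashfs image, read from
# the bytes_used field (8 bytes, little-endian, at offset 40).
squashfs_bytes_used() {
	img="$1"   # file or partition that starts with the squashfs
	# A little-endian squashfs starts with the magic "hsqs"
	[ "$(dd if="$img" bs=1 count=4 2>/dev/null)" = "hsqs" ] || return 1
	total=0
	shift_bits=0
	# od prints the 8 bytes as unsigned decimals, lowest byte first
	for b in $(dd if="$img" bs=1 skip=40 count=8 2>/dev/null | od -An -tu1 -v); do
		total=$((total + (b << shift_bits)))
		shift_bits=$((shift_bits + 8))
	done
	echo "$total"
}

# ASSUMPTION: the overlay begins at the squashfs end rounded up to the
# next 64 KiB boundary.
overlay_start() {
	used=$(squashfs_bytes_used "$1") || return 1
	echo $(( (used + 65535) / 65536 * 65536 ))
}
```

The value printed by overlay_start would then be the offset, relative to the start of the root partition, at which zeroes would have to be written after the new squashfs has been dd-ed into place.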
A post about this in the OpenWrt forum has had no response so far, so I'm trying here to reach someone who really understands the peculiarities of the bcm27xx sysupgrade (maybe @stijn?) and could help.
Thanks in advance!
PS: while digging through this, I found a detail that might be the cause for another bcm27xx sysupgrade problem, see  - default_do_upgrade() flushes kernel inode caches since 3d12b479 (Nov 2020), but bcm27xx platform_do_upgrade() does not. Just mentioning it hoping someone with real knowledge might see through that issue, too...