bcm27xx: squashfs/f2fs sysupgrade broken because overlay is not padded/erased

Lukas Zeller luz at plan44.ch
Wed May 17 08:58:56 PDT 2023

For quite a while, I've been updating my OpenWrt 22.03 based RPi devices, configured with the squashfs/f2fs layout (CONFIG_TARGET_ROOTFS_SQUASHFS rather than CONFIG_TARGET_ROOTFS_EXT4FS), via sysupgrade with no problems.

However, during a recent update (22.03.5) in which the squashfs area grew only very slightly compared to the previous image (by one 64k block), all sysupgrades failed. The devices did start booting, but could neither mount nor create the overlay, and so stalled in an unusable state.

I found that the SD card area immediately following the squashfs blocks was simply random garbage.

Apparently, in all previous upgrades, the leftover data was garbage enough for mount_root to detect no f2fs and create a new one, so the issue did not surface.

In this particular case however, with only the first 64k block of the former f2fs overlay overwritten by the new squashfs, mount_root did detect an f2fs (maybe detecting the second copy of the f2fs superblock as the first?), but then failed to mount it:

  Press the [f] key and hit [enter] to enter failsafe mode
  Press the [1], [2], [3] or [4] key and hit [enter] to select the debug level
  [    9.598492] F2FS-fs (loop0): inconsistent node block, nid:3, node_footer[nid:0,ino:0,ofs:0,cpver:0,blkaddr:0]
  [    9.614435] F2FS-fs (loop0): Failed to read root inode
  [    9.630201] mount_root: failed to mount -t f2fs /dev/loop0 /tmp/overlay: Invalid argument
  mkdir: can't create directory '/boot': Read-only file system
  mount: mounting /dev/mmcblk0p1 on /boot failed: No such file or directory

In consequence, the devices then hung in an inaccessible state.

As soon as I took the SD card out and manually dd-ed a few hundred kB of zeroes right after the squashfs data, the devices, otherwise untouched, recovered and completed the sysupgrade (i.e. they created a new f2fs and restored /boot/sysupgrade.tgz into it).
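
The manual recovery described above can be sketched roughly as follows. This is only an illustration: the helper name zero_region is made up here, and the device name, offset and length in the usage note are placeholders that have to be determined for the actual card.

```shell
#!/bin/sh
# zero_region TARGET OFFSET_BYTES LENGTH_BYTES
# Hypothetical helper: overwrites LENGTH_BYTES of TARGET with zeroes,
# starting at OFFSET_BYTES. TARGET may be a block device or an image file.
# OFFSET_BYTES and LENGTH_BYTES must be multiples of 1024.
zero_region() {
	target="$1"; offset="$2"; length="$3"
	# conv=notrunc keeps everything after the zeroed region intact
	# when TARGET is a regular file.
	dd if=/dev/zero of="$target" bs=1024 \
		seek=$((offset / 1024)) count=$((length / 1024)) \
		conv=notrunc 2>/dev/null
}
```

For example, `zero_region /dev/sdX2 "$squashfs_end" 262144` would clear 256 kB, where /dev/sdX2 and $squashfs_end stand for the rootfs partition as seen in a card reader and the byte offset at which the squashfs data ends.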

AFAIK the main difference between default_do_upgrade() and bcm27xx platform_do_upgrade() is that the former uses mtd, which takes care of integrating empty (in case of sysupgrade -n) or already populated (in case of regular sysupgrade) f2fs data at the end of squashfs, while the latter just dd-es the squashfs data with no padding.

The bcm27xx upgrade is different also because it saves the sysupgrade.tgz to the platform specific /boot FAT partition, and only restores this AFTER rebooting, in 79_move_config (via preinit_mount_root hook).

I really need help to even figure out a feasible idea to fix this, let alone implement it in a clean way. What comes to my mind so far is:

- trying to just zero out enough blocks after the squashfs data in `target/linux/bcm27xx/base-files/lib/upgrade/platform.sh` platform_do_upgrade() to make sure a subsequent mount_root will NOT attempt to mount broken remains of an f2fs. However, as this function also still handles the ext4-only case and even multiple-partition images, I don't understand it well enough to see how to do that properly.

- rearranging things to separate the squashfs/f2fs case from the platform specific ext4 case, i.e. using the standard way of creating the empty or restored overlay f2fs right after writing the new squashfs, as default_do_upgrade() does. Only on bcm27xx, there's no mtd, but mmc. So this would require even more detailed knowledge that I don't have.
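
As a rough sketch of the first idea: the end of the squashfs can be read from its superblock (the 64-bit little-endian bytes_used field at byte offset 40), rounded up to the next 64k boundary, and the area right after that zeroed. All function names below are hypothetical, the 64k alignment is an assumption based on the one-block growth observed above, and the amount to wipe is arbitrary. This does not show how it would fit into platform_do_upgrade() or its ext4 and multi-partition cases.

```shell
#!/bin/sh
# Sketch only: all function names here are hypothetical, not OpenWrt API.

# Print the squashfs bytes_used field (8 bytes, little-endian, offset 40).
squashfs_bytes_used() {
	hex=""
	# od prints the lowest byte first; prepending each byte yields the
	# value as one big-endian hex number.
	for b in $(dd if="$1" bs=1 skip=40 count=8 2>/dev/null | od -An -v -tx1); do
		hex="$b$hex"
	done
	[ -n "$hex" ] || return 1
	echo $((0x$hex))
}

# Round a byte count up to the next 64k boundary (assumed overlay alignment).
align_64k() {
	echo $(( ($1 + 65535) / 65536 * 65536 ))
}

# Zero WIPE_KB kilobytes (default 256) of DEV right after the aligned end
# of the squashfs, so that no stale f2fs superblock is left to be detected.
wipe_after_squashfs() {
	dev="$1"; wipe_kb="${2:-256}"
	start=$(align_64k "$(squashfs_bytes_used "$dev")") || return 1
	dd if=/dev/zero of="$dev" bs=1024 seek=$((start / 1024)) \
		count="$wipe_kb" conv=notrunc 2>/dev/null
}
```

The wipe would have to run after the new squashfs has been written and before the reboot, on the partition that holds the squashfs. Wiping discards whatever overlay was there, which matches the current behaviour on bcm27xx only under the assumption that the configuration is restored from /boot/sysupgrade.tgz afterwards.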

A post about this in the openwrt forum [1] got no response so far, so I'm trying here to reach someone who really understands the peculiarities of the bcm27xx sysupgrade (maybe @stijn?) and could help.

Thanks in advance!

PS: while digging through this, I found a detail that might be the cause for another bcm27xx sysupgrade problem, see [2] - default_do_upgrade() flushes kernel inode caches since 3d12b479 (Nov 2020), but bcm27xx platform_do_upgrade() does not. Just mentioning it hoping someone with real knowledge might see through that issue, too...
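
For reference, flushing the kernel caches from a shell script is usually done by syncing and then writing to /proc/sys/vm/drop_caches (writing 3 drops the page cache plus dentries and inodes). A minimal sketch follows; the function name and the optional path argument are made up for illustration, and this is not a copy of what default_do_upgrade() does.

```shell
#!/bin/sh
# flush_caches [CONTROL_FILE]
# Hypothetical helper: write dirty data out, then ask the kernel to drop
# its clean page cache, dentries and inodes. CONTROL_FILE defaults to the
# real sysctl file, which requires root to write.
flush_caches() {
	ctl="${1:-/proc/sys/vm/drop_caches}"
	sync
	echo 3 > "$ctl"
}
```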

[1] https://forum.openwrt.org/t/broken-f2fs-after-sysupgrade-on-rpi-config-not-restored/159703
[2] https://forum.openwrt.org/t/raspberrypi-sysupgrade-looses-overlay-when-boot-partition-gets-bigger/139138