[OpenWrt-Devel] Squashfs breakage lottery with UBI WAS: [PATCH RFC 2/2] amp821xx: use newly added pad-squashfs for Meraki MR24

Jonas Gorski jonas.gorski at gmail.com
Sat Aug 31 08:09:55 EDT 2019


Hi,

On Sat, 31 Aug 2019 at 01:19, Christian Lamparter <chunkeey at gmail.com> wrote:
>
> On Friday, August 30, 2019 11:10:54 PM CEST Russell Senior wrote:
> > >>>>> "Christian" == Christian Lamparter <chunkeey at gmail.com> writes:
> >
> > Christian> Ok.
> >
> > Christian> I did push a patch titled: "build: remove harmful -nopad
> > Christian> option from mksquashfs"
> > Christian> <https://git.openwrt.org/?p=openwrt/openwrt.git;a=commit;h=1c0290c5cc6258c48b8ba46b4f9c85a21de4f875>
> >
> > Christian> so, let's see if this triggers more undefined behaviour and
> > Christian> exposes more hidden broken code.
> >
> > Just to re-iterate my earlier worry, if for example:
> >
> >   kernel_size + rootfs_with_padding_size
> >
> > crosses an erase block boundary that:
> >
> >   kernel_size + rootfs_without_padding_size
> >
> > does not, then we'll be wasting an erase block of flash space on NOR.
> Not to worry. I guess that part of the magic is also explained in the wiki:
> <https://openwrt.org/docs/techref/flash.layout#example_flash_partitioning>
> <https://openwrt.org/docs/techref/flash.layout#explanations>
> In case my attempt now gets too confusing :-/.
>
> The "rootfs" partition also contains the rootfs_data inside. It should
> be possible to verify this with any of the routers that use the "firmware"
> parition label.
>
> Please check /proc/mtd. For example on my WNDR3700v2 (which is very similar
> to your WNDR3800). It looks like this
>
> |# cat /proc/mtd
> |dev:    size   erasesize  name
> |mtd0: 00050000 00010000 "u-boot"
> |mtd1: 00020000 00010000 "u-boot-env"
> |mtd2: 00f80000 00010000 "firmware"
> |mtd3: 0018c440 00010000 "kernel"
> |mtd4: 00df3bc0 00010000 "rootfs"
> |mtd5: 00620000 00010000 "rootfs_data"
> |mtd6: 00010000 00010000 "art
>
> The rootfs is 14629824 Bytes = 13.95 MiB. (The kernel + uboot + env + art
> fills out the remaining ~2MiB to a total of 16 MiB). So the padding we both
> are talking about already has to exists between the squashfs portion and the
> jffs2/overlay portion inside the "rootfs" partition and it's a full erase-size
> block. Sadly, OpenWrt does not readily print the "end" of just the squashfs
> partition (as it does with the kernel partition above) and your message from
> earlier with the three routers didn't include the "squashfs-split" and its
> results. (I think that maybe this is the missing information that got lost?)

After a bit more investigation, those devices that use padjffs2 (or
have the rootfs
start at an at least 4k aligned offset) will be fine.

The issue are those devices with an unaligned rootfs start, which put their own
EOF marker in the image by the firmware util (e.g. brcm63xx or tp-link devices).

There it can happen that e.g. we get

0x18fff2  <squashfs end> 00 00 00 00
0x19000 00 00 00 00 ... <end of padding> FF FF
0x19010 FF FF FF ...
*
0x20000 DE AD C0 DE

The 0xff are only a guess, I haven't checked what the different
firmware utils use for padding the end of the rootfs - but mksquashfs
definitely uses 0x00.

which leads to:

1. squashfs-split will put the roofs_data partition start at 0x19000
because that's the first aligned erase block after the end of the
rootfs start + squashfs length
2. jffs2 will see that the first block is neither a valid jff2 node,
an EOF marker, nor all 0xff
3. jffs2 will refuse to erase/mount the filesystem (AFAIU the code) [1]

So removing the -nopad might actually break those devices.

We could fix this case by making sure that mksquashfs and all firmware
utils use 0xff's to pad (so the erase block will then be treated as
empty/all 0xff). But then there is the question how jffs2 reacts if
the first block is 0xff, and the second block is a valid jffs2 node,
which happens when we sysupgrade with keeping config on NOR devices.
The jffs2 code isn't the most readable code ... .


Regards
Jonas

[1]  https://elixir.bootlin.com/linux/latest/source/fs/jffs2/scan.c#L263

all dirty nodes will increase neither c->nr_free_blocks nor
empty_blocks nor badblocks, so the sum will be different from
c->nr_blocks, so jffs2 will refuse to mount

_______________________________________________
openwrt-devel mailing list
openwrt-devel at lists.openwrt.org
https://lists.openwrt.org/mailman/listinfo/openwrt-devel



More information about the openwrt-devel mailing list