[OpenWrt-Devel] Squashfs breakage lottery with UBI WAS: [PATCH RFC 2/2] amp821xx: use newly added pad-squashfs for Meraki MR24

Christian Lamparter chunkeey at gmail.com
Sat Aug 31 09:31:32 EDT 2019


Hello,

On Saturday, August 31, 2019 2:09:55 PM CEST Jonas Gorski wrote:
> On Sat, 31 Aug 2019 at 01:19, Christian Lamparter <chunkeey at gmail.com> wrote:
> >
> > On Friday, August 30, 2019 11:10:54 PM CEST Russell Senior wrote:
> > > >>>>> "Christian" == Christian Lamparter <chunkeey at gmail.com> writes:
> > >
> > > Christian> Ok.
> > >
> > > Christian> I did push a patch titled: "build: remove harmful -nopad
> > > Christian> option from mksquashfs"
> > > Christian> <https://git.openwrt.org/?p=openwrt/openwrt.git;a=commit;h=1c0290c5cc6258c48b8ba46b4f9c85a21de4f875>
> > >
> > > Christian> so, let's see if this triggers more undefined behaviour and
> > > Christian> exposes more hidden broken code.
> > >
> > > Just to re-iterate my earlier worry, if for example:
> > >
> > >   kernel_size + rootfs_with_padding_size
> > >
> > > crosses an erase block boundary that:
> > >
> > >   kernel_size + rootfs_without_padding_size
> > >
> > > does not, then we'll be wasting an erase block of flash space on NOR.
> > Not to worry. I guess that part of the magic is also explained in the wiki:
> > <https://openwrt.org/docs/techref/flash.layout#example_flash_partitioning>
> > <https://openwrt.org/docs/techref/flash.layout#explanations>
> > In case my attempt now gets too confusing :-/.
> >
> > The "rootfs" partition also contains the rootfs_data inside. It should
> > be possible to verify this with any of the routers that use the "firmware"
> > parition label.
> >
> > Please check /proc/mtd. For example on my WNDR3700v2 (which is very similar
> > to your WNDR3800). It looks like this
> >
> > |# cat /proc/mtd
> > |dev:    size   erasesize  name
> > |mtd0: 00050000 00010000 "u-boot"
> > |mtd1: 00020000 00010000 "u-boot-env"
> > |mtd2: 00f80000 00010000 "firmware"
> > |mtd3: 0018c440 00010000 "kernel"
> > |mtd4: 00df3bc0 00010000 "rootfs"
> > |mtd5: 00620000 00010000 "rootfs_data"
> > |mtd6: 00010000 00010000 "art
> >
> > The rootfs is 14629824 Bytes = 13.95 MiB. (The kernel + uboot + env + art
> > fills out the remaining ~2MiB to a total of 16 MiB). So the padding we both
> > are talking about already has to exists between the squashfs portion and the
> > jffs2/overlay portion inside the "rootfs" partition and it's a full erase-size
> > block. Sadly, OpenWrt does not readily print the "end" of just the squashfs
> > partition (as it does with the kernel partition above) and your message from
> > earlier with the three routers didn't include the "squashfs-split" and its
> > results. (I think that maybe this is the missing information that got lost?)
> 
> After a bit more investigation, those devices that use padjffs2 (or
> have the rootfs
> start at an at least 4k aligned offset) will be fine.
> 
> The issue are those devices with an unaligned rootfs start, which put their own
> EOF marker in the image by the firmware util (e.g. brcm63xx or tp-link devices).
> 
> There it can happen that e.g. we get
> 
> 0x18fff2  <squashfs end> 00 00 00 00
> 0x19000 00 00 00 00 ... <end of padding> FF FF
> 0x19010 FF FF FF ...
> *
> 0x20000 DE AD C0 DE
> 
> The 0xff are only a guess, I haven't checked what the different
> firmware utils use for padding the end of the rootfs - but mksquashfs
> definitely uses 0x00.
> 
> which leads to:
> 
> 1. squashfs-split will put the roofs_data partition start at 0x19000
> because that's the first aligned erase block after the end of the
> rootfs start + squashfs length
> 2. jffs2 will see that the first block is neither a valid jff2 node,
> an EOF marker, nor all 0xff
> 3. jffs2 will refuse to erase/mount the filesystem (AFAIU the code) [1]
> 
> So removing the -nopad might actually break those devices.
> 
> We could fix this case by making sure that mksquashfs and all firmware
> utils use 0xff's to pad (so the erase block will then be treated as
> empty/all 0xff). But then there is the question how jffs2 reacts if
> the first block is 0xff, and the second block is a valid jffs2 node,
> which happens when we sysupgrade with keeping config on NOR devices.
> The jffs2 code isn't the most readable code ... .
>
No need to worry, see one of the previous mails in this thread:

http://lists.infradead.org/pipermail/openwrt-devel/2019-August/018638.html

It contains a patch at the end titled:
"[PATCH] base-files: pad root.squashfs to 64KiB in ubi volumes"
This is another approach that just deals with the UBI+squashfs
issue but works with "-nopad". Soooooo.... do we all agree there?  

Cheers,
Christian



_______________________________________________
openwrt-devel mailing list
openwrt-devel at lists.openwrt.org
https://lists.openwrt.org/mailman/listinfo/openwrt-devel



More information about the openwrt-devel mailing list