[OpenWrt-Devel] Squashfs breakage lottery with UBI WAS: [PATCH RFC 2/2] amp821xx: use newly added pad-squashfs for Meraki MR24

Christian Lamparter chunkeey at gmail.com
Sat Aug 24 12:09:57 EDT 2019


On Saturday, August 24, 2019 2:18:55 AM CEST Russell Senior wrote:
> >>>>> "Christian" == Christian Lamparter <chunkeey at gmail.com> writes:
> 
> > I've posted a similar message to the bugreport:
> > <https://bugs.openwrt.org/index.php?do=details&task_id=2460>
> 
> I should have filed the bug first and linked it in my patch.
I think it's fine. It depends on whenever there will be a
discussion and where it will take place... But yeah, nobody
can tell in advance how this will go.

On Saturday, August 24, 2019 2:59:31 AM CEST Russell Senior wrote:
> >>>>> "Russell" == Russell Senior <russell at personaltelco.net> writes:
> 
> >>>>> "Christian" == Christian Lamparter <chunkeey at gmail.com> writes:
> 
> Russell> It's mostly inferred from the fact that I saw the error and
> Russell> kernel panic, and glancing at the kernel squashfs code. I am
> Russell> not pretending to understand completely how the squashfs kernel
> Russell> code works, but the position of the error relative to the size
> Russell> of the rootfs in my panic case strongly suggests it was trying
> Russell> to read past the end of the ubi volume.
> 
> Oh, and I got important help finding this from Jonas Gorski. I was
> remiss in not mentioning that.
> 
Ok, Let's add him to the CC then. As well as some of the 
"ramips: Fix and tidy up IMAGE_SIZE #2226" and 
"[RFC] Use DTS firmware partition to obsolete IMAGE_SIZE #2310"

https://github.com/openwrt/openwrt/pull/2226
https://github.com/openwrt/openwrt/pull/2310

crowd. Because this will likely affect them as well... 
But they might not know it. In any case: "Welcome everyone! :-D".

> > What's happening here is that mksquashfs4 is being told
> > through the "-nopad" option to "do not pad filesystem to a
> > multiple of 4K" [0].
> 
> > |define Image/mkfs/squashfs |
> > $(STAGING_DIR_HOST)/bin/mksquashfs4 $(call
> > mkfs_target_dir,$(1)) $@ \ | -nopad -noappend -root-owned \ |
> > -comp $(SQUASHFSCOMP) $(SQUASHFSOPT) \ | -processors 1 |endef
> 
> > My guess is that this affects more than just the MR24 (I'm
> > looking at you too: pad2jffs and various pad-to $value)
> > . I've tried tracking down the change that added the "-nopad"
> > and found it in a commit from 2005 titled: "add some changes
> > from whiterussian to head" [1] [2]:
> 
> I agree that other devices where rootfs is squashfs and lives on a ubi
> volume are probaby also vulnerable to this problem. Regrettably, I haven't
> thought of any other of those devices that I have on hand to test. 
> 
> > | $(KDIR)/root.squashfs: | @mkdir -p $(KDIR)/root/jffs |-
> > $(STAGING_DIR)/bin/mksquashfs-lzma $(KDIR)/root $@ -noappend
> > -root-owned -le |+ $(STAGING_DIR)/bin/mksquashfs-lzma
> > $(KDIR)/root $@ -nopad -noappend -root-owned -le
> 
> 
> > So, this is really old...
> 
> > Question is, should we just drop the -nopad? Since as you
> > established, in the described corner-case (see above)
> > squashfs needs this 4k padding and the generated squashfs
> > could be considered broken otherwise?
> 
> I'm under the impression that the -nopad makes sense for NOR flash where
> the kernel and rootfs are glued together, padding the isolated rootfs
> would cause alignment problems or wasted space in the combined blobs.

Yes, that's the nod to padjffs2. That said,
<https://sourceforge.net/p/squashfs/mailman/message/28307755/> makes
it sound like that apart from the BLOCKSIZE, we also need to PAGE_SIZE?

(I think the APM821XX is a special case, since it can do 64KiB Pages
as well as it's 32MiB SLC NAND that uses 16 KiB erase-blocks. So a
PAGE can span up to 4 pages.

> 
> > (Judging from your
> > message, you went through the kernel code. Can you please
> > share the place where the lack of the padding is breaking the
> > fs?)
> 
> It's mostly inferred from the fact that I saw the error and kernel
> panic, and glancing at the kernel squashfs code. I am not pretending to
> understand completely how the squashfs kernel code works, but the
> position of the error relative to the size of the rootfs in my panic
> case strongly suggests it was trying to read past the end of the ubi
> volume.
> 
> The error comes in the kernel's fs/squashfs/block.c in the
> squashfs_read_data() function. There are a bunch of conditions (at least
> 5) within the function (see "goto read_failure;") that will lead to that
> message being printed.
> 
Well, that's a pity this could have saved a lot of time.

I've cobbered together a patch that deals with some of the
padding issues at "ubimkvol" and "ubinize" time. The idea
is that ideally we want to do the padding when we know
PAGE_SIZE and the BLOCKSIZE/Erasesize (MR24 blocksize in
image/Makefile seems wrong as well...).

But for now, it's set to 64KiB. If this is the way forward
we add enable getconf and get the PAGESIZE at runtime. If not,
we need to come up with something else.
(It's also possible to do some changes in  ubi's block code or
squashfs kernel code to mitigate the padding, but I don't think
the maintainers will even look at it).


Regards,
Christian
---


More information about the openwrt-devel mailing list