[OpenWrt-Devel] mtd-split, spi-nor, squashfs and jffs2

Andrey Melnikov temnota.am at gmail.com
Thu Mar 12 12:41:57 EDT 2020


Can someone explain the logic of
generic/pending-4.14/411-mtd-partial_eraseblock_write.patch,
specifically its memory handling?

Hardware: TP-Link MR3040 v2 with 16 MB SPI NOR flash and 32 MB DDR
memory (U-Boot by pepe2k), kernel 4.14.171 (OpenWrt GCC 9.2.0).
Flash organisation:
[    0.568427] m25p80 spi0.0: found w25q128, expected m25p80
[    0.583650] m25p80 spi0.0: w25q128 (16384 Kbytes)
[    0.587270] 5 tp-link partitions found on MTD device spi0.0
[    0.592453] Creating 5 MTD partitions on "spi0.0":
[    0.597290] 0x000000000000-0x000000020000 : "u-boot"
[    0.604187] 0x000000020000-0x0000001a75b4 : "kernel"
[    0.607700] mtd: partition "kernel" doesn't end on an erase/write block boundary -- mark MTD_ERASE_PARTIAL
[    0.620231] 0x0000001a75b4-0x000000ff0000 : "rootfs"
[    0.623821] mtd: partition "rootfs" doesn't start on an erase/write block boundary -- mark MTD_ERASE_PARTIAL
[    0.634667] mtd: device 2 (rootfs) set to be root filesystem
[    0.639242] 1 squashfs-split partitions found on MTD device rootfs
[    0.645429] Creating 1 MTD partitions on "rootfs":
[    0.650141] 0x000000468a4c-0x000000e48a4c : "rootfs_data"
[    0.658939] 0x000000ff0000-0x000001000000 : "art"
[    0.664556] 0x000000020000-0x000000ff0000 : "firmware"
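
The offsets in this log can be checked by hand. The sketch below is only
an illustration: the helper names are made up, and the 64 KiB (0x10000)
erase block size is an assumption, since the log does not print it. It
also assumes that the "rootfs_data" offsets are relative to the start of
"rootfs", which is how the log presents them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when addr is a multiple of erase_size.
 * erase_size must be a non-zero power of two. */
static bool is_erase_aligned(uint64_t addr, uint32_t erase_size)
{
    assert(erase_size != 0 && (erase_size & (erase_size - 1)) == 0);
    return (addr & (uint64_t)(erase_size - 1)) == 0;
}

/* Absolute flash address of an offset given relative to a partition start. */
static uint64_t abs_addr(uint64_t part_start, uint64_t offset_in_part)
{
    return part_start + offset_in_part;
}
```

With the values from the log, is_erase_aligned(0x1a75b4, 0x10000) is
false, while abs_addr(0x1a75b4, 0x468a4c) gives 0x610000 and
abs_addr(0x1a75b4, 0xe48a4c) gives 0xff0000, both of which are aligned.
The same holds for a 0x1000 erase size.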

Symptom: the board cannot survive its first boot. The OOM killer is
invoked and the board crashes later. The jffs2 GC thread triggers this.
Booting with the KMEMLEAK detector enabled shows a huge leak inside
part_erase().

As you can see, the "rootfs" partition is not aligned, so the
MTD_ERASE_PARTIAL flag is set on it. The squashfs parser (after commit
7703e14bc4f36758ac28eea3c2abce7591ed4b8d) aligns the partition it finds
to an erase block, so "rootfs_data" is already aligned.
When jffs2 mounts rootfs for the first time, it erases the free blocks.
The typical execution flow is: part_erase() checks MTD_ERASE_PARTIAL for
"rootfs_data" and calls the parent's erase (part_erase() again, but now
with the "rootfs" MTD partition). That call checks MTD_ERASE_PARTIAL
for "rootfs", allocates a 1000h buffer, reads the needed data from the
device, and calls mtd->parent again. Now the parent calls
spi_nor_erase() to erase the block. spi_nor_erase() invokes the callback
mtd_erase_callback(), which tries to write the saved halves back from
the buffer and frees it when instr->mtd->flags has MTD_ERASE_PARTIAL
set. Control then returns to spi_nor_erase() and all the calls unwind.

- Is the code in the 411 patch correct? It simply assumes that the
parent is aligned and the slave is not. When the parent partition (here
"rootfs") is not aligned but the slave (here "rootfs_data") is, it
simply leaks memory, amounting to the size of the "rootfs_data"
partition.
I think part_erase() should check instr->mtd->flags instead of
mtd->flags.
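
The flow described above, and the suggested change, can be illustrated
with a toy user-space model. This is not the code from the 411 patch:
all toy_* names, the flag value and the 0x1000 buffer size are made up
for the sketch. It only models the decision of which flags field is
consulted when the buffer is allocated and when it is freed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_ERASE_PARTIAL 0x1u  /* stand-in for MTD_ERASE_PARTIAL */

struct toy_mtd {
    unsigned flags;
    struct toy_mtd *parent;     /* NULL for the raw flash device */
};

struct toy_erase_info {
    struct toy_mtd *mtd;        /* partition that requested the erase */
    void *erase_buf;            /* temporary buffer for partial erase */
};

/* Models the callback: frees the buffer only when the requesting
 * partition (instr->mtd) carries the flag. */
static void toy_erase_callback(struct toy_erase_info *instr)
{
    if ((instr->mtd->flags & TOY_ERASE_PARTIAL) && instr->erase_buf) {
        free(instr->erase_buf);
        instr->erase_buf = NULL;
    }
}

/* Models part_erase(): check_requester == 0 looks at mtd->flags (the
 * partition currently handling the call), check_requester != 0 looks at
 * instr->mtd->flags (the suggested change). */
static void toy_part_erase(struct toy_mtd *mtd, struct toy_erase_info *instr,
                           int check_requester)
{
    if (mtd->parent == NULL) {  /* raw flash: erase, then run callback */
        toy_erase_callback(instr);
        return;
    }
    unsigned flags = check_requester ? instr->mtd->flags : mtd->flags;
    if ((flags & TOY_ERASE_PARTIAL) && instr->erase_buf == NULL) {
        instr->erase_buf = malloc(0x1000);
        assert(instr->erase_buf != NULL);
    }
    toy_part_erase(mtd->parent, instr, check_requester);
}

/* Erase n blocks from "rootfs_data" and count buffers never freed. */
static int toy_count_leaks(int n_erases, int check_requester)
{
    struct toy_mtd flash = { 0, NULL };
    struct toy_mtd rootfs = { TOY_ERASE_PARTIAL, &flash };   /* unaligned */
    struct toy_mtd rootfs_data = { 0, &rootfs };             /* aligned */
    int leaked = 0;

    for (int i = 0; i < n_erases; i++) {
        struct toy_erase_info instr = { &rootfs_data, NULL };
        toy_part_erase(&rootfs_data, &instr, check_requester);
        if (instr.erase_buf != NULL) {  /* callback did not release it */
            leaked++;
            free(instr.erase_buf);      /* keep the model itself clean */
        }
    }
    return leaked;
}
```

In this model toy_count_leaks(10, 0) reports 10 unreleased buffers, one
per erase, and toy_count_leaks(10, 1) reports none. The model says
nothing about whether the second variant still handles a genuinely
partial erase correctly; that would need to be checked against the real
patch.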

And why do we have two TP-Link partition parsers,
one in target/linux/generic/files/drivers/mtd/mtdsplit/mtdsplit_tplink.c
and a second in target/linux/ar71xx/files/drivers/mtd/tplinkpart.c (with
a copy in target/linux/ath79/files/drivers/mtd/tplinkpart.c)?
Maybe they should be collapsed into one?
