[OpenWrt-Devel] Regression in handling power cuts since 3a1e819b4e80 ("ovl: store file handle of lower inode on copy up")

Rafał Miłecki zajec5 at gmail.com
Fri Oct 19 08:31:29 EDT 2018


Since OpenWrt switched from kernel 4.9 to 4.14, users have been reporting
seemingly random file system corruption. OpenWrt uses overlay(fs) with
squashfs as lowerdir and ubifs as upperdir. Russell managed to isolate
and describe a test case that reproduces the corruption by doing a power
cut after the first boot.

Interestingly, it cannot be reproduced on all devices (NAND dependent?
arch dependent?!). I couldn't reproduce the problem on any of my
Broadcom devices (ARM=y ARCH_BCM_5301X=y), so I had to buy a Ubiquiti
EdgeRouter X (ER-X) (MIPS=y RALINK=y). I reproduced it there and
bisected it down to the commit 3a1e819b4e80 ("ovl: store file handle of
lower inode on copy up").

FWIW I was told it also affects:
Asus RT-AC58U (ARCH_IPQ40XX=y)
RB493G, DIR-860L (ATH79=y)

Steps to reproduce the problem:
1) Flash firmware
2) Boot (for the first time)
3) Let the init script copy config files from lowerdir to the upperdir
4) Wait for boot to finish
5) Verify content of some unmodified config on overlay, using either:
hexdump -C /etc/config/dropbear
hexdump -C /overlay/upper/etc/config/dropbear
6) Power cut & boot again
7) Check the content of the same file
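
Because the config file in step 5 is unmodified, steps 5 and 7 can also be
checked by comparing the file seen through the overlay with its pristine
copy in the lowerdir. This is only a minimal sketch: it assumes the squashfs
lowerdir is mounted at /rom (the usual OpenWrt layout, not stated above) and
that cmp is available on the device. The function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: compare a file as seen through the overlay with the
# same file in the read-only lower layer.
#   $1 = path on the overlay (e.g. /etc/config/dropbear)
#   $2 = path of the pristine copy (assumed to live under /rom)
matches_lower() {
    # cmp -s prints nothing and returns 0 only if both files are byte-identical.
    cmp -s "$1" "$2"
}

# Example use after the reboot in step 6 (paths are assumptions):
#   matches_lower /etc/config/dropbear /rom/etc/config/dropbear \
#       || echo "dropbear config differs from lowerdir copy"
```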

With the above regressing commit applied, the later check shows that the
file size looks correct but the file is filled with 00 bytes only.
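
The all-zero symptom can also be detected without reading the hexdump by
eye. The following is a rough sketch, not part of the original test case:
it counts the bytes that remain after stripping NULs, using only tr and wc.
The function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the file is non-empty and every byte is 0x00.
is_all_zero() {
    # Arithmetic expansion strips any padding that wc may print.
    total=$(( $(wc -c < "$1") ))
    # Delete all NUL bytes and count what is left.
    nonzero=$(( $(tr -d '\000' < "$1" | wc -c) ))
    [ "$total" -gt 0 ] && [ "$nonzero" -eq 0 ]
}

# Example use in step 7 (path taken from step 5):
#   is_all_zero /overlay/upper/etc/config/dropbear && echo "file is zero-filled"
```

An empty file is deliberately not reported, since the symptom described
here is a file of the right size with zeroed content.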

Can I ask you to check whether there is something wrong with the above
ovl commit? Or does it expose some problem in ubifs? Or maybe in UBI as
a whole?

FWIW, testing the above commit (and the one before it) always results in
a single error in the kernel log:
[   14.250184] UBIFS error (ubi0:1 pid 637): ubifs_add_orphan: orphaned twice

That UBIFS error doesn't occur with 4.12.14. Unfortunately it's
impossible to cleanly revert 3a1e819b4e80 from the top of 4.12.14.

