[FS#3241] temporary flash failure on ipq40xx device (wpj428)

OpenWrt Bugs openwrt-bugs at lists.openwrt.org
Tue Jul 21 10:22:09 EDT 2020


A new Flyspray task has been opened.  Details are below. 

User who did this - Leon George (yogo1212) 

Attached to Project - OpenWrt/LEDE Project
Summary - temporary flash failure on ipq40xx device (wpj428)
Task Type - Bug Report
Category - Base system
Status - Unconfirmed
Assigned To - 
Operating System - All
Severity - Medium
Priority - Very Low
Reported Version - Trunk
Due in Version - Undecided
Due Date - Undecided
Details - Hello :-)

My employer has noticed a small fraction of devices failing with a trunk-based software image (OpenWrt SNAPSHOT, r13134+521-f57230c4e6) on the WPJ428 platform (ipq40xx).

Messages like these appear in syslog:

Tue Jul 21 13:16:39 2020 daemon.err node-comm[27021]: Error loading shared library libevent_openssl-2.1.so.7: I/O error (needed by /usr/bin/node-comm-mqtt)
Tue Jul 21 13:16:39 2020 kern.err kernel: [523126.625066] SQUASHFS error: Unable to read fragment cache entry [3c56aa]
Tue Jul 21 13:16:39 2020 kern.err kernel: [523126.625114] SQUASHFS error: Unable to read page, block 3c56aa, size 1522c

After reboot, the problem goes away (probably because it's very unlikely to appear twice in a row).

The problem occurs with various flash chip revisions, so we believe it is a driver issue.

On an (un-)lucky day, the error occurred on my device and I created two dumps of /dev/mtd8ro (the whole 32M of flash), one while the error was occurring and another after the reboot.
In the error state, 1290 consecutive bytes are read as FF (reproducibly, across multiple dd runs).
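
One way to check that a read is stable across repeated passes is to hash the device several times and compare the digests. The following is only a minimal sketch in Python, not the procedure used for this report (which used dd). The helper names `hash_reads` and `reads_are_stable` are made up for illustration, and it assumes Python is available wherever the device or a dump of it can be read.

```python
import hashlib


def hash_reads(path, repeats=3, chunk_size=1 << 20):
    """Read `path` from start to end `repeats` times.

    Returns one SHA-256 hex digest per pass (hypothetical helper).
    """
    digests = []
    for _ in range(repeats):
        h = hashlib.sha256()
        with open(path, "rb") as f:
            # Read in fixed-size chunks until read() returns b"" (end of file).
            for chunk in iter(lambda: f.read(chunk_size), b""):
                h.update(chunk)
        digests.append(h.hexdigest())
    return digests


def reads_are_stable(digests):
    """True if every pass produced the same digest."""
    return len(set(digests)) == 1
```

For example, `reads_are_stable(hash_reads("/dev/mtd8ro"))` returning True in the error state would match the observation that the FF bytes show up on every read.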

The diff from before and after the reboot looks like this (`cmp -l` output converted to hex, xx for redacted bytes):

01BB36BD FF xx
01BB36BE FF xx
01BB36BF FF xx
01BB36C0 FF xx
01BB3BC7 FF xx
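
A per-byte listing like the one above can be condensed into runs of consecutive differing bytes. The following is a minimal sketch in Python, not the tooling used for this report. The names `diff_runs` and `run_is_all_ff` are hypothetical. Offsets here are 0-based, whereas `cmp -l` numbers bytes from 1, so the values may be off by one relative to the listing.

```python
from itertools import groupby


def diff_runs(bad: bytes, good: bytes):
    """Return (start_offset, length) for each run of consecutive differing bytes.

    Offsets are 0-based. Both dumps must have the same length.
    """
    if len(bad) != len(good):
        raise ValueError("dumps differ in length")
    runs = []
    offset = 0
    # Group byte pairs by whether they differ; each group is one run.
    for differs, group in groupby(zip(bad, good), key=lambda p: p[0] != p[1]):
        length = sum(1 for _ in group)
        if differs:
            runs.append((offset, length))
        offset += length
    return runs


def run_is_all_ff(dump: bytes, start: int, length: int) -> bool:
    """True if every byte of dump[start:start+length] is 0xFF."""
    return dump[start:start + length] == b"\xff" * length
```

With the two dumps loaded as `bytes` (for example from hypothetical files `error.bin` and `after-reboot.bin`), `diff_runs(error, after)` would list each differing region, and `run_is_all_ff(error, start, length)` would confirm whether a region was read entirely as FF. Bytes that really are FF in both dumps do not show up as differences, so a region read as FF may be split into several runs.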

The syslog excerpt above belongs to the same occurrence as the diff.
It's worth noting that the file that couldn't be read is in the ROM portion of the flash while the offset of the diff is near the end.

I've reached the limits of my knowledge. If there's anything else from the error state that would be interesting to know, let me know and I'll see what I can do.
