[OpenWrt-Devel] [PATCH] kernel: tolerate using UBI/UBIFS on MLC flash (FS#1830)

Koen Vandeputte koen.vandeputte at ncentric.com
Fri Oct 26 04:50:40 EDT 2018

On 25.10.18 23:43, Christian Lamparter wrote:
> On Thursday, October 25, 2018 10:57:48 AM CEST Koen Vandeputte wrote:
>> On 23.10.18 17:43, Christian Lamparter wrote:
>>> Sorry, hit "Send" by accident
>>> On Tuesday, October 23, 2018 2:37:16 PM CEST Koen Vandeputte wrote:
>>>> On 22.10.18 19:27, Christian Lamparter wrote:
>>>>> On Monday, October 22, 2018 3:48:29 PM CEST Koen Vandeputte wrote:
>>>>>> On 20.10.18 17:46, Hauke Mehrtens wrote:
>>>>>>> On 10/18/2018 02:28 PM, Koen Vandeputte wrote:
>>>>>>>> starting from upstream commit 577b4eb23811 ("ubi: Reject MLC NAND")
>>>>>>>> it is not allowed to use UBI and UBIFS on an MLC-flavoured NAND flash chip. [1]
>>>>>>>> According to David Oberhollenzer [2]:
>>>>>>>> The real problem is that on MLC NAND, pages come in pairs.
>>>>>>>> Multiple voltage levels inside a single, physical memory cell are used to
>>>>>>>> encode more than one bit. Instead of just having pages that are twice as big,
>>>>>>>> the flash exposes them as *two different pages*. Those pages are usually not
>>>>>>>> ordered sequentially either, but according to a vendor/device specific
>>>>>>>> pairing scheme.
>>>>>>>> Within OpenWrt, devices utilizing this type of flash,
>>>>>>>> combined with ubi(fs) will be bricked when a user upgrades
>>>>>>>> from 17.01.4 to a newer version as the MLC will be refused.
>>>>>>>> As these devices are currently advertised as supported by OpenWrt,
>>>>>>>> we should at least maintain the original state during the lifecycle
>>>>>>>> of the current releases.
>>>>>>>> Support can be gracefully ended when a new release-branch is created.
>>>>>>>> Signed-off-by: Koen Vandeputte <koen.vandeputte at ncentric.com>
>>>>>>>> [1] https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v4.14.77&id=577b4eb23811dfc8e38924dc476dbc866be74253
>>>>>>>> [2] https://lore.kernel.org/patchwork/patch/920344/
>>>>>>>> ---
>>>>>>>> Mainly intended for discussion first on this approach before applying it.
>>>>>>>> Can be cherrypicked to 18.06.
>>>>>>>> Feel free to drop your (n)ack on this approach
>>>>>>> Have you checked if these are really MLC chips or if they are just
>>>>>>> getting detected wrongly?
>>>>>>> I think I saw some SPI NAND chips which a patched Linux detected as MLC
>>>>>>> but the datasheet said they are SLC chips.
>>>>>>> Hauke
>>>>>> Very good point.
>>>>>> I've requested Mikrotik this morning to provide some details about the
>>>>>> actual chips being used since the launch of that board ..
>>>>> For the RB450/G you can take a look at the User Guide on their site:
>>>>> <https://mikrotik.com/product/RB450G#fndtn-downloads>
>>>>> <https://i.mt.lv/cdn/rb_files/rb450GugA.pdf>
>>>>> On Page 3 there's a "System Board View" with a bottom view of the PCB
>>>>> and this is where the NAND chip is located. It reads:
>>>>> HY27UT084G2A
>>>>> This translates to:
>>>>> <http://natisbad.org/NAS/refs/Hynix_NAND_flash_part_number_decoding.pdf>
>>>>> HY27UT084G2A
>>>>>   ||||
>>>>>   |||^--- T = MLC + Single Die + Large Block
>>>>>   ||^---- U = 2.7V~3.6V
>>>>>   |^----- 7 = NAND FLASH
>>>>>   ^------ 2 = FLASH
>>>>> So, it is NAND MLC FLASH.
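
The decoding quoted above can be sketched as a small lookup. This is only an illustration: the field names ("product group", "classification", ...) and the function names are made up here, and the tables contain only the four codes that appear in this thread, not the full Hynix decoding sheet.

```python
# Positions are 0-based indices into the part number string.
# Meanings are copied from the decoding quoted above; the field
# names are hypothetical labels chosen for this sketch.
FIELDS = [
    (2, "product group", {"2": "FLASH"}),
    (3, "product type", {"7": "NAND FLASH"}),
    (4, "voltage", {"U": "2.7V~3.6V"}),
    (5, "classification", {"T": "MLC + Single Die + Large Block"}),
]


def decode_part_number(part):
    """Return {field name: meaning} for the four decoded positions."""
    result = {}
    for index, name, table in FIELDS:
        code = part[index] if index < len(part) else ""
        # Codes not mentioned in this thread are reported as unknown
        # instead of being guessed.
        result[name] = table.get(code, "unknown code %r" % code)
    return result


def looks_like_mlc(part):
    """True only if the classification field decodes to an MLC entry."""
    return decode_part_number(part)["classification"].startswith("MLC")


if __name__ == "__main__":
    for field, meaning in decode_part_number("HY27UT084G2A").items():
        print(field, "=", meaning)
```

Anything outside those four codes comes back as "unknown", so this cannot be used to classify other vendors' part numbers.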
>>>> Received a reply from Mikrotik tech dept.:
>>>> Hello,
>>>> Mainly there are two possible NAND chips for RB450G:
>>>> W29N04GVSIAA (see attachment);
>>>> Can you provide some device serial numbers?
>>>> Checking the datasheets from both chips mentioned above shows they are both SLC flash.
>>>> @Christian, do you have this board?  Could you provide the serial?
>>> No, luckily it isn't one of mine. But if Mikrotik wants to dig: I do have a serial number
>>> of an affected board: 1DFC018EF642.
>> Just got a reply back from Mikrotik.
>> It seems this serial is not correct?
> Ok, I found a better barcode reader. It says:
> 1DFC018FF642
> The eighth character changed from E to F. Maybe the two reporters
> (FS#1778 and FS#1830) can also provide their Serial #? But I don't
> think they know of this thread?
Thanks again, I've passed it on to the Mikrotik guy
> The question is: What would the serial # actually solve? I don't
> think MikroTik would replace the affected boards. After all they
> are "supported" by their own MikroTik RouterOS. And I don't think
> that UBI will run on MLC in the near future either [0].

I'm also not expecting any real solution to pop out here .. but:

- I still want to confront them with these findings and see what the 
response is. It doesn't take much effort from our side to shake the 
tree a bit over there.
- The guy probably needs a serial to enter in his ticket system .. so I 
want to stay polite here and provide one :-)

> Maybe there is a way out: The RB450G has a microsd slot. So the
> rootfs could be placed on it. Of course, this requires a specialized
> image as well as built-in support for CONFIG_MMC_SPI to read from
> the sd-card. And for the overlay: CONFIG_BLK_DEV_LOOP, the F2FS
> (CONFIG_F2FS_FS, ...) and EXT4 (CONFIG_EXT4_FS) as well as
> userspace tools: mkf2fs, f2fsck and e2fsprogs.
> (Along with a cmdline patch to tell the kernel that the rootfs is
> located on the mmc)
> Regards,
> Christian
> [0] Micron has released a few patches over the years that *they
> "think"* are necessary for getting UBI to work on their own MLC
> NAND Flash: You can find the patches (most of them dated from
> 2016) here:
> <https://www.micron.com/products/nand-flash/mlc-nand/mlc-nand-software>
> |1. mtd-nand-use-a-lower-value-for-badblockbits-when-working-with-MLC-NAND
> |	MLC NANDs have more bit flips than SLC. When looking for bad block marker
> |	we have a lot of false positives if we check for the whole byte. To avoid
> |	this tolerate a few (4 here) bit flips per byte.
> |
> |2. fixup-ubi-cannot-recover-master node issue
> |	For MLC NAND, paired page issue is now a commonly known issue. This patch
> |	is just for the case where the master node cannot be recovered because
> |	two pages are damaged in one single master block.
> |
> |3. UBI power loss issue for paired page Ver2.0
> |	These patches aim to solve MLC NAND paired page power loss issue by
> |	adding a bakvol (backup volume) module in UBI layer. 70 series and
> |	80 series families.
> Micron's engineers also tried to "upstream" the patches but didn't succeed.
Thanks a lot for the digging. Interesting stuff.
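
To make the first patch description quoted above a bit more concrete, here is a minimal sketch of the idea as I read it: instead of requiring the bad block marker byte to be exactly 0xFF, count how many bits are still set and only call the block bad when fewer than a threshold remain. This is not the Micron patch and not kernel code; the function names are made up for illustration, and the threshold of 4 is simply the number mentioned in the description.

```python
def count_set_bits(byte):
    """Number of 1 bits in the low 8 bits of `byte`."""
    return bin(byte & 0xFF).count("1")


def is_bad_block(marker, badblockbits=8):
    """Decide whether a bad block marker byte marks the block as bad.

    marker       -- byte read from the bad block marker position
                    (0xFF means "good" on an error-free read)
    badblockbits -- 8 means strict: any cleared bit marks the block bad.
                    A lower value tolerates bit flips: the block is bad
                    only if fewer than `badblockbits` bits are still set.
    """
    if badblockbits == 8:
        return marker != 0xFF
    return count_set_bits(marker) < badblockbits


if __name__ == "__main__":
    # One flipped bit: bad under the strict check, good with threshold 4.
    print(is_bad_block(0xFE), is_bad_block(0xFE, badblockbits=4))
    # A real marker (0x00) is bad either way.
    print(is_bad_block(0x00), is_bad_block(0x00, badblockbits=4))
```

The trade-off is visible in the threshold: the lower it is, the fewer false positives from bit flips, but the more a genuinely marked block has to be corrupted before it is mistaken for a good one.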

My opinion here would be to stick to upstream as much as possible .. and 
hope that someday it will get officially supported.
Judging by the fact that the global market is rapidly moving towards 
MLC, TLC and QLC, it will be interesting to see how upstream will cope 
with this.

Nonetheless, I still prefer to get this patch in for at least 18.06, 
marking it with a big warning on boot that stability is not guaranteed 
at all.
I don't think people's devices should be bricked by upgrading to a 
point release.
A few acks would be appreciated here.

