[PATCH] busybox: tr: enable options required by POSIX

Jordan Geoghegan jordan at geoghegan.ca
Tue Jul 14 02:06:30 EDT 2020



On 2020-07-13 22:56, Rosen Penev wrote:
> On Mon, Jul 13, 2020 at 10:44 PM Jordan Geoghegan <jordan at geoghegan.ca> wrote:
>>
>>
>> On 2020-07-13 22:17, Rosen Penev wrote:
>>> On Mon, Jul 13, 2020 at 12:14 PM Jordan Geoghegan <jordan at geoghegan.ca> wrote:
>>>>
>>>> On 2020-07-13 08:36, Petr Štetiar wrote:
>>>>> Magnus Kroken <mkroken at gmail.com> [2020-07-13 15:49:30]:
>>>>>
>>>>> Hi,
>>>>>
>>>>>> Support for character classes (e.g. [:upper:] and [:lower:]) and
>>>>>> equivalence classes (e.g. [=a=]) in the tr utility is required by POSIX.
>>>>>> This change increases package size by approx. 500 bytes.
>>>>> where does OpenWrt claim that it's fully POSIX compliant? Some deviations
>>>>> from the standards are expected in exchange for lower flash usage. Maybe it
>>>>> could be considered as `default y if !SMALL_FLASH` for devices with more flash
>>>>> space, but then we would probably get inconsistent behaviour across various
>>>>> targets and scripts wouldn't use these classes anyway.
>>>>>
>>>>> So I don't see anything in favor of this patch's inclusion.
>>>>>
>>>>> -- ynezz
>>>> Hi Petr,
>>>>
>>>> Not sure if you've had a chance to read through the earlier discussion
>>>> about this, so I will reiterate my point a bit below
>>>>
>>>> On OpenWRT 'tr' is configured to silently ignore character classes and
>>>> treat all characters literally, which is the most dangerous kind of
>>>> deviation from the norm, as it does something non-standard without
>>>> informing the user. That alone seems to strongly put this in favour of
>>>> being included. Even if it is decided to deviate from the standard and
>>>> ignore character classes, there should at the very least be an
>>>> error/warning printed.
>>> Got any example of this being problematic currently?
>> A quick grep of the source tree shows there are already things relying
>> on classes that aren't actually working correctly:
>>
>> ryzen$ rg "tr '\[:"
>> scripts/mkits.sh
>> 59:ARCH_UPPER=$(echo "$ARCH" | tr '[:lower:]' '[:upper:]')
>>
>> I also grepped the package/ports tree and found a number of issues.
>> Using double brackets "[[" is a no-no with tr, as the extra brackets
>> are treated literally. Writing '[A-Z]' is also a bug, as the brackets
>> are unnecessary and are likewise treated literally.
>>
>> ryzen$ rg "tr '\["
>> utils/lxc/patches/010-Remove-distro-check.patch
>> 43:-with_distro=`echo ${with_distro} | tr '[[:upper:]]' '[[:lower:]]'`
>>
>> sound/shairport-sync/patches/010-no-cxx.patch
>> 28:@@ -19,7 +19,6 @@ with_os=`echo ${with_os} | tr '[[:upper:]]'
>> '[[:lower:]]' `
>>
>> utils/pciutils/patches/105-fix-host.patch
>> 7:-host=`echo $HOST | sed -e
>> 's/^\([^-]*\)-\([^-]*\)-\([^-]*\)-\([^-]*\)$/\1-\3/' -e
>> 's/^\([^-]*\)-\([^-]*\)-\([^-]*\)$/\1-\2/' -e
>> 's/^\([^-]*\)-\([^-]*\)$/\1--\2/' | tr '[A-Z]' '[a-z]'`
>> 8:+host=`echo $HOST | sed -e
>> 's/^\([^-]*\)-\([^-]*\)-\([^-]*\)-\([^-]*\)$/\1-\3/' -e
>> 's/^\([^-]*\)-\([^-]*\)$/\1--\2/' | tr '[A-Z]' '[a-z]'`
>>
>> net/ser2net/files/ser2net.init
>> 28:    [ "$uc" -eq 1 ] && key=`echo "$key" | tr '[a-z]' '[A-Z]'`
>> 120:    parity=`echo "$parity" | tr '[a-z]' '[A-Z]'`
> All of those examples except for the last one are for the host.
Either way, those bugs I mentioned should be addressed, as they are
indeed errors. Also, the use of "tr 'a-z'..." is unsafe for exotic
locales, as mentioned earlier in the thread. What makes all this so
dangerous is that no error is printed: it silently and sneakily does
something other than what you would expect.
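
To make the failure mode concrete, here is a minimal sketch. The helper
names are made up for illustration and do not come from the tree. Note
that to_upper_literal only emulates a 'tr' built without class support,
by spelling out the letters of the two sets; it assumes such a 'tr'
pairs the literal characters up one by one, as described above.

```shell
#!/bin/sh
# Illustrative helpers; all names here are hypothetical.

# With class support: every lowercase letter maps to its uppercase form.
to_upper_class() {
    printf '%s\n' "$1" | tr '[:lower:]' '[:upper:]'
}

# Emulation (assumption): a tr without class support sees the literal
# characters of "[:lower:]" and "[:upper:]" and pairs them up, so
# l->u, o->p, w->p, e->e, r->r, while '[', ':' and ']' map to themselves.
to_upper_literal() {
    printf '%s\n' "$1" | tr 'lower' 'upper'
}

# Brackets around a range are ordinary set members, so -d deletes them too.
strip_lower_bracketed() {
    printf '%s\n' "$1" | tr -d '[a-z]'
}

# Without the brackets, only the range itself is deleted.
strip_lower_plain() {
    printf '%s\n' "$1" | tr -d 'a-z'
}

to_upper_class powerpc        # POWERPC
to_upper_literal powerpc      # ppperpc
strip_lower_bracketed 'x[1]'  # 1
strip_lower_plain 'x[1]'      # [1]
```

In a plain translation such as tr '[A-Z]' '[a-z]' the stray brackets
happen to map to themselves, which is why that form usually goes
unnoticed until it is combined with -d or mismatched sets.
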
>>>> The question being asked is, is saving 500 bytes worth a tremendous
>>>> deviation from the norm, and rendering a standard tool essentially
>>>> useless (with a built-in foot gun to boot!)?
>>> tr is used in the tree for more than this.
>> My point still stands.
> The issue is that it does not solve an issue that is currently present.
It does solve an issue that is currently present; what do you think
brought me here? This all started because a couple of my scripts blew up
(scripts that are highly portable and run on a multitude of systems, for
example Linux, MacOS, OpenBSD, FreeBSD, NetBSD, DragonflyBSD etc.).
Everything worked on OpenWRT, except for 'tr', which works unlike any
other 'tr' I've encountered (other than ancient 20th century Unixen).
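
Since the complaint is that the deviation is silent, a portable script
could probe 'tr' before relying on classes. This is only a sketch:
tr_has_classes is a hypothetical helper name, not something that exists
in the OpenWrt tree.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if tr honours POSIX character classes.
# A tr that treats "[:lower:]" literally leaves "a" untouched, so the
# comparison below fails.
tr_has_classes() {
    [ "$(printf 'a' | tr '[:lower:]' '[:upper:]')" = "A" ]
}

if tr_has_classes; then
    echo "tr: character classes supported"
else
    # A real script would probably exit or fall back to explicit sets here.
    echo "tr: character classes NOT supported" >&2
fi
```
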
>>>> Regards,
>>>>
>>>> Jordan
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> openwrt-devel mailing list
>>>> openwrt-devel at lists.openwrt.org
>>>> https://lists.openwrt.org/mailman/listinfo/openwrt-devel



