[PATCH V3 1/3] base-files: sysupgrade: add tar.sh with helpers for building archives
Luiz Angelo Daros de Luca
luizluca at gmail.com
Thu Feb 29 20:54:18 PST 2024
Hello Rafał,
Sorry for the late review.
I know this will be used for a limited context but it can be
constructed in such a way it could be used for more cases without
increasing its complexity.
> From: Jo-Philipp Wich <jo at mein.io>
>
> This allows building uncompressed tar archives from shell scripts (and
> compressing them later if needed)
>
> Signed-off-by: Rafał Miłecki <rafal at milecki.pl>
> ---
> V2: Simplify dd in __tar_print_padding (I still think helper is useful)
> Hardcode 0/0/ root/root for now as most likely it'll be enough
> Simplify name validation (leasing slash)
> Reorder some variables
> V3: Fix dd in __tar_print_padding
> Rename functions
> Drop unused functions
> Document usage
>
> package/base-files/files/lib/upgrade/tar.sh | 71 +++++++++++++++++++++
> 1 file changed, 71 insertions(+)
> create mode 100644 package/base-files/files/lib/upgrade/tar.sh
>
> diff --git a/package/base-files/files/lib/upgrade/tar.sh b/package/base-files/files/lib/upgrade/tar.sh
> new file mode 100644
> index 0000000000..a9d1d559e6
> --- /dev/null
> +++ b/package/base-files/files/lib/upgrade/tar.sh
> @@ -0,0 +1,71 @@
> +# SPDX-License-Identifier: GPL-2.0-or-later OR MIT
> +
> +# Example usage:
> +#
> +# {
> +# tar_print_member "date.txt" "It's $(date +"%Y")"
> +# tar_print_trailer
> +# } > test.tar
> +
> +__tar_print_padding() {
> + dd if=/dev/zero bs=1 count=$1 2>/dev/null
> +}
> +
> +tar_print_member() {
> + local name="$1"
> + local content="$2"
> + local mtime="${3:-$(date +%s)}"
> + local mode=644
> + local uid=0
> + local gid=0
> + local size=${#content}
> + local type=0
> + local link=""
> + local username="root"
> + local groupname="root"
> +
> + # 100 byte of padding bytes, using 0x01 since the shell does not tolerate null bytes in strings
> + local pad=$'\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1'
\1 is still a valid filename character. Shell cannot save a string
with \0 but printf can handle it nicely (as well as pipes). If you
build the header using a function at the moment you use it, you could
avoid the \1 hack.
print_header() {
local filename=$1
local mode=$2
...
local checksum=$7
...
printf '%s' "$filename"; printf '\x00%.0s' $(seq $((100 - ${#filename})))
...
}
print_header "$filename" "$mode" \
"$uid" "$gid" .. | hexdump -ve '1/1 "%u
If you don't like the printf trick to repeat a char, you could use
__tar_print_padding (or use printf for both).
You could call it twice, first without a checksum to calculate it and
then a new one to generate the final output.
> +
> + # validate name (strip leading slash if present)
> + name=${name#/}
> +
> + # truncate string header values to their maximum length
> + name=${name:0:100}
It feels strange to simply truncate the name. If you need to truncate
it, you are implying it might be larger. In that case, it should fail
if it cannot fit a specific filename. Otherwise, you could end up
overwriting two filenames with the same first 100 bytes. And BTW,
shouldn't ustar support 255 or 256 filenames?
> + link=${link:0:100}
> + username=${username:0:32}
> + groupname=${groupname:0:32}
> +
> + # construct header part before checksum field
> + local header1="${name}${pad:0:$((100 - ${#name}))}"
> + header1="${header1}$(printf '%07d\1' $mode)"
> + header1="${header1}$(printf '%07o\1' $uid)"
> + header1="${header1}$(printf '%07o\1' $gid)"
> + header1="${header1}$(printf '%011o\1' $size)"
> + header1="${header1}$(printf '%011o\1' $mtime)"
> +
> + # construct header part after checksum field
> + local header2="$(printf '%d' $type)"
> + header2="${header2}${link}${pad:0:$((100 - ${#link}))}"
> + header2="${header2}ustar ${pad:0:1}"
> + header2="${header2}${username}${pad:0:$((32 - ${#username}))}"
> + header2="${header2}${groupname}${pad:0:$((32 - ${#groupname}))}"
It seems to be a strange ustar format. Shouldn't it have an ustar
version before the username? And Device major/minor after groupname,
followed by more 150 filename prefix? It might still work if the
reader interprets it as a v7.
> + # calculate checksum over header fields
> + local checksum=0
> + for byte in $(printf '%s%8s%s' "$header1" "" "$header2" | tr '\1' '\0' | hexdump -ve '1/1 "%u "'); do
> + checksum=$((checksum + byte))
> + done
> +
Maybe use a single line like:
checksum=$(print header... | hexdump -ve '1/1 "%u\n"' | awk '{s+=$1}
END {print s}')
?
> + # print member header, padded to 512 byte
> + printf '%s%06o\0 %s' "$header1" $checksum "$header2" | tr '\1' '\0'
> + __tar_print_padding 183
> +
> + # print content data, padded to multiple of 512 byte
> + printf "%s" "$content"
> + __tar_print_padding $((512 - (size % 512)))
> +}
> +
> +tar_print_trailer() {
> + __tar_print_padding 1024
> +}
> --
> 2.35.3
>
More information about the openwrt-devel
mailing list