printf: %a output is different from coreutils #7364

Open
drinkcat opened this issue Feb 26, 2025 · 4 comments
Comments

@drinkcat
Contributor

drinkcat commented Feb 26, 2025

%a should output "Hexadecimal floating point, lowercase"

After fixing #7362, we still see some issues.

It seems like GNU coreutils prefers "shifting" the output so that there is a single hex digit between 0x1 and 0xf before the decimal point, while uutils always picks 0x1 and pads the output with trailing zeros.

```
$ cargo run printf "%a %a\n" 15.125 16.125
0x1.e400000000000p+3 0x1.0200000000000p+4
$ printf "%a %a\n" 15.125 16.125
0xf.2p+0 0x8.1p+1
```

The value is technically correct though:

```
0x1.e400000000000p+3 = (1+14/16+4/256)*2**3 = 15.125
0x1.0200000000000p+4 = (1+2/256)*2**4   = 16.125
```

(Note: be careful to add env before printf, as some shell implementations provide a built-in printf...)
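
For illustration (this is not the uutils implementation), here is a minimal Rust sketch that reproduces both outputs above from an f64's bit pattern: the leading-1 normalization uutils currently emits, and the glibc-on-x86 style where the leading hex digit carries 4 significant bits, as if the double had been widened to the 80-bit x87 format. The function names are made up, and zero, negative and subnormal inputs are ignored for brevity.

```rust
/// Decompose a normal, positive f64 into its 52 fraction bits and unbiased exponent.
fn hex_parts(x: f64) -> (u64, i64) {
    let bits = x.to_bits();
    let exp = ((bits >> 52) & 0x7ff) as i64 - 1023; // unbiased exponent
    let frac = bits & ((1u64 << 52) - 1);           // 52 fraction bits
    (frac, exp)
}

/// "0x1.…" style (what uutils currently emits): leading digit is always 1,
/// fraction padded to the full 13 nibbles of an f64.
fn hex_float_one(x: f64) -> String {
    let (frac, exp) = hex_parts(x);
    format!("0x1.{frac:013x}p{exp:+}")
}

/// glibc-on-x86 style: treat the value as an 80-bit long double with an
/// explicit integer bit, so the leading hex digit carries 4 significant bits.
fn hex_float_x86(x: f64) -> String {
    let (frac, exp) = hex_parts(x);
    let sig64 = ((1u64 << 52) | frac) << 11;  // 64-bit significand
    let lead = sig64 >> 60;                   // top 4 bits -> leading digit
    let rest = trim_zeros(format!("{:015x}", sig64 & ((1u64 << 60) - 1)));
    let e = exp - 3;                          // exponent shifted by the 3 moved bits
    format!("{lead:#x}.{rest}p{e:+}")
}

/// Drop trailing zero nibbles, keeping at least one digit.
fn trim_zeros(mut s: String) -> String {
    while s.len() > 1 && s.ends_with('0') {
        s.pop();
    }
    s
}

fn main() {
    for x in [15.125_f64, 16.125] {
        // prints: 0x1.e400000000000p+3 vs 0xf.2p+0
        //         0x1.0200000000000p+4 vs 0x8.1p+1
        println!("{} vs {}", hex_float_one(x), hex_float_x86(x));
    }
}
```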

Also, the behaviour is different across platforms. Running `LANG=C env printf '%a %.6a\n' 0.12544 0.12544` in various Docker images (gist):

| arch | %a | %.6a |
| --- | --- | --- |
| linux-386 / linux-amd64 | 0x8.07357e670e2c12bp-6 | 0x8.07357ep-6 |
| linux-arm-v5 / linux-arm-v7 | 0x1.00e6afcce1c58p-3 | 0x1.00e6b0p-3 |
| linux-arm64-v8 / linux-mips64le / linux-ppc64le / linux-s390x | 0x1.00e6afcce1c58255b035bd512ec7p-3 | 0x1.00e6b0p-3 |

According to https://en.cppreference.com/w/c/io/fprintf: "The default precision is sufficient for exact representation of the value."

On x86, at most 16 nibbles = 64 bits are printed, including the integer part. That corresponds to the internal x86 80-bit floating point format, the long double type. printf shifts 3 of the fraction bits into the integer part (before the .), so that the whole 64-bit significand fits neatly in 16 nibbles when printed. Interestingly, this behaviour is preserved even when a precision is specified (e.g. %.6a).

On arm64 (and a bunch of other archs), 28 nibbles = 112 bits are printed after the decimal point. That corresponds to the quad-precision 128-bit float, which is also the long double type there.

On arm32, 13 nibbles = 52 bits are printed after the decimal point. That corresponds to the double-precision 64-bit float, which is also the long double type there.
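
Put differently, the digit counts above follow directly from each platform's long double significand width. A quick sketch of that arithmetic (the layout mapping is my assumption based on the outputs above, not uutils code):

```rust
// Assumed long double layouts behind the outputs above:
//   arm32:      IEEE binary64  -> 52 fraction bits
//   x86/x86-64: x87 extended   -> 64-bit significand, 4 bits of which go into
//               the leading digit, leaving 60 fraction bits after the point
//   arm64 etc.: IEEE binary128 -> 112 fraction bits
fn fraction_nibbles(fraction_bits: u32) -> u32 {
    (fraction_bits + 3) / 4 // round up to whole hex digits
}

fn main() {
    assert_eq!(fraction_nibbles(52), 13);  // 0x1.00e6afcce1c58p-3
    assert_eq!(fraction_nibbles(60), 15);  // 0x8.07357e670e2c12bp-6
    assert_eq!(fraction_nibbles(112), 28); // 0x1.00e6afcce1c58255b035bd512ec7p-3
}
```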

@tertsdiepraam
Member

Yeah, this was one of the shortcuts I took while implementing this. I had to do a big refactor and simplified it to always use 1 as the first digit.

> According to https://en.cppreference.com/w/c/io/fprintf: "The default precision is sufficient for exact representation of the value."

While that is a pretty good reference, note that coreutils has a custom implementation that differs slightly in a few places. I might be misremembering (it's been a while), but I think that some C implementations also do the thing where they always use 1 as the first digit.

Also, if I recall correctly, GNU makes a distinction between the default precision and an explicitly specified precision of the same length, which explains the difference in precision that you're seeing.

@drinkcat
Contributor Author

drinkcat commented Feb 28, 2025

I understand one goal of this project is to exactly match the output of GNU coreutils for compatibility? Is that correct?

(for context... I actually bumped into this in an attempt to debug further what's going on in #5759...)

So, there are at least 2 issues here with %a...

First, GNU coreutils appears to always pack 4 bits into the first hex digit (0x8-0xf) on x86(-64), but only 1 bit (0x1) on arm platforms, no matter the specified precision. (Yes, the printf built into bash, when bash runs as sh, only packs 1 bit on x86-64 as well.)

Second, GNU coreutils appears to use long double on all platforms, which is either a 64-bit, 80-bit, or 128-bit float (on arm32, x86(-64), and arm64 respectively). But uutils uses f64 for printf's %a. So we're too low in precision (on anything but arm32). I wonder if we should switch to BigDecimal here too (like we do in seq).

Generally, do we need to detect the architecture (I assume uutils wants to target at least x86-64 and arm64?), adjust the number of bits to pack into the first hex digit accordingly, and then trim the precision from BigDecimal to whatever long double would be on the given platform?
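
For what it's worth, here is a purely hypothetical sketch of what such a per-architecture parametrization could look like (none of these names exist in uutils today); the arbitrary-precision value would then be rounded to `significand_bits` bits before formatting:

```rust
/// Hypothetical description of the target `long double`-like format.
struct HexFloatTarget {
    /// Significant bits carried by the leading hex digit (1 on arm, 4 on x86).
    leading_bits: u32,
    /// Total significand bits to keep before formatting (53, 64 or 113).
    significand_bits: u32,
}

/// Pick the format GNU coreutils would presumably use on the same target
/// (assumption based on the outputs observed above).
fn hex_float_target() -> HexFloatTarget {
    if cfg!(any(target_arch = "x86", target_arch = "x86_64")) {
        HexFloatTarget { leading_bits: 4, significand_bits: 64 }   // x87 extended
    } else if cfg!(target_arch = "aarch64") {
        HexFloatTarget { leading_bits: 1, significand_bits: 113 }  // IEEE binary128
    } else {
        HexFloatTarget { leading_bits: 1, significand_bits: 53 }   // IEEE binary64
    }
}

fn main() {
    let t = hex_float_target();
    println!(
        "pack {} bit(s) into the leading digit, keep {} significand bits",
        t.leading_bits, t.significand_bits
    );
}
```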

@tertsdiepraam
Member

> I understand one goal of this project is to exactly match the output of GNU coreutils for compatibility? Is that correct?

Ah yes, don't take my previous comment as discouraging you, it was meant as quite the opposite!

@tertsdiepraam
Member

A small note though: I think you might run into difficulties with long double and its platform-specific nature. You could also argue that uutils emulates the behaviour of one specific architecture, which in some sense actually makes it more portable. I think uutils currently makes very few modifications that take architecture differences this seriously, so that part of uutils' compatibility is essentially undefined right now.
