Improve precision for Sinh and Cosh TP routines #99763
Conversation
```csharp
private const float Single_LOGV = 0.693161f;
private const float Single_HALFV = 1.0000138f;
private const float Single_INVV2 = 0.24999309f;

private const ulong Double_MAX_VECTORIZED_VALUE = 0x41600000000000ul;
```
NB: there's no `double` implementation in the AMD library. I picked this value by deriving it from upper bounds used elsewhere and made it small enough that all our tests pass.
This is not a good value: it represents 1.9330329145781312E-307, which is so small that essentially no real-world input will be vectorized. I'd expect a more realistic value to at least match `float` and be `0x4056_5A9F_8000_0000`, although in practice I'd expect it to be quite a bit higher. If you look at the scalar `cosh` algorithm, you'll note that `double` uses `0x4086_33CE_8FB9_F87E`, which is the bitwise representation of 710.475860073944.

In general, this is a case where porting the scalar kernel is likely better overall. The biggest "blocker" is that the `double` version currently uses a lookup table for the tail adjustment, which tends not to be trivially or efficiently vectorizable; the `float` version does not, and uses more branches instead.

There are options here, just not "trivial" options, and the PR would be more involved. So for the time being, I think simply matching the "correct" arg max is the more desirable behavior.
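For anyone wanting to sanity-check those bit patterns, a quick reinterpretation (Python used purely for illustration; the values being decoded are the ones quoted above):

```python
import struct

def bits_to_double(bits: int) -> float:
    # Reinterpret a 64-bit integer pattern as an IEEE 754 binary64 value.
    return struct.unpack("<d", struct.pack("<Q", bits))[0]

print(bits_to_double(0x0041600000000000))    # the PR's current cutoff: ~1.933e-307
print(bits_to_double(0x4056_5A9F_8000_0000)) # float-equivalent cutoff: ~89.416
print(bits_to_double(0x4086_33CE_8FB9_F87E)) # scalar double cosh arg max: ~710.47586
```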
That's my bad: I was toying around with the binary representation of the upper bound to find something that passes and hadn't realized it ended up being so small! After more careful experimentation, it turns out that the maximal cut-off within which we hit our precision targets for `double` is ~16.5, much lower than for the `float` algorithm. Assuming we don't change the algorithm, I see two options here:
- Use the reduced cut-off of ~16.5, or
- Increase or remove the cut-offs and increase testing tolerances for the hyperbolic methods.
I think we really need to use the normal cutoff here, otherwise users will end up seeing major perf slowdowns for somewhat typical inputs.
The main reason for the precision issue is going to come from `LOGV`, `HALFV`, and `INVV2` being "exact" (or nearest representable). I had given a rough breakdown of where the values came from in #97874 (comment), but didn't end up having time to figure out the right amount of adjustment to give to the values for `double`.

If you look at https://github.com/amd/aocl-libm-ose/blob/master/src/optimized/cosh.c, you'll note that values over 20 get simplified, because the negative exponential in `(e^x + e^-x) / 2` is small enough that it no longer matters (we just do `exp(|x|) / 2` with the relevant sign from `x`, while values under this range use the head/tail split to account for imprecision).

Given the above, and given that we need to call `exp` regardless, we should be able to conditionalize the work to keep it accurate and efficient.
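For context, here is one plausible reading of how those constants relate, inferred from their names and not confirmed against the kernel source: with V = e^LOGV (≈ 2.0000276), HALFV = V/2 and INVV2 = 1/V², so that cosh(|x|) = HALFV · (z + INVV2/z) where z = e^(|x| − LOGV). A quick sanity check:

```python
import math

# Constants from the PR diff; relationship below is my reconstruction
# (assumption): V = e^LOGV, HALFV = V / 2, INVV2 = 1 / V**2.
LOGV = 0.693161
HALFV = 1.0000138
INVV2 = 0.24999309

def cosh_kernel(x: float) -> float:
    # HALFV * z           reduces to e^|x| / 2
    # HALFV * INVV2 / z   reduces to e^-|x| / 2
    z = math.exp(abs(x) - LOGV)
    return HALFV * (z + INVV2 / z)

for x in (0.5, 1.0, 5.0, 40.0):
    assert math.isclose(cosh_kernel(x), math.cosh(x), rel_tol=1e-6)
```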
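The simplification being described can be sketched as follows (a minimal scalar sketch, not the vectorized kernel; the 20.0 cutoff is the one cited from the AOCL source, and the real kernel's head/tail split is omitted):

```python
import math

CUTOFF = 20.0  # above this, dropping e^-|x| costs < e^-40 (~4e-18) relative error

def cosh_split(x: float) -> float:
    ax = abs(x)
    if ax > CUTOFF:
        # Negative exponential is below double precision; one exp suffices.
        return math.exp(ax) / 2.0
    # Otherwise keep both terms (the real kernel also applies a head/tail
    # split here for extra precision, omitted in this sketch).
    e = math.exp(ax)
    return (e + 1.0 / e) / 2.0

for x in (0.5, 5.0, 25.0, 700.0):
    assert math.isclose(cosh_split(x), math.cosh(x), rel_tol=1e-13)
```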
Ok, in that case I'd be inclined to close this PR.
Tagging subscribers to this area: @dotnet/area-system-numerics
It wasn't "missed"... it was removed in 0e6a327. But I don't remember why.
Maybe because it was missing a double counterpart at the time? That's where the precision issues manifest in our testing.
The AMD algorithm for cosh falls back to scalar execution for values exceeding a specified threshold. This check was missed in the TP transcription, resulting in a loss of precision for large inputs.
Contributes to #98861.
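For reference, the fallback pattern described above looks roughly like this (scalar Python sketch; the cutoff value and the `fast_cosh` body are placeholders for illustration, not the actual TensorPrimitives implementation):

```python
import math

# Placeholder cutoff; choosing the right double value is what this thread debates.
MAX_VECTORIZED_VALUE = 89.415985

def fast_cosh(x: float) -> float:
    # Stand-in for the fast vectorizable kernel.
    e = math.exp(abs(x))
    return (e + 1.0 / e) / 2.0

def cosh_block(values):
    # If any input exceeds the cutoff, fall back to the accurate scalar
    # routine for the whole block, mirroring the AOCL approach.
    if any(abs(v) > MAX_VECTORIZED_VALUE for v in values):
        return [math.cosh(v) for v in values]
    return [fast_cosh(v) for v in values]

inputs = [1.0, -3.5, 200.0]
assert all(math.isclose(a, math.cosh(b), rel_tol=1e-12)
           for a, b in zip(cosh_block(inputs), inputs))
```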