Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add more multiplication primitives #107

Open
wants to merge 25 commits into
base: master
Choose a base branch
from

Conversation

jamierpond
Copy link
Collaborator

  • add main logic
  • add even lane mask
  • even/odd lane mask
  • clean up a little

@jamierpond jamierpond requested a review from thecppzoo January 13, 2025 02:35

using S = SWAR<4, u32>;

static_assert(S::oddLaneMask().value() == 0xF0F0'F0F0);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aware these tests not formatted nicely, just making a draft for visibility

Copy link
Owner

@thecppzoo thecppzoo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this draft.
This work surfaces several important questions:

  1. Support for non-power-of-two lane sizes
  2. What are we going to do with two's complement signs? perhaps this is not fullMultiplication but safeMultiplication
  3. We have to implement "negation" (two's complement flipping the sign)

That being said, we are in an excellent position to also implement division as multiplication by the reciprocal, which would be useful at least for compile-time divisors.

Please merge the auto declarations: this does not coerce the type, but verifies that all the declarands have the same type:

auto a = initialize_a(inputsForA);
auto b = initialize_b(inputsForB);

In that code, the types are not coerced (very good!) but a and b may be of different types.

auto
    a = initA(iA),
    b = initB(iB);

We still don't coerce the types, but if a and b have different types, it is a compilation error.
This is especially useful here in the SWAR library

Comment on lines 516 to 519
auto [l_even, l_odd] = doublePrecision(multiplicand);
auto [r_even, r_odd] = doublePrecision(multiplier);
auto res_even = multiplication_OverflowUnsafe(l_even, r_even);
auto res_odd = multiplication_OverflowUnsafe(l_odd, r_odd);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Merge these declarations into a single auto, the idea is that in that way you are verifying they are all of the same type.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also todo signed multiplication

@thecppzoo
Copy link
Owner

Perhaps we can also make "widening multiplication", that doubles the lane size. For example, in x86-64, there are the instructions to multiply two register-size values and get a result of double the number of bits, using the "DX:AX" for the result, so, for 64 bits, it would be RDX with the upper 64 bits, and RAX with the lower, in this way, the multiplication also widens. Ask Claude what is the name of this.

@jamierpond jamierpond marked this pull request as ready for review January 15, 2025 01:39
@jamierpond jamierpond changed the title Draft: fullMultiplication Add more multiplication primitives Jan 15, 2025
@jamierpond jamierpond requested a review from thecppzoo January 16, 2025 23:28
Comment on lines 478 to 479
SWAR<NB, T> result;
SWAR<NB, T> overflow;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not overflow.


template <int NB, typename T>
constexpr auto
doublingMultiplication(SWAR<NB, T> multiplicand, SWAR<NB, T> multiplier) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doubling is confusing here. doublePrecisionMultiplication is fine, multiplicationByDoublingPrecision, ...

}

template <int NB, typename T>
constexpr MultiplicationResult<NB, T>
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the explicit return type?

Comment on lines 524 to 526
template<int NB, typename T>
constexpr auto saturatingExponentiation(
SWAR<NB, T> x,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Absolutely not.
We're not removing the non-saturating exponentiation and provide only the saturating exponentiation. Don't do that.
Always the general operation is pre-requisite for the more specific.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good call

Comment on lines +477 to +478
SWAR<NB, T> lower;
SWAR<NB, T> upper;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

merge

Comment on lines 503 to 504
over_even = D{(lower.value() & UpperHalfOfLanes) >> HalfLane},
over_odd = D{(upper.value() & UpperHalfOfLanes) >> HalfLane};
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shift intra lane allows you to provide the mask.
Please use those primitives instead of deploying the pick-axe

Comment on lines 261 to 267
template <int NBits, typename T>
constexpr static auto consumeMSB(SWAR<NBits, T> s) noexcept {
using S = SWAR<NBits, T>;
auto msbCleared = s & ~S{S::MostSignificantBit};
return S{static_cast<T>(msbCleared.value() << 1)};
}

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sold on promoting this to the main header of swar.
This really seems to be an artifact of the "regressive" direction of "associative iteration", it does not cohere enough to the SWAR library itself.

auto
doublePrecisionMultiplication(SWAR<NB, T> multiplicand, SWAR<NB, T> multiplier) {
auto
icand = doublePrecision(multiplicand),
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! never thought about omitting the prefix

@thecppzoo
Copy link
Owner

I can not resist to comment about how elegant this is all looking.
The primitives of doubling/halving precision were a success.

@jamierpond jamierpond requested a review from thecppzoo February 25, 2025 08:08
@jamierpond jamierpond force-pushed the jp/overflow-multi-safe-clean branch from 9fb833d to dea354a Compare February 25, 2025 08:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants