Add more multiplication primitives #107
jamierpond
commented
Jan 13, 2025
- add main logic
- add even lane mask
- even/odd lane mask
- clean up a little
inc/zoo/swar/associative_iteration.h
using S = SWAR<4, u32>;

static_assert(S::oddLaneMask().value() == 0xF0F0'F0F0);
I'm aware these tests are not formatted nicely; I'm just making a draft for visibility.
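For readers following along without the library at hand, the even/odd lane masks exercised by that test can be approximated with a standalone sketch. The helper name `laneMaskByParity` is hypothetical and is not the zoo API; it assumes lane 0 sits at the least significant bits and only counts complete lanes, which also gives a plausible answer for lane widths that do not divide the base type.

```cpp
#include <cstdint>

// Hypothetical helper, not the zoo API: builds the mask selecting every
// lane whose index has the given parity (0 = even lanes, 1 = odd lanes).
// Lanes are NB bits wide and lane 0 sits at the least significant bits.
template <int NB, typename T>
constexpr T laneMaskByParity(int parity) {
    constexpr int TotalBits = sizeof(T) * 8;
    static_assert(0 < NB && NB < TotalBits, "lane must be narrower than T");
    constexpr T OneLane = static_cast<T>((T{1} << NB) - 1);
    T mask = 0;
    // Only complete lanes are considered, so widths that do not divide
    // the base type (e.g. 3 bits in 32) leave the top bits clear.
    for (int lane = 0; (lane + 1) * NB <= TotalBits; ++lane) {
        if (lane % 2 == parity) {
            mask = static_cast<T>(mask | static_cast<T>(OneLane << (lane * NB)));
        }
    }
    return mask;
}

// Same values as the test in the diff, for 4-bit lanes in 32 bits.
static_assert(laneMaskByParity<4, std::uint32_t>(0) == 0x0F0F'0F0Fu, "even lanes");
static_assert(laneMaskByParity<4, std::uint32_t>(1) == 0xF0F0'F0F0u, "odd lanes");
```

The two masks never overlap, so splitting a value with them and recombining the halves is lossless for the complete lanes.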
Thanks for this draft.
This work surfaces several important questions:
- Support for non-power-of-two lane sizes
- What are we going to do with two's complement signs? Perhaps this is not fullMultiplication but safeMultiplication
- We have to implement "negation" (two's complement flipping the sign)
That being said, we are in an excellent position to also implement division as multiplication by the reciprocal, which would be useful at least for compile-time divisors.
Please merge the auto declarations: merging does not coerce the types, but it verifies that all the declarands have the same type. Compare the separate declarations:
auto a = initialize_a(inputsForA);
auto b = initialize_b(inputsForB);
In that code, the types are not coerced (very good!) but a and b may be of different types. With a single merged declaration:
auto
a = initA(iA),
b = initB(iB);
We still don't coerce the types, but if a and b have different types, it is a compilation error.
This is especially useful here in the SWAR library.
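A small self-contained sketch of the point, assuming C++17. The initializers `initA`, `initB` and `initWide` are hypothetical and exist only for this illustration.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical initializers used only for this illustration.
constexpr std::uint32_t initA(std::uint32_t x) { return x + 1; }
constexpr std::uint32_t initB(std::uint32_t x) { return x * 2; }
constexpr std::uint64_t initWide(std::uint32_t x) { return x; }

constexpr std::uint32_t sumOfBoth(std::uint32_t iA, std::uint32_t iB) {
    // One declaration, two declarators: auto must deduce the same type
    // for both, and neither initializer is converted.
    auto
        a = initA(iA),
        b = initB(iB);
    static_assert(std::is_same_v<decltype(a), decltype(b)>);
    // auto c = initA(iA), d = initWide(iB);
    // The line above is ill-formed: auto would be deduced as a 32-bit
    // type for c and as a 64-bit type for d.
    return a + b;
}

static_assert(sumOfBoth(1, 2) == 6);
```

Uncommenting the `c`, `d` line should turn the type mismatch into a compilation error instead of a silent widening.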
inc/zoo/swar/associative_iteration.h
auto [l_even, l_odd] = doublePrecision(multiplicand);
auto [r_even, r_odd] = doublePrecision(multiplier);
auto res_even = multiplication_OverflowUnsafe(l_even, r_even);
auto res_odd = multiplication_OverflowUnsafe(l_odd, r_odd);
Merge these declarations into a single auto; that way you verify that they are all of the same type.
Also TODO: signed multiplication.
Perhaps we can also make a "widening multiplication" that doubles the lane size. For example, x86-64 has instructions that multiply two register-size values and produce a result with double the number of bits, using the "DX:AX" register pair for the result: for 64 bits, RDX holds the upper 64 bits and RAX the lower. In this way, the multiplication also widens. Ask Claude what the name of this is.
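To make the idea concrete, here is a rough standalone sketch of a lane-wise widening multiplication, fixed by hand to four 8-bit lanes in a 32-bit word. It is not the code in this PR: `WideningResult`, `multiplyLanes16` and `wideningMultiplication` are hypothetical names, and the per-lane multiply is written out explicitly instead of using the library's primitives. The lower and upper halves play the role RAX and RDX play in the scalar case.

```cpp
#include <cstdint>

// Illustration only: four 8-bit lanes in a 32-bit word, lane 0 in the low byte.
struct WideningResult {
    std::uint32_t lower, upper;  // low and high byte of each 16-bit product
};

constexpr std::uint32_t EvenBytes = 0x00FF'00FFu;

// Multiplies the two 16-bit lanes independently. Each operand lane must
// hold a value below 256, so each product fits in its own 16-bit lane.
constexpr std::uint32_t multiplyLanes16(std::uint32_t l, std::uint32_t r) {
    auto
        low = (l & 0xFFFFu) * (r & 0xFFFFu),
        high = (l >> 16) * (r >> 16);
    return (high << 16) | low;
}

constexpr WideningResult wideningMultiplication(std::uint32_t a, std::uint32_t b) {
    // Spread the even and the odd lanes into 16-bit lanes (double precision).
    auto
        evenProducts = multiplyLanes16(a & EvenBytes, b & EvenBytes),
        oddProducts = multiplyLanes16((a >> 8) & EvenBytes, (b >> 8) & EvenBytes);
    // Regroup: low bytes of every product in one word, high bytes in the other.
    return {
        (evenProducts & EvenBytes) | ((oddProducts & EvenBytes) << 8),
        ((evenProducts >> 8) & EvenBytes) | (oddProducts & ~EvenBytes)
    };
}

// Lanes (high to low): 0xFF*0xFF = 0xFE01, 0x10*0x10 = 0x0100,
// 0x02*0x04 = 0x0008, 0x03*0x05 = 0x000F.
static_assert(wideningMultiplication(0xFF10'0203u, 0xFF10'0405u).lower == 0x0100'080Fu, "");
static_assert(wideningMultiplication(0xFF10'0203u, 0xFF10'0405u).upper == 0xFE01'0000u, "");
```

Nothing is lost here: a saturating or overflow-detecting multiplication could be derived afterwards by inspecting the upper word.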
inc/zoo/swar/associative_iteration.h
SWAR<NB, T> result;
SWAR<NB, T> overflow;
This is not overflow.
inc/zoo/swar/associative_iteration.h
template <int NB, typename T>
constexpr auto
doublingMultiplication(SWAR<NB, T> multiplicand, SWAR<NB, T> multiplier) {
doubling is confusing here. doublePrecisionMultiplication is fine, multiplicationByDoublingPrecision, ...
inc/zoo/swar/associative_iteration.h
}

template <int NB, typename T>
constexpr MultiplicationResult<NB, T>
Why the explicit return type?
inc/zoo/swar/associative_iteration.h
template<int NB, typename T>
constexpr auto saturatingExponentiation(
    SWAR<NB, T> x,
Absolutely not.
We're not removing the non-saturating exponentiation and providing only the saturating one. Don't do that.
The general operation is always a prerequisite for the more specific one.
good call
SWAR<NB, T> lower;
SWAR<NB, T> upper;
merge
inc/zoo/swar/associative_iteration.h
over_even = D{(lower.value() & UpperHalfOfLanes) >> HalfLane},
over_odd = D{(upper.value() & UpperHalfOfLanes) >> HalfLane};
The intra-lane shift allows you to provide the mask.
Please use those primitives instead of deploying the pick-axe.
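As a rough illustration of what such a primitive buys, the mask-then-shift can be given a name once and reused. The function below is a hypothetical stand-in with an assumed name and signature, not the zoo API, and it is fixed to a 32-bit word for brevity.

```cpp
#include <cstdint>

// Hypothetical stand-in for an intra-lane shift primitive; the name and
// signature are assumptions, not the zoo API. `keep` selects the bits
// that must survive, so nothing crosses into a neighboring lane.
constexpr std::uint32_t shiftRightWithinLanes(
    std::uint32_t lanes, int count, std::uint32_t keep
) {
    return (lanes & keep) >> count;
}

// Two 16-bit lanes: move the upper byte of each lane down to its lower byte.
constexpr std::uint32_t UpperHalfOfLanes = 0xFF00'FF00u;
constexpr int HalfLane = 8;

static_assert(
    shiftRightWithinLanes(0xABCD'1234u, HalfLane, UpperHalfOfLanes) == 0x00AB'0012u, ""
);
```

The call site then states the intent (shift within lanes, keeping these bits) instead of repeating the raw bit manipulation for each operand.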
inc/zoo/swar/SWAR.h
template <int NBits, typename T>
constexpr static auto consumeMSB(SWAR<NBits, T> s) noexcept {
    using S = SWAR<NBits, T>;
    auto msbCleared = s & ~S{S::MostSignificantBit};
    return S{static_cast<T>(msbCleared.value() << 1)};
}
I am not sold on promoting this to the main header of SWAR.
This really seems to be an artifact of the "regressive" direction of "associative iteration"; it does not cohere enough with the SWAR library itself.
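For context on what the helper under discussion computes, here is a hand-specialized sketch for 8-bit lanes in a 32-bit word. The name `consumeMSB8` is hypothetical, and the sketch assumes that `S::MostSignificantBit` in the diff denotes the top bit of every lane.

```cpp
#include <cstdint>

// Same idea as the consumeMSB in the diff, specialized by hand to 8-bit
// lanes in a 32-bit word so that it compiles without the library.
constexpr std::uint32_t consumeMSB8(std::uint32_t lanes) {
    constexpr std::uint32_t MostSignificantBits = 0x8080'8080u;
    // Clearing each lane's top bit first keeps the shift from spilling
    // that bit into the lane above.
    return (lanes & ~MostSignificantBits) << 1;
}

// 0x81 -> 0x02, 0xFF -> 0xFE, 0x01 -> 0x02, 0x02 -> 0x04
static_assert(consumeMSB8(0x81FF'0102u) == 0x02FE'0204u, "");
```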
auto
doublePrecisionMultiplication(SWAR<NB, T> multiplicand, SWAR<NB, T> multiplier) {
    auto
        icand = doublePrecision(multiplicand),
Nice! I never thought about omitting the prefix.
I cannot resist commenting on how elegant this is all looking.
9fb833d to dea354a