-
Notifications
You must be signed in to change notification settings - Fork 185
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Develop Sign, Unsign, Nan & Inf for FixedDecimal #5065
Comments
pub struct Signed<T> {
pub sign: Sign,
pub value: T,
}
pub enum WithInfinity<T> {
Infinity,
Finite(T),
}
pub enum WithNaN<T> {
NaN,
N(T),
}
/// Maybe:
pub struct WithCompactExponent<T> {
pub exponent: u8,
pub significand: T,
}
pub struct WithScientificExponent<T> {
pub exponent: i16,
pub significand: T,
}
pub type SignedFixedDecimal = Signed<FixedDecimal>;
pub type FixedDecimalOrInfinity = WithInfinity<FixedDecimal>;
pub type SignedFixedDecimalOrInfinity = Signed<FixedDecimalOrInfinity>;
Signed { sign: Neg, value: WithInfinity::Infinity } // \neg\inf
pub type SignedFixedDecimalOrInfinityOrNan = WithNaN<SignedFixedDecimalOrInfinity>; Over FFI we would have to
|
Probably a duplicate of #862? Though, this has a more fleshed out design, so I'd say it superseeds it in a way. |
Considering the design above, what would be the story around the rounding functions if Also, would we add rounding methods to |
@younies Would be good to get this into 2.0, but we won't block on it. Please prioritize if you need it |
2024-10-29 discussion: // Option 1
pub struct FixedInteger(SignedFixedDecimal);
// this is just:
pub struct FixedInteger(Signed<UnsignedFixedDecimal>);
// so we should probably have the `Signed` on the outside? pub struct Signed<T> {
decimal: T,
sign: Sign,
}
// Option 1:
pub struct FixedDecimal {
integer: Integer,
upper_magnitude: i16,
lower_magnitude: i16,
}
pub struct Integer {
digits: SmallVec<[u8; 8]>,
// keep this an i16 with the current invariants (largest significant digit)
magnitude: i16,
}
// Option 2:
pub struct FixedDecimal {
integer: Integer,
// number of digits to shift down
shift: u16,
}
pub struct Integer {
digits: SmallVec<[u8; 8]>,
// change the definition of `magnitude` to be u16 of the lowest significant digit
magnitude: u16,
upper_magnitude: u16,
} Example 1123,000,000,000 Current i16 model (magnitude is most significant digit): Proposed u16 model (magnitude is least significant digit): Example 20012.3400 Current i16 model: Proposed u16 model: Game PlanCurrent PR: |
Rounding modes: UnsignedRoundingMode {
Expand,
Trunc,
HalfExpand,
HalfTrunc,
HalfEven
}
SignedRoundingMode {
Unsigned(UnsignedRoundingMode),
Ceil,
Floor,
HalfCeil,
HalfFloor
} |
Another composition idea: make the base be called Then we can eventually end up with: pub struct Natural {
digits: SmallVec<[u8; 8]>,
// magnitude of the lowest significant digit
magnitude: u16,
// number of leading zeros (TODO: u8 or u16?)
leading_zeros: u8,
}
pub struct Decimal {
natural: Natural,
// rightward shift of the integer
shift: u16,
} Just to note, the equations I think end up being:
|
I think it's fine to use the same rounding mode enum for both signed and unsigned modes, and the reason is because applying floor or ceil on an unsigned number doesn't do anything unexpected or wrong, it's just a bit superfluous. |
Three reasons why I think we should split the rounding mode enum:
|
That's just incorrect. Unsigned decimals are a subset of real numbers. |
What I meant was, the domain of real numbers is approximated by a signed decimal, but an unsigned decimal approximates only half of the domain. |
Yes, but my original argument is that any function defined in ℝ is also defined in ℝ⁺.
Implementation details shouldn't really matter at all. If we cared about internal readability, we wouldn't have used the
This is fair, but I think the same footgun can occur in so many different ways (e.g. round first then call |
For now, we have implemented two enums: |
I can see your perspective on that point
Whether or not a function exists on a certain type is the easiest, most effective, and cleanest way to nudge developers to do the right thing. I don't see how missing a function makes the API easier or harder to use. If you try calling
I wasn't making an argument about internal readability; I was making an argument about being surprising that |
Yes, I think we shouldn't have a |
One more thing: ECMA-402 actually defines an Unsigned Rounding Mode enumeration https://tc39.es/ecma402/#sec-getunsignedroundingmode It uses different names:
So one option is we could have |
What exactly are the use cases for NaN and Infinity? I've long been of the belief that FixedDecimal was best just leaving them out and asking user code to handle these cases. Unfortunately this makes code more complicated and even experienced programmers don't really know what the right thing is to do, so all previous i18n libraries to my knowledge have handled NaN and Infinity internally. What I'm sort of getting at is maybe we can design a thinner API with the primary goal of developer economics but otherwise not doing anything else special with these values. |
If that's the idea, I would support having a separate formatter (or even just a databag to fetch the localized versions of those) for only the special cases of |
I agree with this. I think we could have separate formatters, but currently CLDR data for Infinity is just always ∞, and CLDR data for NaN is mostly "NaN", with some manual translations which many of us feel is actually counterproductive. We already were discussing normalizing to the string "NaN". I think it seems fine for implementations to wrap around FixedDecimalFormatter and hardcode ∞ and NaN. If we want we can provide an end-to-end |
So it seems like ICU4X-TC's general opinion around this, as well as that of CLDR's design-wg, is that we should hardcode "NaN" and ∞, and potentially provide specialized formatters for just those strings in the future. Given that, I propose this:
One small missing thing is Thoughts? |
WG discussion:
// This type is basically what ECMA-402 needs
let with_nan_with_inf = WithNan::<Signed<WithInfinity<UnsignedFixedDecimal>>::try_from_f64(x);
// These are nicer types that don't propagate NaNs
let with_inf: Signed<WithInfinity<UnsignedFixedDecimal> = with_nan_with_inf.non_nan()?; // NanError
let fd: Signed<UnsignedFixedDecimal> = with_inf.finite()?; // LimitError
// Unclear what the use case for this is
let with_nan: WithNan<Signed<UnsignedFixedDecimal> = with_nan_with_inf.finite()?; // LimitError
Possible type aliases:
@robertbastian Counter proposal:
|
Was not supporting NaN discussed? I think supporting Infinity makes some sense: it's hard to format -Inf without it, and Infinity is somewhat likely to come up in some cases. But NaN really just is an error and we should not be handling errors via formatting. And if we support Inf, userspace formatting for NaN (for 402) becomes relatively straightforward. |
This also works for me. I think what we have now is acceptable. The default format function shouldn't handle Inf IMO. |
I mostly agree, but I think the names need to be bikeshed more. |
Marking as priority, let's discuss this to the point that it stops being a 2.0 blocker. Since I can't make the call tomorrow, and also potentially not the WG meeting due to UTC, I'll state my principles here for the naming:
This gives me the types |
@Manishearth did you have an opinion on my proposal to keep the crate named fixed_decimal but remove "Fixed" from type names |
We're not including integers in this discussion, but we should. I think there are only a few compositions that are actually useful:
I did not include "infinity but not NaN" in the list above because I couldn't immediately identify a real use case. If you are dealing with non-finite values, you usually hit both Infinity (1/0) and NaN (0/0). The Rust convention and the convention in most other programming languages is that the signed thing is the default and unsigned has an adjective, as much as I dislike negated adjectives. So, the only real question is whether the finite or non-finite decimal should be the "default" one. The position I've held for years has been that This would imply naming such as the following:
As I did last week, I dropped "Fixed" from my names because the crate says it. However, if we think people are going to want to regularly import this into a project that has a |
Good question. I think calling it Decimal is fine, I think the suggested naming is fine, ignoring specific choices for the Infinite cases (which I think we should just bikeshed separately). |
Discussion: core decimal types
Proposal:
LGTM: @sffc @younies @robertbastian @echeran Issue to handle this part: |
Discussion on how to handle infinity formatting:
Architectural proposal (not bikeshedding): For 2.0:
LGTM: @sffc @Manishearth @younies @robertbastian @echeran Rough design for units post 2.0 (not binding):
Discussion for FixedDecimal vs UnsignedFixedDecimal vs SignedFixedDecimal
Proposal:
|
Rob to finish the renames |
Now, the Shall we rename the files too ? |
We can clean up file names later, that's not 2.0 blocking. I agree it would be nice to do that. |
# Description: Renames the UnsignedFixedDecimal type to UnsignedDecimal across multiple files in the fixed_decimal module. This includes updates to: - Rust source files - Documentation - Test files - Dart bindings - Diplomat coverage allowlist The rename maintains the existing functionality while providing a more concise type name. Related Issues: #5065, #6144 <!-- Thank you for your pull request to ICU4X! Reminder: try to use [Conventional Comments](https://conventionalcomments.org/) to make comments clearer. Please see https://github.com/unicode-org/icu4x/blob/main/CONTRIBUTING.md for general information on contributing to ICU4X. -->
We've landed both renames. This is done. |
In many application for units (especially, mixed units), There are a need to represent the numbers as Inf, Neg, Pos ... etc.
Examples:
Therefore, we need a way to represent Inf, Neg, Nan ... etc in FixedDecimal
Options for expressing mixed units:
Signed<T>
3 feet, -11 inches
The text was updated successfully, but these errors were encountered: