Add Julia routines for setting/getting "subnormals are zero" mode. #12172
This is a question out of curiosity rather than an issue for this PR: how is CPUID handled in a multi-CPU setup? Is it possible that different CPUs support different features? (Or is this why CPUID is serializing?)
In multi-CPU setups, some of the CPUID results can differ, because they tell you which hardware thread you are on. But for features like FZ/DAZ, they need to be the same across hardware threads for just about any time-slicing scheduler, since otherwise the end of a time slice (and rescheduling onto another hardware thread) would invalidate the result in a racy way. I don't know why the designers made CPUID serializing in its introduction (reference: Section 18.8 of Pentium manual). It's regrettable since it's often needed when generating code that exploits version-specific features, and Pentium 4 introduced the
I see. Thanks.
Thanks, Arch and Yu (...assuming your name here is in Western name order...) for digging this all back up and moving forward with a plan. While I regret not getting back to truly finish this up, I have no doubt that this is a more thorough, considered treatment than I could have given it.
// Returns non-zero if subnormals go to 0; zero otherwise.
DLLEXPORT uint32_t jl_get_zero_subnormals(int8_t isZero)
{
Not a big deal, but maybe it's better to be consistent with uint32_t and int32_t ...
Thanks for the sharp eyes. Now fixed.
…ero" mode. This mode sets the FZ/DAZ features on x86 processors that support them. See issue JuliaLang#12132 for discussion.
9cd9a15 to 591e0a9
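For readers unfamiliar with FZ/DAZ, the following is a minimal sketch of what setting and querying these bits can look like in C. It is not the code from this PR: the `sketch_*` names are hypothetical, and it assumes an x86-64 target where `float` arithmetic uses SSE and the CPU accepts the DAZ bit. FZ (bit 15 of MXCSR) flushes subnormal results to zero; DAZ (bit 6) treats subnormal inputs as zero.

```c
#include <float.h>      /* FLT_MIN */
#include <stdint.h>
#include <xmmintrin.h>  /* _mm_getcsr, _mm_setcsr (SSE intrinsics) */

/* MXCSR control bits on x86. */
#define MXCSR_FZ  0x8000u  /* flush subnormal results to zero */
#define MXCSR_DAZ 0x0040u  /* treat subnormal inputs as zero  */

/* Hypothetical helper: non-zero if both FZ and DAZ are set for this thread. */
int32_t sketch_get_zero_subnormals(void)
{
    unsigned int both = MXCSR_FZ | MXCSR_DAZ;
    return (_mm_getcsr() & both) == both;
}

/* Hypothetical helper: turn both bits on (is_zero != 0) or off.
   Assumes the CPU supports DAZ; setting an unsupported bit would fault. */
void sketch_set_zero_subnormals(int32_t is_zero)
{
    unsigned int csr = _mm_getcsr();
    if (is_zero)
        csr |= (MXCSR_FZ | MXCSR_DAZ);
    else
        csr &= ~(MXCSR_FZ | MXCSR_DAZ);
    _mm_setcsr(csr);
}

/* Halve the smallest normal float. The exact result is subnormal, so it
   is flushed to zero when FZ is set. volatile prevents constant folding. */
float sketch_half_of_min_normal(void)
{
    volatile float x = FLT_MIN;
    volatile float y = x * 0.5f;
    return y;
}
```

MXCSR is per-thread state, so a call like `sketch_set_zero_subnormals(1)` only affects the thread that makes it.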
Documentation added to PR.
The doc added looks great! Thanks.
I have no idea what subnormal numbers are, but is there any reason why we shouldn't support:
set_zero_subnormals() do
    heatflow(a,1000)
end
Apart from not having thought about the utility, or not bothering to write the code?
@ivarne See #12132 (comment) and my reply below. I think it's mainly because this is not needed that often (and anonymous functions are slow). There's no reason we couldn't add that, though.
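The scoped form asked about above amounts to "save the mode, set it, run the body, restore it". As a rough illustration of that pattern at the C level (not something this PR adds), here is a sketch with hypothetical names, assuming x86-64 with SSE and a CPU that accepts the DAZ bit:

```c
#include <xmmintrin.h>  /* _mm_getcsr, _mm_setcsr (SSE intrinsics) */

/* FZ (bit 15) and DAZ (bit 6) of the MXCSR register. */
#define MXCSR_FZ_DAZ 0x8040u

/* Hypothetical wrapper: run fn(arg) with FZ and DAZ set, then put the
   two bits back the way they were before the call. */
void sketch_with_zero_subnormals(void (*fn)(void *), void *arg)
{
    unsigned int saved = _mm_getcsr();
    _mm_setcsr(saved | MXCSR_FZ_DAZ);
    fn(arg);
    /* Restore only FZ/DAZ, keeping any exception status flags fn raised. */
    unsigned int now = _mm_getcsr();
    _mm_setcsr((now & ~MXCSR_FZ_DAZ) | (saved & MXCSR_FZ_DAZ));
}

/* Example body: records whether FZ and DAZ are both set while it runs. */
void sketch_record_mode(void *out)
{
    *(int *)out = (_mm_getcsr() & MXCSR_FZ_DAZ) == MXCSR_FZ_DAZ;
}
```

Calling `sketch_with_zero_subnormals(sketch_record_mode, &flag)` sets `flag` to 1 and leaves the caller's mode unchanged afterwards. The indirect call through a function pointer is the C analogue of the anonymous-function overhead mentioned in the reply.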
Add Julia routines for setting/getting "subnormals are zero" mode.
This PR lets users more easily elect to treat subnormal numbers as zeros, which speeds up some programs on some x86 processors. Previously, doing so required ccall. The patch also lets users check the current treatment. See issue #12132 for discussion. Here is a demo program, which also demonstrates the "inject some noise" trick for avoiding subnormals.
The patch removes the undocumented routine jl_zero_subnormals and replaces it with jl_set_zero_subnormals. I renamed it because the semantics changed to make it similar to the Linux fesetround interface.
The implementation caches the result of CPUID inspection, since CPUID is a serializing (slow) instruction.
I'll write the user documentation as a separate PR once we've agreed that this PR is the right way to go.
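The caching idea described above can be sketched as follows. This is an illustration under stated assumptions, not the PR's code: `sketch_subnormal_flags` and `FLAGS_UNKNOWN` are hypothetical names, `__get_cpuid` is the GCC/Clang helper from `<cpuid.h>` (x86 only), and treating SSE2 as implying DAZ support is a simplification that may not hold on some early Pentium 4 models.

```c
#include <stdint.h>
#include <cpuid.h>  /* __get_cpuid: GCC/Clang helper, x86 targets only */

/* Sentinel meaning "CPUID not inspected yet". Bit 0 is neither FZ nor
   DAZ, so it cannot collide with a real result. */
#define FLAGS_UNKNOWN 1u

static volatile uint32_t cached_flags = FLAGS_UNKNOWN;

/* Hypothetical helper: returns the MXCSR bits usable for zeroing
   subnormals (0x8040 = FZ|DAZ, 0x8000 = FZ only, 0 = none).
   CPUID runs only on the first call; later calls read the cache. */
uint32_t sketch_subnormal_flags(void)
{
    uint32_t f = cached_flags;
    if (f & FLAGS_UNKNOWN) {
        unsigned int eax, ebx, ecx, edx;
        f = 0;
        if (__get_cpuid(1, &eax, &ebx, &ecx, &edx)) {
            if (edx & (1u << 26))       /* SSE2: assume FZ and DAZ */
                f = 0x8040u;
            else if (edx & (1u << 25))  /* SSE only: FZ */
                f = 0x8000u;
        }
        cached_flags = f;
    }
    return f;
}
```

If two threads race on the first call, both compute the same value, which matches the point made earlier in the thread that these feature bits must agree across hardware threads.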