
NBabel benchmark - Ecological Impact of HPC ... #1669

Open
paugier opened this issue Nov 20, 2020 · 7 comments
@paugier
Contributor

paugier commented Nov 20, 2020

Did you read this paper published in Nature: The Ecological Impact of High-performance Computing in Astrophysics?

One of the few figures shows that Python-Numpy is very inefficient for a particular problem (N-body). The code used should be available here: https://www.nbabel.org/.

I played a bit with this problem (https://github.com/paugier/nbabel), since it would be interesting to show what can be done with Pythran on this case (published in Nature!), and some people in my lab work on the environmental impact of computing (@CyrilleBonamy).

The Pythran code is in this file: https://github.com/paugier/nbabel/blob/master/py/bench.py. Do you see any possible improvements in this code?
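For context, a Pythran kernel is plain NumPy code preceded by a # pythran export comment giving the accepted signatures. A minimal sketch in the spirit of bench.py (hypothetical function and names, not the actual code):

    import numpy as np

    # pythran export compute_kinetic_energy(float[:], float[:,:])
    def compute_kinetic_energy(masses, velocities):
        # total kinetic energy: 0.5 * sum_i m_i * |v_i|**2
        return 0.5 * np.sum(masses * np.sum(velocities**2, axis=1))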

On my machine, the speedup compared to Python without acceleration is huge (~180x), but it is still ~4 times slower than the Fortran and C++ implementations (which are nothing fancy). Also, the three implementations do not give the same results 🙂 (the C++ implementation seems to be less accurate) and I don't understand why.

Moreover, I don't understand why -DUSE_XSIMD does not seem to give any speedup. Is that expected?
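For reference, this is how one might pass that define through Pythran's distutils integration. It is only a sketch with hypothetical module/package names, and it assumes PythranExtension forwards the standard define_macros argument of Extension to the C++ compile line:

    from distutils.core import setup
    from pythran.dist import PythranExtension, PythranBuildExt

    # Assumption: define_macros is honored and ends up as -DUSE_XSIMD
    module = PythranExtension(
        "bench",
        sources=["bench.py"],
        define_macros=[("USE_XSIMD", None)],
    )

    setup(
        name="nbabel-bench",
        ext_modules=[module],
        cmdclass={"build_ext": PythranBuildExt},
    )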

@serge-sans-paille
Owner

The description you give sounds super interesting, and I 100% agree with the goal. I'll try to have a look at your implementation soon :-)

@paugier
Contributor Author

paugier commented Nov 21, 2020

Good news. I added an implementation of the most important function with loops (paugier/nbabel#1). It's now faster than the C++ and Fortran implementations.
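The kernel in that PR is roughly of the following shape; this is only a sketch with hypothetical names, not the exact code of paugier/nbabel#1:

    from math import sqrt

    # pythran export compute_accelerations(float[:,:], float[:], float[:,:])
    def compute_accelerations(positions, masses, accelerations):
        # accumulate the gravitational pull of every particle pair (G = 1)
        nb_particles = masses.size
        for i in range(nb_particles - 1):
            for j in range(i + 1, nb_particles):
                delta = positions[i] - positions[j]
                distance = sqrt(delta[0]**2 + delta[1]**2 + delta[2]**2)
                coef = 1.0 / distance**3
                accelerations[i] -= coef * masses[j] * delta
                accelerations[j] += coef * masses[i] * delta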

The elapsed times in seconds:

Py    C++    Fortran
29    67     51

Note that there are no tests for the https://www.nbabel.org/ problem, so I'm not sure about correctness.

I will try to parallelize with OpenMP.
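For reference, Pythran picks up OpenMP directives written as plain comments, so the parallel version can stay pure Python. A sketch with a hypothetical function, not the nbabel code:

    from math import sqrt

    # pythran export compute_potential_energy(float[:,:], float[:])
    def compute_potential_energy(positions, masses):
        # pairwise gravitational potential energy, reduced across OpenMP threads
        nb_particles = masses.size
        energy = 0.0
        # omp parallel for reduction(+:energy)
        for i in range(nb_particles - 1):
            for j in range(i + 1, nb_particles):
                delta = positions[i] - positions[j]
                distance = sqrt(delta[0]**2 + delta[1]**2 + delta[2]**2)
                energy += masses[i] * masses[j] / distance
        return -energy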

@CyrilleBonamy, it would be interesting to measure the energy consumption of this algorithm too and to replot the Nature figure. I wonder if anyone has thought about submitting a reply to this article.

@serge-sans-paille
Owner

That's great news! I commented on the PR to squeeze out some extra perf, but indeed, one needs to be cautious about the actual results ;-)

@serge-sans-paille
Owner

BTW, you should compile the Fortran and C++ code with -Ofast too to get a fair comparison ;-)

@serge-sans-paille
Owner

When doing so, I get Fortran down to 39 s, which is still not as good as Pythran, but significantly closer.

@serge-sans-paille
Owner

@paugier: #1671 should make it possible to use

def compute_distance(vec):
    return sqrt(np.sum(vec ** 2))

instead of

def compute_distance(vec):
    tmp = 0.0
    for i in range(3):
        tmp += vec[i] ** 2
    return sqrt(tmp)

without much performance impact :-)

@paugier
Contributor Author

paugier commented Nov 25, 2020

I confirm! Really nice!

Another thing about this code: Pythran gives wrong results (no modification of the arrays when doing in-place assignment) in two cases:

  1. without .copy() in https://github.com/paugier/nbabel/blob/master/py/bench.py#L18
  2. with float[:,3] instead of float[:,:]

The bad thing is that in both cases the code runs without error; only the results are completely wrong.
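To make case 2 concrete, here is a minimal hypothetical sketch (not the bench.py code) of the kind of in-place update involved:

    # pythran export advance_positions(float[:,:], float[:,:], float)
    # Narrowing the signature to float[:,3] reportedly turns the in-place
    # update below into a silent no-op instead of raising an error.
    def advance_positions(positions, velocities, time_step):
        positions += time_step * velocities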
