make the empty parens after level constructors optional #2750

Merged
merged 5 commits into from
Nov 8, 2024

ericniebler
Contributor

For example, what previously had to be written as:

cudax::make_hierarchy(cudax::block_dims<block_size>(), cudax::grid_dims<grid_size>())

can now be written as:

cudax::make_hierarchy(cudax::block_dims<block_size>, cudax::grid_dims<grid_size>)

This syntax cannot be made to work with operator&. For example, this will not compile:

cudax::block_dims<block_size> & cudax::grid_dims<grid_size> // compile error
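The description does not spell out the mechanism, so the following is only a minimal standalone sketch, assuming the level constructors are function templates, so that a name without parentheses denotes a function rather than an object. `dims`, `unwrap`, and `total_threads` are hypothetical stand-ins and not part of cudax; `block_dims` and `grid_dims` are re-declared here as toy functions.

```cpp
#include <cstddef>

// Hypothetical stand-in for a level description (not the cudax type).
template <std::size_t N>
struct dims { static constexpr std::size_t value = N; };

// Toy level constructors: function templates, so `block_dims<256>` without
// parentheses names a function, and `block_dims<256>()` calls it.
template <std::size_t N>
constexpr dims<N> block_dims() { return {}; }

template <std::size_t N>
constexpr dims<N> grid_dims() { return {}; }

// An already-constructed level is passed through unchanged.
template <typename Level>
constexpr Level unwrap(Level level) { return level; }

// A nullary level constructor (decayed to a function pointer) is invoked.
// Partial ordering prefers this overload for function arguments.
template <typename Level>
constexpr Level unwrap(Level (*make)()) { return make(); }

// Hypothetical consumer that accepts both spellings in any mix.
template <typename... Args>
constexpr std::size_t total_threads(Args... args) {
  return (std::size_t{1} * ... * decltype(unwrap(args))::value);
}

static_assert(total_threads(block_dims<256>(), grid_dims<4>()) == 1024);
static_assert(total_threads(block_dims<256>, grid_dims<4>) == 1024);

// Under the same assumption, the operator& form cannot be rescued: both
// operands would be functions, and C++ only permits an overloaded operator
// when at least one operand has class or enumeration type.
// auto h = block_dims<256> & grid_dims<4>;  // ill-formed
```

Taking the arguments by value is what lets a function name decay to a pointer in this sketch; a `const T&` parameter would bind to the function type itself instead.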

Checklist

  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@ericniebler ericniebler requested a review from a team as a code owner November 8, 2024 19:47
@ericniebler ericniebler requested a review from a team as a code owner November 8, 2024 20:01
@ericniebler ericniebler requested a review from griwes November 8, 2024 20:01
Contributor

github-actions bot commented Nov 8, 2024

🟩 CI finished in 53m 50s: Pass: 100%/54 | Total: 4h 04m | Avg: 4m 31s | Max: 18m 31s | Hits: 89%/238
  • 🟩 cudax: Pass: 100%/54 | Total: 4h 04m | Avg: 4m 31s | Max: 18m 31s | Hits: 89%/238

    🟩 cpu
      🟩 amd64              Pass: 100%/50  | Total:  3h 54m | Avg:  4m 41s | Max: 18m 31s | Hits:  89%/238   
      🟩 arm64              Pass: 100%/4   | Total: 10m 28s | Avg:  2m 37s | Max:  3m 06s
    🟩 ctk
      🟩 12.0               Pass: 100%/19  | Total:  1h 25m | Avg:  4m 30s | Max: 16m 07s | Hits:  89%/119   
      🟩 12.5               Pass: 100%/2   | Total:  9m 13s | Avg:  4m 36s | Max:  4m 41s
      🟩 12.6               Pass: 100%/33  | Total:  2h 29m | Avg:  4m 32s | Max: 18m 31s | Hits:  89%/119   
    🟩 cudacxx
      🟩 nvcc12.0           Pass: 100%/19  | Total:  1h 25m | Avg:  4m 30s | Max: 16m 07s | Hits:  89%/119   
      🟩 nvcc12.5           Pass: 100%/2   | Total:  9m 13s | Avg:  4m 36s | Max:  4m 41s
      🟩 nvcc12.6           Pass: 100%/33  | Total:  2h 29m | Avg:  4m 32s | Max: 18m 31s | Hits:  89%/119   
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/54  | Total:  4h 04m | Avg:  4m 31s | Max: 18m 31s | Hits:  89%/238   
    🟩 cxx
      🟩 Clang9             Pass: 100%/2   | Total:  6m 44s | Avg:  3m 22s | Max:  3m 33s
      🟩 Clang10            Pass: 100%/2   | Total:  6m 59s | Avg:  3m 29s | Max:  3m 49s
      🟩 Clang11            Pass: 100%/4   | Total: 11m 53s | Avg:  2m 58s | Max:  3m 13s
      🟩 Clang12            Pass: 100%/4   | Total: 11m 56s | Avg:  2m 59s | Max:  3m 11s
      🟩 Clang13            Pass: 100%/4   | Total: 11m 44s | Avg:  2m 56s | Max:  3m 13s
      🟩 Clang14            Pass: 100%/4   | Total: 25m 24s | Avg:  6m 21s | Max: 16m 07s
      🟩 Clang15            Pass: 100%/2   | Total:  6m 13s | Avg:  3m 06s | Max:  3m 08s
      🟩 Clang16            Pass: 100%/4   | Total: 11m 55s | Avg:  2m 58s | Max:  3m 12s
      🟩 Clang17            Pass: 100%/2   | Total:  6m 17s | Avg:  3m 08s | Max:  3m 13s
      🟩 Clang18            Pass: 100%/2   | Total: 21m 50s | Avg: 10m 55s | Max: 18m 31s
      🟩 GCC9               Pass: 100%/2   | Total:  5m 51s | Avg:  2m 55s | Max:  2m 59s
      🟩 GCC10              Pass: 100%/4   | Total: 12m 08s | Avg:  3m 02s | Max:  3m 17s
      🟩 GCC11              Pass: 100%/4   | Total: 12m 06s | Avg:  3m 01s | Max:  3m 12s
      🟩 GCC12              Pass: 100%/7   | Total:  1h 00m | Avg:  8m 37s | Max: 16m 51s
      🟩 GCC13              Pass: 100%/3   | Total:  7m 40s | Avg:  2m 33s | Max:  2m 47s
      🟩 MSVC14.36          Pass: 100%/1   | Total:  7m 35s | Avg:  7m 35s | Max:  7m 35s | Hits:  89%/119   
      🟩 MSVC14.39          Pass: 100%/1   | Total:  8m 54s | Avg:  8m 54s | Max:  8m 54s | Hits:  89%/119   
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  9m 13s | Avg:  4m 36s | Max:  4m 41s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/30  | Total:  2h 00m | Avg:  4m 01s | Max: 18m 31s
      🟩 GCC                Pass: 100%/20  | Total:  1h 38m | Avg:  4m 54s | Max: 16m 51s
      🟩 MSVC               Pass: 100%/2   | Total: 16m 29s | Avg:  8m 14s | Max:  8m 54s | Hits:  89%/238   
      🟩 NVHPC              Pass: 100%/2   | Total:  9m 13s | Avg:  4m 36s | Max:  4m 41s
    🟩 gpu
      🟩 v100               Pass: 100%/54  | Total:  4h 04m | Avg:  4m 31s | Max: 18m 31s | Hits:  89%/238   
    🟩 jobs
      🟩 Build              Pass: 100%/49  | Total:  2h 41m | Avg:  3m 17s | Max:  8m 54s | Hits:  89%/238   
      🟩 Test               Pass: 100%/5   | Total:  1h 23m | Avg: 16m 36s | Max: 18m 31s
    🟩 sm
      🟩 90                 Pass: 100%/1   | Total:  2m 44s | Avg:  2m 44s | Max:  2m 44s
      🟩 90a                Pass: 100%/1   | Total:  2m 47s | Avg:  2m 47s | Max:  2m 47s
    🟩 std
      🟩 17                 Pass: 100%/29  | Total:  1h 54m | Avg:  3m 57s | Max: 16m 00s
      🟩 20                 Pass: 100%/25  | Total:  2h 10m | Avg:  5m 12s | Max: 18m 31s | Hits:  89%/238   
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
CUB
Thrust
+/- CUDA Experimental
python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
CUB
Thrust
+/- CUDA Experimental
python
CCCL C Parallel Library
Catch2Helper

🏃‍ Runner counts (total jobs: 54)

# Runner
43 linux-amd64-cpu16
5 linux-amd64-gpu-v100-latest-1
4 linux-arm64-cpu16
2 windows-amd64-cpu16

Contributor

@pciolkosz pciolkosz left a comment

Thanks for looking into it

else

template <typename L1, typename... Levels>
_CCCL_NODISCARD _CUDAX_API constexpr auto operator()(const L1& l1, const Levels&... ls) const noexcept
Contributor

I think we can now drop l1? It was needed previously for the can_stack, but now I think it's always used along with the ls pack.

Contributor Author

__can_stack wants to compare the shifted list of levels. We need to separate out L1 here so we can easily shift the levels.
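To illustrate the shifting idea in isolation, here is a hypothetical sketch and not the actual `__can_stack` from cudax: the level tags, the `rank` ordering, `top_sentinel`, `stacks_on`, and `adjacent_ok` are all invented for the example. Splitting the first level off the pack means `Levels...` on its own is already the full list moved one position to the left, so the two lists can be compared pairwise.

```cpp
#include <type_traits>

// Hypothetical level tags; `rank` grows from the innermost level outward.
struct thread_level { static constexpr int rank = 0; };
struct block_level  { static constexpr int rank = 1; };
struct grid_level   { static constexpr int rank = 2; };

// Sentinel appended to the shifted list so both lists have equal length.
struct top_sentinel { static constexpr int rank = 1 << 30; };

// True when `Upper` may sit directly above `Lower` (assumed rule).
template <typename Lower, typename Upper>
constexpr bool stacks_on = Lower::rank < Upper::rank;

template <typename... Ts>
struct type_list {};

// Zips the original list with the shifted list and checks each adjacent pair.
template <typename Original, typename Shifted>
struct adjacent_ok;

template <typename... Lower, typename... Upper>
struct adjacent_ok<type_list<Lower...>, type_list<Upper...>>
    : std::bool_constant<(stacks_on<Lower, Upper> && ...)> {};

// With L1 split off, (Levels..., top_sentinel) is (L1, Levels...) shifted by one.
template <typename L1, typename... Levels>
constexpr bool can_stack =
    adjacent_ok<type_list<L1, Levels...>, type_list<Levels..., top_sentinel>>::value;

static_assert(can_stack<thread_level, block_level, grid_level>);
static_assert(!can_stack<block_level, thread_level, grid_level>);
```

Without the separate `L1` parameter, producing the shifted list would need an extra helper to drop the first element of the pack.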

Contributor

@pciolkosz pciolkosz Nov 8, 2024

Hmm, it's not as easy as I thought. I also had to change __can_stack to not be an alias; not sure if it's worth it.
Feel free to revert my change here.

  // TODO accept forwarding references
  template <typename LUnit, typename L1, typename... Levels>
- _CCCL_HOST_DEVICE constexpr auto
- operator&(const hierarchy_dimensions_fragment<LUnit, Levels...>& ls, const L1& l1) noexcept
+ _CUDAX_API constexpr auto operator&(const hierarchy_dimensions_fragment<LUnit, Levels...>& ls, L1 l1) noexcept
Contributor

Nit: With these changes I noticed how bad the L1/l1 name is here; it's probably a copy-paste from other places. Maybe new_level or similar would be better here?

@ericniebler ericniebler enabled auto-merge (squash) November 8, 2024 22:38
Contributor

github-actions bot commented Nov 8, 2024

🟩 CI finished in 54m 34s: Pass: 100%/54 | Total: 4h 44m | Avg: 5m 15s | Max: 17m 25s | Hits: 63%/238
  • 🟩 cudax: Pass: 100%/54 | Total: 4h 44m | Avg: 5m 15s | Max: 17m 25s | Hits: 63%/238

    🟩 cpu
      🟩 amd64              Pass: 100%/50  | Total:  4h 30m | Avg:  5m 24s | Max: 17m 25s | Hits:  63%/238   
      🟩 arm64              Pass: 100%/4   | Total: 14m 01s | Avg:  3m 30s | Max:  3m 44s
    🟩 ctk
      🟩 12.0               Pass: 100%/19  | Total:  1h 42m | Avg:  5m 24s | Max: 17m 25s | Hits:  63%/119   
      🟩 12.5               Pass: 100%/2   | Total: 11m 42s | Avg:  5m 51s | Max:  5m 53s
      🟩 12.6               Pass: 100%/33  | Total:  2h 49m | Avg:  5m 08s | Max: 17m 05s | Hits:  63%/119   
    🟩 cudacxx
      🟩 nvcc12.0           Pass: 100%/19  | Total:  1h 42m | Avg:  5m 24s | Max: 17m 25s | Hits:  63%/119   
      🟩 nvcc12.5           Pass: 100%/2   | Total: 11m 42s | Avg:  5m 51s | Max:  5m 53s
      🟩 nvcc12.6           Pass: 100%/33  | Total:  2h 49m | Avg:  5m 08s | Max: 17m 05s | Hits:  63%/119   
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/54  | Total:  4h 44m | Avg:  5m 15s | Max: 17m 25s | Hits:  63%/238   
    🟩 cxx
      🟩 Clang9             Pass: 100%/2   | Total:  8m 06s | Avg:  4m 03s | Max:  4m 10s
      🟩 Clang10            Pass: 100%/2   | Total:  8m 13s | Avg:  4m 06s | Max:  4m 17s
      🟩 Clang11            Pass: 100%/4   | Total: 14m 52s | Avg:  3m 43s | Max:  3m 50s
      🟩 Clang12            Pass: 100%/4   | Total: 14m 41s | Avg:  3m 40s | Max:  4m 03s
      🟩 Clang13            Pass: 100%/4   | Total: 15m 00s | Avg:  3m 45s | Max:  4m 02s
      🟩 Clang14            Pass: 100%/4   | Total: 28m 34s | Avg:  7m 08s | Max: 17m 25s
      🟩 Clang15            Pass: 100%/2   | Total:  7m 54s | Avg:  3m 57s | Max:  4m 14s
      🟩 Clang16            Pass: 100%/4   | Total: 14m 30s | Avg:  3m 37s | Max:  3m 49s
      🟩 Clang17            Pass: 100%/2   | Total:  8m 03s | Avg:  4m 01s | Max:  4m 02s
      🟩 Clang18            Pass: 100%/2   | Total: 20m 45s | Avg: 10m 22s | Max: 16m 53s
      🟩 GCC9               Pass: 100%/2   | Total:  6m 53s | Avg:  3m 26s | Max:  3m 39s
      🟩 GCC10              Pass: 100%/4   | Total: 15m 02s | Avg:  3m 45s | Max:  3m 57s
      🟩 GCC11              Pass: 100%/4   | Total: 14m 43s | Avg:  3m 40s | Max:  3m 53s
      🟩 GCC12              Pass: 100%/7   | Total:  1h 06m | Avg:  9m 31s | Max: 17m 19s
      🟩 GCC13              Pass: 100%/3   | Total: 10m 17s | Avg:  3m 25s | Max:  3m 40s
      🟩 MSVC14.36          Pass: 100%/1   | Total: 10m 24s | Avg: 10m 24s | Max: 10m 24s | Hits:  63%/119   
      🟩 MSVC14.39          Pass: 100%/1   | Total:  7m 51s | Avg:  7m 51s | Max:  7m 51s | Hits:  63%/119   
      🟩 NVHPC24.7          Pass: 100%/2   | Total: 11m 42s | Avg:  5m 51s | Max:  5m 53s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/30  | Total:  2h 20m | Avg:  4m 41s | Max: 17m 25s
      🟩 GCC                Pass: 100%/20  | Total:  1h 53m | Avg:  5m 40s | Max: 17m 19s
      🟩 MSVC               Pass: 100%/2   | Total: 18m 15s | Avg:  9m 07s | Max: 10m 24s | Hits:  63%/238   
      🟩 NVHPC              Pass: 100%/2   | Total: 11m 42s | Avg:  5m 51s | Max:  5m 53s
    🟩 gpu
      🟩 v100               Pass: 100%/54  | Total:  4h 44m | Avg:  5m 15s | Max: 17m 25s | Hits:  63%/238   
    🟩 jobs
      🟩 Build              Pass: 100%/49  | Total:  3h 18m | Avg:  4m 03s | Max: 10m 24s | Hits:  63%/238   
      🟩 Test               Pass: 100%/5   | Total:  1h 25m | Avg: 17m 08s | Max: 17m 25s
    🟩 sm
      🟩 90                 Pass: 100%/1   | Total:  3m 12s | Avg:  3m 12s | Max:  3m 12s
      🟩 90a                Pass: 100%/1   | Total:  3m 15s | Avg:  3m 15s | Max:  3m 15s
    🟩 std
      🟩 17                 Pass: 100%/29  | Total:  2h 17m | Avg:  4m 44s | Max: 17m 19s
      🟩 20                 Pass: 100%/25  | Total:  2h 26m | Avg:  5m 52s | Max: 17m 25s | Hits:  63%/238   
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
CUB
Thrust
+/- CUDA Experimental
python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
CUB
Thrust
+/- CUDA Experimental
python
CCCL C Parallel Library
Catch2Helper

🏃‍ Runner counts (total jobs: 54)

# Runner
43 linux-amd64-cpu16
5 linux-amd64-gpu-v100-latest-1
4 linux-arm64-cpu16
2 windows-amd64-cpu16

@ericniebler ericniebler merged commit 9616009 into NVIDIA:main Nov 8, 2024
69 checks passed