Clarify passing ValueT to scan_by_key tuning #3143

Merged: 1 commit merged into NVIDIA:main from the clarify_accumt branch on Dec 12, 2024

Conversation

@bernhardmgruber bernhardmgruber commented Dec 12, 2024

This PR clarifies that passing ValueT to the tunings is correct.

See also discussion here: #3139 (comment)

@bernhardmgruber bernhardmgruber marked this pull request as ready for review December 12, 2024 17:21
@bernhardmgruber bernhardmgruber requested review from a team as code owners December 12, 2024 17:21
@NVIDIA NVIDIA deleted a comment from copy-pr-bot bot Dec 12, 2024
@bernhardmgruber bernhardmgruber enabled auto-merge (squash) December 12, 2024 17:49
🟩 CI finished in 1h 53m: Pass: 100%/94 | Total: 2d 13h | Avg: 39m 15s | Max: 1h 08m | Hits: 74%/12324
  • 🟩 thrust: Pass: 100%/46 | Total: 1d 00h | Avg: 31m 40s | Max: 59m 54s | Hits: 76%/9260

    🟩 cmake_options
      🟩 -DTHRUST_DISPATCH_TYPE=Force32bit Pass: 100%/2   | Total: 45m 49s | Avg: 22m 54s | Max: 29m 47s
    🟩 cpu
      🟩 amd64              Pass: 100%/44  | Total: 23h 13m | Avg: 31m 40s | Max: 59m 54s | Hits:  76%/9260  
      🟩 arm64              Pass: 100%/2   | Total:  1h 03m | Avg: 31m 34s | Max: 32m 24s
    🟩 ctk
      🟩 11.1               Pass: 100%/7   | Total:  3h 21m | Avg: 28m 45s | Max: 50m 55s | Hits:  71%/1852  
      🟩 12.5               Pass: 100%/2   | Total:  1h 45m | Avg: 52m 56s | Max: 55m 34s
      🟩 12.6               Pass: 100%/37  | Total: 19h 09m | Avg: 31m 04s | Max: 59m 54s | Hits:  78%/7408  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total: 55m 17s | Avg: 27m 38s | Max: 29m 16s
      🟩 nvcc11.1           Pass: 100%/7   | Total:  3h 21m | Avg: 28m 45s | Max: 50m 55s | Hits:  71%/1852  
      🟩 nvcc12.5           Pass: 100%/2   | Total:  1h 45m | Avg: 52m 56s | Max: 55m 34s
      🟩 nvcc12.6           Pass: 100%/35  | Total: 18h 14m | Avg: 31m 16s | Max: 59m 54s | Hits:  78%/7408  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 55m 17s | Avg: 27m 38s | Max: 29m 16s
      🟩 nvcc               Pass: 100%/44  | Total: 23h 21m | Avg: 31m 51s | Max: 59m 54s | Hits:  76%/9260  
    🟩 cxx
      🟩 Clang9             Pass: 100%/4   | Total:  1h 46m | Avg: 26m 32s | Max: 29m 42s
      🟩 Clang10            Pass: 100%/1   | Total: 32m 31s | Avg: 32m 31s | Max: 32m 31s
      🟩 Clang11            Pass: 100%/1   | Total: 30m 45s | Avg: 30m 45s | Max: 30m 45s
      🟩 Clang12            Pass: 100%/1   | Total: 30m 57s | Avg: 30m 57s | Max: 30m 57s
      🟩 Clang13            Pass: 100%/1   | Total: 33m 13s | Avg: 33m 13s | Max: 33m 13s
      🟩 Clang14            Pass: 100%/1   | Total: 33m 45s | Avg: 33m 45s | Max: 33m 45s
      🟩 Clang15            Pass: 100%/1   | Total: 30m 41s | Avg: 30m 41s | Max: 30m 41s
      🟩 Clang16            Pass: 100%/1   | Total: 33m 52s | Avg: 33m 52s | Max: 33m 52s
      🟩 Clang17            Pass: 100%/1   | Total: 29m 59s | Avg: 29m 59s | Max: 29m 59s
      🟩 Clang18            Pass: 100%/7   | Total:  2h 52m | Avg: 24m 34s | Max: 32m 29s
      🟩 GCC6               Pass: 100%/2   | Total: 51m 06s | Avg: 25m 33s | Max: 28m 51s
      🟩 GCC7               Pass: 100%/2   | Total:  1h 01m | Avg: 30m 41s | Max: 32m 41s
      🟩 GCC8               Pass: 100%/1   | Total: 31m 28s | Avg: 31m 28s | Max: 31m 28s
      🟩 GCC9               Pass: 100%/3   | Total:  1h 25m | Avg: 28m 29s | Max: 36m 26s
      🟩 GCC10              Pass: 100%/1   | Total: 33m 02s | Avg: 33m 02s | Max: 33m 02s
      🟩 GCC11              Pass: 100%/1   | Total: 43m 36s | Avg: 43m 36s | Max: 43m 36s
      🟩 GCC12              Pass: 100%/1   | Total: 33m 23s | Avg: 33m 23s | Max: 33m 23s
      🟩 GCC13              Pass: 100%/8   | Total:  3h 11m | Avg: 23m 58s | Max: 35m 29s
      🟩 Intel2023.2.0      Pass: 100%/1   | Total: 43m 09s | Avg: 43m 09s | Max: 43m 09s
      🟩 MSVC14.16          Pass: 100%/1   | Total: 50m 55s | Avg: 50m 55s | Max: 50m 55s | Hits:  71%/1852  
      🟩 MSVC14.29          Pass: 100%/1   | Total: 59m 54s | Avg: 59m 54s | Max: 59m 54s | Hits:  71%/1852  
      🟩 MSVC14.39          Pass: 100%/3   | Total:  2h 11m | Avg: 43m 57s | Max: 57m 43s | Hits:  80%/5556  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  1h 45m | Avg: 52m 56s | Max: 55m 34s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/19  | Total:  8h 53m | Avg: 28m 06s | Max: 33m 52s
      🟩 GCC                Pass: 100%/19  | Total:  8h 51m | Avg: 27m 57s | Max: 43m 36s
      🟩 Intel              Pass: 100%/1   | Total: 43m 09s | Avg: 43m 09s | Max: 43m 09s
      🟩 MSVC               Pass: 100%/5   | Total:  4h 02m | Avg: 48m 32s | Max: 59m 54s | Hits:  76%/9260  
      🟩 NVHPC              Pass: 100%/2   | Total:  1h 45m | Avg: 52m 56s | Max: 55m 34s
    🟩 gpu
      🟩 v100               Pass: 100%/46  | Total:  1d 00h | Avg: 31m 40s | Max: 59m 54s | Hits:  76%/9260  
    🟩 jobs
      🟩 Build              Pass: 100%/40  | Total: 22h 47m | Avg: 34m 11s | Max: 59m 54s | Hits:  71%/7408  
      🟩 TestCPU            Pass: 100%/3   | Total: 37m 20s | Avg: 12m 26s | Max: 22m 32s | Hits:  99%/1852  
      🟩 TestGPU            Pass: 100%/3   | Total: 51m 54s | Avg: 17m 18s | Max: 18m 50s
    🟩 sm
      🟩 90a                Pass: 100%/1   | Total: 19m 45s | Avg: 19m 45s | Max: 19m 45s
    🟩 std
      🟩 11                 Pass: 100%/5   | Total:  1h 59m | Avg: 23m 55s | Max: 28m 42s
      🟩 14                 Pass: 100%/4   | Total:  2h 21m | Avg: 35m 22s | Max: 50m 55s | Hits:  71%/1852  
      🟩 17                 Pass: 100%/12  | Total:  7h 33m | Avg: 37m 46s | Max: 59m 54s | Hits:  71%/3704  
      🟩 20                 Pass: 100%/23  | Total: 11h 36m | Avg: 30m 17s | Max: 57m 43s | Hits:  85%/3704  
    
  • 🟩 cub: Pass: 100%/45 | Total: 1d 12h | Avg: 48m 40s | Max: 1h 08m | Hits: 65%/3064

    🟩 cpu
      🟩 amd64              Pass: 100%/43  | Total:  1d 10h | Avg: 48m 20s | Max:  1h 08m | Hits:  65%/3064  
      🟩 arm64              Pass: 100%/2   | Total:  1h 52m | Avg: 56m 02s | Max: 56m 05s
    🟩 ctk
      🟩 11.1               Pass: 100%/7   | Total:  5h 27m | Avg: 46m 48s | Max: 57m 14s | Hits:  65%/766   
      🟩 12.5               Pass: 100%/2   | Total:  2h 09m | Avg:  1h 04m | Max:  1h 08m
      🟩 12.6               Pass: 100%/36  | Total:  1d 04h | Avg: 48m 08s | Max:  1h 05m | Hits:  65%/2298  
    🟩 cudacxx
      🟩 ClangCUDA18        Pass: 100%/2   | Total:  1h 54m | Avg: 57m 14s | Max: 57m 38s
      🟩 nvcc11.1           Pass: 100%/7   | Total:  5h 27m | Avg: 46m 48s | Max: 57m 14s | Hits:  65%/766   
      🟩 nvcc12.5           Pass: 100%/2   | Total:  2h 09m | Avg:  1h 04m | Max:  1h 08m
      🟩 nvcc12.6           Pass: 100%/34  | Total:  1d 02h | Avg: 47m 36s | Max:  1h 05m | Hits:  65%/2298  
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total:  1h 54m | Avg: 57m 14s | Max: 57m 38s
      🟩 nvcc               Pass: 100%/43  | Total:  1d 10h | Avg: 48m 16s | Max:  1h 08m | Hits:  65%/3064  
    🟩 cxx
      🟩 Clang9             Pass: 100%/4   | Total:  3h 12m | Avg: 48m 01s | Max: 51m 38s
      🟩 Clang10            Pass: 100%/1   | Total: 52m 35s | Avg: 52m 35s | Max: 52m 35s
      🟩 Clang11            Pass: 100%/1   | Total: 53m 42s | Avg: 53m 42s | Max: 53m 42s
      🟩 Clang12            Pass: 100%/1   | Total: 50m 57s | Avg: 50m 57s | Max: 50m 57s
      🟩 Clang13            Pass: 100%/1   | Total: 50m 40s | Avg: 50m 40s | Max: 50m 40s
      🟩 Clang14            Pass: 100%/1   | Total: 58m 00s | Avg: 58m 00s | Max: 58m 00s
      🟩 Clang15            Pass: 100%/1   | Total: 51m 15s | Avg: 51m 15s | Max: 51m 15s
      🟩 Clang16            Pass: 100%/1   | Total: 53m 04s | Avg: 53m 04s | Max: 53m 04s
      🟩 Clang17            Pass: 100%/1   | Total: 51m 08s | Avg: 51m 08s | Max: 51m 08s
      🟩 Clang18            Pass: 100%/7   | Total:  5h 32m | Avg: 47m 32s | Max: 57m 38s
      🟩 GCC6               Pass: 100%/2   | Total:  1h 32m | Avg: 46m 05s | Max: 46m 20s
      🟩 GCC7               Pass: 100%/2   | Total:  1h 45m | Avg: 52m 39s | Max: 54m 00s
      🟩 GCC8               Pass: 100%/1   | Total: 54m 14s | Avg: 54m 14s | Max: 54m 14s
      🟩 GCC9               Pass: 100%/3   | Total:  2h 20m | Avg: 46m 41s | Max: 50m 54s
      🟩 GCC10              Pass: 100%/1   | Total: 50m 55s | Avg: 50m 55s | Max: 50m 55s
      🟩 GCC11              Pass: 100%/1   | Total: 53m 04s | Avg: 53m 04s | Max: 53m 04s
      🟩 GCC12              Pass: 100%/1   | Total: 52m 09s | Avg: 52m 09s | Max: 52m 09s
      🟩 GCC13              Pass: 100%/8   | Total:  4h 25m | Avg: 33m 12s | Max: 56m 00s
      🟩 Intel2023.2.0      Pass: 100%/1   | Total: 55m 34s | Avg: 55m 34s | Max: 55m 34s
      🟩 MSVC14.16          Pass: 100%/1   | Total: 57m 14s | Avg: 57m 14s | Max: 57m 14s | Hits:  65%/766   
      🟩 MSVC14.29          Pass: 100%/1   | Total:  1h 04m | Avg:  1h 04m | Max:  1h 04m | Hits:  65%/766   
      🟩 MSVC14.39          Pass: 100%/2   | Total:  2h 03m | Avg:  1h 01m | Max:  1h 05m | Hits:  65%/1532  
      🟩 NVHPC24.7          Pass: 100%/2   | Total:  2h 09m | Avg:  1h 04m | Max:  1h 08m
    🟩 cxx_family
      🟩 Clang              Pass: 100%/19  | Total: 15h 46m | Avg: 49m 47s | Max: 58m 00s
      🟩 GCC                Pass: 100%/19  | Total: 13h 33m | Avg: 42m 49s | Max: 56m 00s
      🟩 Intel              Pass: 100%/1   | Total: 55m 34s | Avg: 55m 34s | Max: 55m 34s
      🟩 MSVC               Pass: 100%/4   | Total:  4h 05m | Avg:  1h 01m | Max:  1h 05m | Hits:  65%/3064  
      🟩 NVHPC              Pass: 100%/2   | Total:  2h 09m | Avg:  1h 04m | Max:  1h 08m
    🟩 gpu
      🟩 v100               Pass: 100%/45  | Total:  1d 12h | Avg: 48m 40s | Max:  1h 08m | Hits:  65%/3064  
    🟩 jobs
      🟩 Build              Pass: 100%/39  | Total:  1d 10h | Avg: 52m 42s | Max:  1h 08m | Hits:  65%/3064  
      🟩 DeviceLaunch       Pass: 100%/1   | Total: 18m 50s | Avg: 18m 50s | Max: 18m 50s
      🟩 GraphCapture       Pass: 100%/1   | Total: 16m 11s | Avg: 16m 11s | Max: 16m 11s
      🟩 HostLaunch         Pass: 100%/2   | Total: 56m 21s | Avg: 28m 10s | Max: 35m 50s
      🟩 TestGPU            Pass: 100%/2   | Total: 43m 46s | Avg: 21m 53s | Max: 22m 14s
    🟩 sm
      🟩 90a                Pass: 100%/1   | Total: 22m 38s | Avg: 22m 38s | Max: 22m 38s
    🟩 std
      🟩 11                 Pass: 100%/5   | Total:  3h 59m | Avg: 47m 49s | Max: 51m 38s
      🟩 14                 Pass: 100%/4   | Total:  3h 28m | Avg: 52m 07s | Max: 57m 14s | Hits:  65%/766   
      🟩 17                 Pass: 100%/12  | Total: 10h 49m | Avg: 54m 08s | Max:  1h 04m | Hits:  65%/1532  
      🟩 20                 Pass: 100%/24  | Total: 18h 13m | Avg: 45m 32s | Max:  1h 08m | Hits:  65%/766   
    
  • 🟩 cccl_c_parallel: Pass: 100%/2 | Total: 9m 00s | Avg: 4m 30s | Max: 6m 55s

    🟩 cpu
      🟩 amd64              Pass: 100%/2   | Total:  9m 00s | Avg:  4m 30s | Max:  6m 55s
    🟩 ctk
      🟩 12.6               Pass: 100%/2   | Total:  9m 00s | Avg:  4m 30s | Max:  6m 55s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/2   | Total:  9m 00s | Avg:  4m 30s | Max:  6m 55s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/2   | Total:  9m 00s | Avg:  4m 30s | Max:  6m 55s
    🟩 cxx
      🟩 GCC13              Pass: 100%/2   | Total:  9m 00s | Avg:  4m 30s | Max:  6m 55s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/2   | Total:  9m 00s | Avg:  4m 30s | Max:  6m 55s
    🟩 gpu
      🟩 v100               Pass: 100%/2   | Total:  9m 00s | Avg:  4m 30s | Max:  6m 55s
    🟩 jobs
      🟩 Build              Pass: 100%/1   | Total:  2m 05s | Avg:  2m 05s | Max:  2m 05s
      🟩 Test               Pass: 100%/1   | Total:  6m 55s | Avg:  6m 55s | Max:  6m 55s
    
  • 🟩 python: Pass: 100%/1 | Total: 33m 59s | Avg: 33m 59s | Max: 33m 59s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 33m 59s | Avg: 33m 59s | Max: 33m 59s
    🟩 ctk
      🟩 12.6               Pass: 100%/1   | Total: 33m 59s | Avg: 33m 59s | Max: 33m 59s
    🟩 cudacxx
      🟩 nvcc12.6           Pass: 100%/1   | Total: 33m 59s | Avg: 33m 59s | Max: 33m 59s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 33m 59s | Avg: 33m 59s | Max: 33m 59s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 33m 59s | Avg: 33m 59s | Max: 33m 59s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 33m 59s | Avg: 33m 59s | Max: 33m 59s
    🟩 gpu
      🟩 v100               Pass: 100%/1   | Total: 33m 59s | Avg: 33m 59s | Max: 33m 59s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 33m 59s | Avg: 33m 59s | Max: 33m 59s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
+/- CUB
Thrust
CUDA Experimental
python
CCCL C Parallel Library
Catch2Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental
+/- python
+/- CCCL C Parallel Library
+/- Catch2Helper

🏃‍ Runner counts (total jobs: 94)

# Runner
70 linux-amd64-cpu16
11 linux-amd64-gpu-v100-latest-1
9 windows-amd64-cpu16
4 linux-arm64-cpu16

@bernhardmgruber bernhardmgruber merged commit 650cbad into NVIDIA:main Dec 12, 2024
109 checks passed
@bernhardmgruber bernhardmgruber deleted the clarify_accumt branch December 12, 2024 19:27
2 participants