For multiple_of, does "check whether input are all multiple of values" mean that it returns True when all of them are multiples and False when any of them is not? In that case, does max_contiguous take a boolean as its input?
Why do we need to let the compiler know that the first values are contiguous?
Why does the "ram" variable look like an integer when I print it with
tl.device_print("ram ", ram)
Hi there!
I am very new to Triton and have started with the tutorials on the Triton website. While studying, I got quite confused by some of the APIs. I already checked the official documentation (https://triton-lang.org/main/python-api/triton.language.html), but it wasn't enough for me.
1. arange()
The results are like this:
This very simple block is what confuses me about arange.
Because there is only 1 block with 1 warp (32 threads), device_print fires 32 times, and each print shows a single integer from 0 to 7, with the sequence repeating 4 times.
However, according to the doc, arange is described like this:
Returns contiguous values within the half-open interval [start, end)
So my question is: why does each device_print show just one integer rather than the whole array 0~7? I expected the output to be
, not just 1 integer per device_print call.
Is there something I'm missing?
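To make the observation above concrete, here is a plain-Python toy model of what I seem to be seeing. It is not Triton code and does not touch a GPU. The idea it sketches is that arange(0, 8) is one logical block of 8 values, while each of the 32 threads prints only the single element it holds. The mapping `t % len(block)` and the helper name `simulated_device_print` are purely my assumptions, chosen to reproduce the "0~7, repeating 4 times" pattern; the real assignment of elements to threads is decided by the compiler.

```python
# Toy model in plain Python. This is NOT Triton and does not run on a GPU.
NUM_THREADS = 32  # 1 block with 1 warp, as described in the post
BLOCK = 8         # length of the arange(0, 8) block


def arange(start, end):
    """Contiguous values in the half-open interval [start, end)."""
    return list(range(start, end))


def simulated_device_print(prefix, block, num_threads=NUM_THREADS):
    """Return one line per thread, each showing only that thread's element.

    ASSUMPTION: thread t holds element t % len(block). This mapping is
    hypothetical and only mimics the repeating pattern seen in the output.
    """
    return [
        f"{prefix} thread={t} value={block[t % len(block)]}"
        for t in range(num_threads)
    ]


block = arange(0, BLOCK)                    # the whole logical block
lines = simulated_device_print("x", block)  # what each thread would print
for line in lines:
    print(line)
```

In this model the whole array exists once as `block`, but no single print call ever sees all of it, which is the gap between what I expected and what I observed.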
2. max_contiguous && multiple_of
While reading code, I came across this line:
ram = tl.max_contiguous(tl.multiple_of(offset_m % M, BLOCK_M), BLOCK_M)
This line seems fairly popular, since I found it in the PyTorch library's torch.mm() code and in some other matrix multiplication APIs. But it is also quite confusing. If I check the docs for these two functions,
My question is, what is the return type of each of them?
By the way, when I check the output of device_print("ram", ram), it looks exactly the same as the results shown above in the arange() question.
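My current reading of the docs is that both functions are compiler hints that hand back the tensor they were given, which would explain why `ram` prints just like the `arange` result. The plain-Python stand-in below sketches that reading. The `HintedBlock` class, its `hints` dict, and the sample values for `M` and `BLOCK_M` are hypothetical and exist only for illustration; this is not how Triton implements these functions.

```python
from dataclasses import dataclass, field


@dataclass
class HintedBlock:
    """Hypothetical stand-in for a block of values plus compiler hints."""
    values: list
    hints: dict = field(default_factory=dict)


def multiple_of(block, value):
    """Record the promise 'values are multiples of `value`'; data is untouched."""
    block.hints["multiple_of"] = value
    return block


def max_contiguous(block, value):
    """Record the promise 'the first `value` values are contiguous'; data is untouched."""
    block.hints["max_contiguous"] = value
    return block


# Sample sizes, chosen only for illustration.
M, BLOCK_M = 64, 8

# Models `offset_m % M` for the first program, where offset_m is 0..BLOCK_M-1.
offset_m = HintedBlock([i % M for i in range(BLOCK_M)])

ram = max_contiguous(multiple_of(offset_m, BLOCK_M), BLOCK_M)

print(ram.values)  # same numbers that went in
print(ram.hints)   # only the attached hints changed
```

Under this reading neither call returns a boolean: the result is the same block of integers, so printing `ram` shows the same values as printing the offsets themselves.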
Thanks for reading.