A way to pre-build kernels for several architectures #1373
Comments
The pre-compiled kernels are just the ordinary kernels, and all their sources are already provided in this repo. You can run MIOpen on whatever card you want and then collect the compiled kernels from the binary cache directory. |
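For anyone who wants to script the "collect compiled kernels" step, here is a minimal sketch. The helper name `find_kernel_dbs` is hypothetical, and the `~/.cache/miopen` location in the usage line is an assumption about a default install; pass whatever binary cache directory your build actually uses.

```python
from pathlib import Path


def find_kernel_dbs(cache_root, suffixes=(".ukdb", ".kdb")):
    """Return a sorted list of kernel-database files found under cache_root.

    cache_root is whatever directory MIOpen uses as its binary cache on
    your system; this helper only walks the tree and filters by suffix.
    """
    root = Path(cache_root).expanduser()
    if not root.is_dir():
        return []
    return sorted(
        p for p in root.rglob("*") if p.is_file() and p.suffix in suffixes
    )


if __name__ == "__main__":
    # Assumed location of the user binary cache; adjust for your setup.
    for path in find_kernel_dbs("~/.cache/miopen"):
        print(path, path.stat().st_size, "bytes")
```

The files it lists can then be copied out and packaged per architecture.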
@atamazov Indeed, users can generate compiled kernels on any cards they possess. However, the GPU detection check currently blocks them from building MIOpen in the first place. Can we downgrade "GPU DETECTION FAILED DURING CMAKE PHASE" to a warning instead? |
It should be a warning already. An error will happen only if you run "make check". |
Thanks. Also, I'd like to know whether there is a way to build all these kernels into .kdb files for several architectures, even if there is no GPU available on the build machine. I suppose the ROCm team has a build tool that generates the binary cache hosted on https://repo.radeon.com/rocm/miopen-kernel/, and I'd like to compile these during the build of MIOpen. |
Yes, but for each arch you'll get a separate .ukdb file. I think that @JehandadKhan can provide you with the details on how to do that properly (as soon as time permits). |
I'm also trying to package this for nixpkgs, and would like to know how to properly generate the kdb files for all/specified architectures. |
@Madouura IIRC you can do this in a straightforward way:
/cc @JehandadKhan |
Thank you for the prompt response. I'm a bit sick ATM so I may be misunderstanding here.
Does this mean having the actual GPU hardware, or just a setting? I own two AMD GPUs so I can at least generate for RX 6900XT and RX 6800 if it's the latter.
Given that, I'm thinking we need GPU hardware on the development system. Something concerns me, however: by "running the neural networks", am I right in thinking that the generated database differs depending on which neural networks are run? Can I also assume that user kernel databases are safely interchangeable with (system) kernel databases? (I don't see why they wouldn't be, but best to make sure.) For the .ukdb (specifically, gfx1030_40) generated on my user's |
The former. It is also possible without hardware, but not easy AFAIK. Maybe @JehandadKhan can help you with it, when/if he has time.
Different networks use different primitives with different configurations, so MIOpen generates different kernels to implement them.
Yes IIRC
Yes. |
Yes, if possible that would be ideal. We could then put MIOpen on a (hopefully Hydra or Hydra-adjacent) server, which will likely be GPU-less, and generate new KDBs for each consumer/pro GPU release as well as for each major MIOpen change.
Okay, so in that case, if it's not proprietary and you and/or the team can share it, would you mind sharing the list (or a subset of the list depending on proprietariness) of neural networks the MIOpen team uses to generate the kernel databases? |
One more thing to make sure of. |
Are kernel databases just a collection of compute shaders? |
Yes. |
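To look inside a database yourself, a sketch like the following may help. It assumes the .kdb/.ukdb file is a plain SQLite database (my understanding of how MIOpen stores its binary cache, not something confirmed in this thread) and deliberately assumes nothing about table names or columns; `list_tables` is a hypothetical helper.

```python
import sqlite3
from pathlib import Path


def list_tables(db_path):
    """Return {table_name: row_count} for a SQLite file, opened read-only.

    Assumption: the kernel database is a SQLite file. If it is not,
    sqlite3 raises DatabaseError on the first query.
    """
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
            )
        ]
        # Quote each table name so unusual identifiers still work.
        return {
            name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
            for name in names
        }
    finally:
        conn.close()
```

Calling `list_tables` on a generated .ukdb shows how many entries it holds, which is a quick way to compare databases produced by different workloads.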
@Madouura Please provide a list of GPUs you need the pre-built kernels for. Maybe it would be possible to extend the list of "officially" supported GPUs with these. @JehandadKhan Is it possible to provide instructions that would allow end users to generate the pre-compiled kernel packages for their GPUs without the actual hardware? /cc @junliume |
Sure. Looking at https://llvm.org/docs/AMDGPUUsage.html#processors, these should suffice.
Need
RDNA1 and up should be officially supported IIRC by most ROCm projects.
Want
IIRC these are not officially supported by most ROCm projects, so it's fine. Just wanted to list it to match up with rocBLAS'
I got these values by finding the compute units of each architecture's release GPU(s), so there may be some errors given what I'm seeing with the GFX10.3 series. |
In MIOpen we use the number of hardware Compute Units, which is half of what rocminfo reports for gfx103X GPUs. |
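The halving rule above can be sketched as follows. The function names are hypothetical, and the `<arch>_<CUs>` naming is only inferred from the `gfx1030_40` database mentioned earlier in this thread, so treat this as an illustration rather than MIOpen's actual logic.

```python
def miopen_cu_count(arch, rocminfo_cus):
    """Map a CU count as reported by rocminfo to the count MIOpen uses.

    Per the comment above, gfx103X GPUs use half the rocminfo value;
    other architectures are assumed here to be used as reported.
    """
    if arch.startswith("gfx103"):
        return rocminfo_cus // 2
    return rocminfo_cus


def db_basename(arch, rocminfo_cus):
    """Build an '<arch>_<CUs>' name (format inferred from 'gfx1030_40')."""
    return f"{arch}_{miopen_cu_count(arch, rocminfo_cus)}"


if __name__ == "__main__":
    print(db_basename("gfx1030", 80))  # gfx1030_40
    print(db_basename("gfx906", 60))   # gfx906_60
```

A table of per-architecture CU counts gathered from rocminfo would need this correction for the gfx103X entries before being matched against database names.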
In that case, it makes sense to move |
@littlewu2508 and @Madouura This request is similar to #2955. Can you refer to the comments in that ticket? Thanks. |
Thanks, the comments there seem helpful. I will try it out and continue the technical discussion there. |
Closed; discussion continues on #2955. |
Hi, I'm the maintainer of the Gentoo MIOpen package. I'd like to know if you can also provide the source code of the pre-compiled kernels, so that we can package it and users can compile the kernels for their specific card (the pre-compiled kernels cover only a few archs, which leads to problems such as #1309).