-
Inferless
- SFO
- https://inferless.com/
- in/nilesh-agarwal
Pinned
-
kserve (Public, forked from kserve/kserve)
Standardized Serverless ML Inference Platform on Kubernetes
Python
-
nvidia-triton-llm-streaming (Public)
Integrating SSE with NVIDIA Triton Inference Server using a Python backend and the Zephyr model. There is very little documentation on how to use NVIDIA Triton in streaming use cases (hard to find in their…
Python 10
-
DeepSeek-R1-Distill-Qwen-32B (Public template, forked from inferless/deepseek-r1-distill-qwen-32b)
DeepSeek-R1-Distill-Qwen-32B is a distilled variant within the DeepSeek-R1 series. The dataset used for training is meticulously curated from the DeepSeek-R1 model, with Qwen2.5-32B serving as the …
Python
-
inferless/triton-co-pilot (Public)
Generate glue code in seconds to simplify your NVIDIA Triton Inference Server deployments
-
inferless/whisper-large-v3 (Public template)
State‑of‑the‑art speech recognition model for English, delivering transcription accuracy across diverse audio scenarios. <metadata> gpu: T4 | collections: ["CTranslate2"] </metadata>