Popular repositories
- Triton-XDNA (Public, forked from amd/Triton-XDNA): Triton-XDNA with native Windows support; NPU kernel compilation and LLM inference on AMD Ryzen AI (Strix Halo). Python, 3 stars.
- qwen-asr-rocm (Public): Qwen3-ASR-0.6B speech-to-text service with vLLM, Flash Attention 2 (AMD Triton), and a Wyoming STT proxy for Home Assistant. Python, 2 stars.
- FastFlowLM-Docker (Public): Wyoming Protocol Docker container for FastFlowLM on AMD Ryzen AI NPUs; Whisper ASR and LLM conversation. Shell, 1 star.
- bitsandbytes (Public, forked from bitsandbytes-foundation/bitsandbytes): Accessible large language models via k-bit quantization for PyTorch. Python.
- flash-attention (Public, forked from Dao-AILab/flash-attention): Fast and memory-efficient exact attention. Python.
- llm_assistant (Public): Home Assistant custom integration; OpenAI-compatible LLM conversation agent with MCP server support. Python.