convert : qwen2/3moe : set yarn metadata if present #13331

Merged: CISC merged 2 commits into master from convert-qwen2-3moe-yarn on May 6, 2025

@CISC CISC (Collaborator) commented May 6, 2025

Set YaRN metadata, if present, when converting Qwen2MoE/Qwen3MoE models, just as is already done for Qwen2/Qwen3.
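
As a rough illustration of what "set YaRN metadata if present" involves, the sketch below reads the `rope_scaling` block of a Hugging Face style config and returns the values a converter would record. It is not the code from this PR: the helper name `yarn_metadata` and the output key names are hypothetical, and the assumption that the scaling type may appear under either `rope_type` or `type` comes from common Hugging Face configs rather than from this conversation.

```python
from typing import Any


def yarn_metadata(hparams: dict[str, Any]) -> dict[str, Any]:
    """Return YaRN settings found in an HF-style config, or {} if absent.

    Hypothetical helper for illustration; the output key names are assumptions.
    """
    # "rope_scaling" may be missing or explicitly null in config.json.
    rope_scaling = hparams.get("rope_scaling") or {}
    # Assumption: configs use either "rope_type" or the older "type" field.
    rope_type = rope_scaling.get("rope_type", rope_scaling.get("type"))
    if rope_type != "yarn" or "factor" not in rope_scaling:
        return {}
    meta: dict[str, Any] = {
        "rope_scaling_type": "yarn",
        "rope_scaling_factor": float(rope_scaling["factor"]),
    }
    # Only record the original context length when the config provides it.
    if "original_max_position_embeddings" in rope_scaling:
        meta["rope_scaling_orig_ctx_len"] = int(
            rope_scaling["original_max_position_embeddings"]
        )
    return meta


# Example config fragment (values are illustrative).
config = {
    "rope_scaling": {
        "rope_type": "yarn",
        "factor": 4.0,
        "original_max_position_embeddings": 32768,
    }
}
print(yarn_metadata(config))
```

A config without a `rope_scaling` block yields an empty dict, so nothing would be written in that case, which matches the "if present" wording of the PR title.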

@CISC CISC requested a review from ngxson May 6, 2025 06:59
@github-actions github-actions bot added the python (python script changes) label May 6, 2025
Co-authored-by: Xuan-Son Nguyen <son@huggingface.co>
@CISC CISC merged commit 764b856 into master May 6, 2025
7 checks passed
@CISC CISC deleted the convert-qwen2-3moe-yarn branch May 6, 2025 09:12
gabe-l-hart added a commit to gabe-l-hart/llama.cpp that referenced this pull request May 6, 2025
* origin/master: (27 commits)
llama : fix build_ffn without gate (ggml-org#13336)
CUDA: fix bad asserts for partial offload (ggml-org#13337)
convert : qwen2/3moe : set yarn metadata if present (ggml-org#13331)
CUDA: fix --split-mode row for MMQ (ggml-org#13323)
gguf-py : avoid requiring pyside6 for other scripts (ggml-org#13036)
CUDA: fix logic for clearing padding with -ngl 0 (ggml-org#13320)
sampling : Integrate Top-nσ into main sampling chain (and add it to the server) (ggml-org#13264)
server : Webui - change setText command from parent window to also send the message. (ggml-org#13309)
mtmd : rename llava directory to mtmd (ggml-org#13311)
clip : fix confused naming ffn_up and ffn_down (ggml-org#13290)
convert : bailingmoe : set yarn metadata if present (ggml-org#13312)
SYCL: Disable mul_mat kernels for noncontiguous tensor b (ggml-org#13308)
mtmd : add C public API (ggml-org#13184)
rpc : use backend registry, support dl backends (ggml-org#13304)
ggml : activate s390x simd for Q3_K (ggml-org#13301)
llava/mtmd : fixes to fully support dl backends (ggml-org#13303)
llama : build windows releases with dl backends (ggml-org#13220)
CUDA: fix race condition in MMQ stream-k fixup (ggml-org#13299)
CUDA: fix race condition in MMQ ids_dst (ggml-org#13294)
vulkan: Additional type support for unary, binary, and copy (ggml-org#13266)
...
timwu pushed a commit to timwu/llama.cpp that referenced this pull request Dec 20, 2025
* set yarn metadata if present

* add comment about enabling YaRN

Co-authored-by: Xuan-Son Nguyen <son@huggingface.co>

---------

Co-authored-by: Xuan-Son Nguyen <son@huggingface.co>

Development

Successfully merging this pull request may close these issues.

Feature Request: Support YaRN RoPE Scaling on Qwen2MoeModel/Qwen3MoeModel models on convert_hf_to_gguf.py