cli: match llama.cpp GPU flag conventions (--ngl, --device) and align flag names #144
pekkah wants to merge 4 commits into
… names
- Rename the GPU-layers short flag -g to -ngl (matches llama.cpp); also accept the --gpu-layers and --ngl long forms.
- Add -dev/--device for single-GPU device selection (index, CUDA<n>/Vulkan<n> name, or 'none'); CUDA is pinned via CUDA_VISIBLE_DEVICES (robust across worker threads), Vulkan via a new VulkanBackend(deviceIndex) physical-device selector.
- Rename --rep-penalty to --repeat-penalty and --draft-model to --model-draft (keeping --draft-model as an alias).
- Spectre.Console.Cli forbids multi-char single-dash options, so add an argv shim in Program.cs that translates llama's -ngl/-md/-st/-sys/-dev spellings to the registered long options; -st/-sys now reach --single-turn/--system-prompt.
- Update user-facing strings, the CLI and root READMEs, CLAUDE.md, the ToolCall sample, and bench-129-ab.ps1 to the new flags.
Backwards compatibility intentionally dropped per request.
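To make the accepted --device forms concrete, here is a minimal sketch of how an index, a CUDA/Vulkan-prefixed name, or 'none' might be parsed. The names DeviceKind, DeviceSpec, and DeviceSpecParser are hypothetical and chosen for illustration only; the PR's actual parsing code is not shown here and may differ (for example in case sensitivity or error handling).

```csharp
using System;
using System.Globalization;

// Hypothetical types for illustration; not taken from the PR's source.
public enum DeviceKind { Auto, None, Index, Cuda, Vulkan }

public readonly record struct DeviceSpec(DeviceKind Kind, int Index);

public static class DeviceSpecParser
{
    // Parses the --device forms described above: a bare index ("0"),
    // a backend-qualified name ("CUDA0", "Vulkan1"), or "none" (CPU).
    public static DeviceSpec Parse(string? value)
    {
        // Assumption: an absent flag means "let the app pick a device".
        if (string.IsNullOrWhiteSpace(value))
            return new DeviceSpec(DeviceKind.Auto, -1);

        string s = value.Trim();

        if (s.Equals("none", StringComparison.OrdinalIgnoreCase))
            return new DeviceSpec(DeviceKind.None, -1);

        if (TryIndex(s, out int bare))
            return new DeviceSpec(DeviceKind.Index, bare);

        if (TryPrefixed(s, "CUDA", out int cuda))
            return new DeviceSpec(DeviceKind.Cuda, cuda);

        if (TryPrefixed(s, "Vulkan", out int vulkan))
            return new DeviceSpec(DeviceKind.Vulkan, vulkan);

        throw new FormatException($"Unrecognized --device value '{value}'.");
    }

    // Matches "<prefix><digits>", ignoring the case of the prefix.
    private static bool TryPrefixed(string s, string prefix, out int index)
    {
        index = -1;
        return s.StartsWith(prefix, StringComparison.OrdinalIgnoreCase)
            && TryIndex(s.Substring(prefix.Length), out index);
    }

    // Digits only: no sign, no whitespace, no thousands separators.
    private static bool TryIndex(string s, out int index) =>
        int.TryParse(s, NumberStyles.None, CultureInfo.InvariantCulture, out index);
}
```

In this sketch a bare index carries no backend, so the caller would decide whether it refers to a CUDA or a Vulkan device; how the PR resolves that is not stated in the commit message.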
Per review: no custom arg parsing. Register the llama.cpp names directly as long options (--ngl, --device, --model-draft, --repeat-penalty) and remove the Program.cs flag-translation shim. Single-dash llama spellings (-ngl, -dev, …) are not accepted; their double-dash equivalents are. Updated docs, the ToolCall sample parser/examples, and bench-129-ab.ps1 accordingly.
Code Review
This pull request updates the CLI flags to match llama.cpp by renaming the GPU layers option from -g to --ngl / --n-gpu-layers and introducing a new --device option to target a specific GPU. It also updates the Vulkan backend to support explicit device selection. The review comments correctly identify two issues: first, in RunFlux, the deviceNone flag is discarded, which prevents --device none from disabling the GPU upscaler; second, the usage help string in the tool-call sample still references the deprecated -g flag.
…ma-cpp-Z9boe # Conflicts: # README.md
- RunFlux now captures deviceNone and keeps the RRDBNet upscaler on CPU when --device none/cpu is requested, instead of auto-selecting a GPU.
- ToolCall sample usage string: -g <layers> -> --ngl <layers>.
Addresses review feedback on PR #144.
Closing in favor of a fresh reimplementation on current |
What & why
Aligns the CLI flag names with llama.cpp/llama-cli. Previously -g was the short flag for --n-gpu-layers, which collides with llama.cpp's -ngl. Backwards compatibility was intentionally dropped per request.

Flag changes (text generation + image commands)
| llama.cpp | Before | After |
| --- | --- | --- |
| -ngl / --n-gpu-layers | -g | --ngl / --n-gpu-layers / --gpu-layers |
| --device | | --device (functional, single-GPU) |
| --repeat-penalty | --rep-penalty | --repeat-penalty |
| --model-draft | --draft-model | --model-draft (+ --draft-model alias) |

Note: Spectre.Console.Cli requires single-dash options to be one character, so llama's single-dash multi-char spellings (-ngl, -dev, …) are exposed as double-dash long options (--ngl, --device). No custom argument parsing was added.

Device selection (--device, new)
Accepts a single device: index (0, 1), name (CUDA0, Vulkan1), or none (CPU). Single-device only (no multi-GPU split).
- CUDA_VISIBLE_DEVICES is set before the first CUDA init, which is robust across the prefetch worker threads (a per-thread cudaSetDevice would not have been).
- VulkanBackend(int deviceIndex) physical-device selector with bounds + compute-queue validation.

Also updated
User-facing console/error strings that referenced -g, plus the CLI README, root README benchmark table, CLAUDE.md, the ToolCall sample (its own parser + docs), and bench-129-ab.ps1.

Testing
dotnet build / dotnet test. Changes are verified by inspection only. Please run a local dotnet build -c Release && dotnet test before merging.

https://claude.ai/code/session_01RvSxRhAddVVMd4DGMkvV4d
Generated by Claude Code
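For readers unfamiliar with the "bounds + compute-queue validation" mentioned for the Vulkan device selector, the following is a rough sketch of what such a check could look like. PhysicalDeviceInfo and VulkanDeviceSelector are hypothetical stand-ins; the real VulkanBackend enumerates devices through the Vulkan API, and its actual types and error messages are not shown in this PR text.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical summary of one enumerated Vulkan physical device.
// HasComputeQueue stands for "at least one queue family supports compute".
public sealed record PhysicalDeviceInfo(string Name, bool HasComputeQueue);

public static class VulkanDeviceSelector
{
    // Returns the device at deviceIndex after validating the index and
    // confirming the device can run compute work.
    public static PhysicalDeviceInfo Select(
        IReadOnlyList<PhysicalDeviceInfo> devices, int deviceIndex)
    {
        if (devices is null)
            throw new ArgumentNullException(nameof(devices));

        // Bounds validation: reject negative or too-large indices.
        if (deviceIndex < 0 || deviceIndex >= devices.Count)
            throw new ArgumentOutOfRangeException(
                nameof(deviceIndex), deviceIndex,
                $"Only {devices.Count} Vulkan device(s) were found.");

        PhysicalDeviceInfo device = devices[deviceIndex];

        // Compute-queue validation: a device without one cannot be used.
        if (!device.HasComputeQueue)
            throw new InvalidOperationException(
                $"Vulkan device {deviceIndex} ('{device.Name}') has no compute-capable queue.");

        return device;
    }
}
```

Failing fast with a specific exception, rather than silently falling back to another device, keeps an explicit --device request from being ignored; whether the PR takes the same approach is an assumption.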