cli: match llama.cpp GPU flag conventions (--ngl, --device) and align flag names by pekkah · Pull Request #144 · pekkah/SharpInference

pekkah · 2026-06-06T06:32:51Z

What & why

Aligns the CLI flag names with llama.cpp / llama-cli. Previously -g was the short flag for --n-gpu-layers, which does not match llama.cpp's -ngl. Backwards compatibility was intentionally dropped, as requested.

Flag changes (text generation + image commands)

| Concept | llama.cpp | Before | Now |
|---|---|---|---|
| GPU layers | `-ngl` / `--n-gpu-layers` | `-g` | `--ngl` / `--n-gpu-layers` / `--gpu-layers` |
| Device select | `--device` | (none) | `--device` (functional, single-GPU) |
| Repetition penalty | `--repeat-penalty` | `--rep-penalty` | `--repeat-penalty` |
| Draft model | `--model-draft` | `--draft-model` | `--model-draft` (+ `--draft-model` alias) |

Note: Spectre.Console.Cli requires single-dash options to be one character, so llama's single-dash multi-char spellings (-ngl, -dev, …) are exposed as double-dash long options (--ngl, --device). No custom argument parsing was added.
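
To illustrate the "several long spellings, one option" approach described in the note, here is a minimal sketch using Python's standard `argparse` rather than Spectre.Console.Cli. The program name, destination names, and defaults are assumptions made for illustration only; they are not taken from the SharpInference source.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # "example-cli" is a hypothetical program name used only for this sketch.
    parser = argparse.ArgumentParser(prog="example-cli", allow_abbrev=False)

    # Three long spellings registered on one option; all write to gpu_layers.
    parser.add_argument("--ngl", "--n-gpu-layers", "--gpu-layers",
                        dest="gpu_layers", type=int, default=0)

    # Single-device selector; the raw string is validated elsewhere.
    parser.add_argument("--device", dest="device", default=None)

    # Renamed from --rep-penalty; the old spelling is not registered.
    parser.add_argument("--repeat-penalty", dest="repeat_penalty",
                        type=float, default=1.0)

    # New primary name plus the previous name kept as an alias.
    parser.add_argument("--model-draft", "--draft-model",
                        dest="model_draft", default=None)
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["--ngl", "33", "--device", "CUDA0"])
    print(args.gpu_layers, args.device)
```

Because every spelling is an ordinary registered long option, no argv rewriting step is needed before parsing.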

Device selection (`--device`, new)

Accepts a single device: index (0, 1), name (CUDA0, Vulkan1), or none (CPU). Single-device only (no multi-GPU split).
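
The accepted forms can be sketched as a small parser. This is an illustrative Python sketch, not the PR's GpuDevice.cs: the `DeviceSpec` type, the `parse_device` name, and the case-insensitive matching are assumptions. Treating `cpu` as a synonym for `none` follows the later follow-up commit message on this PR.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DeviceSpec:
    """Hypothetical result type for a parsed --device value."""
    backend: Optional[str]  # "cuda", "vulkan", or None when not named
    index: Optional[int]    # None when the GPU is disabled
    use_gpu: bool


_INDEX = re.compile(r"[0-9]+")
_NAMED = re.compile(r"(cuda|vulkan)([0-9]+)", re.IGNORECASE)


def parse_device(value: str) -> DeviceSpec:
    text = value.strip()

    # "none" (or "cpu") disables GPU use entirely.
    if text.lower() in ("none", "cpu"):
        return DeviceSpec(backend=None, index=None, use_gpu=False)

    # A bare index such as "0" or "1" leaves the backend unspecified.
    if _INDEX.fullmatch(text):
        return DeviceSpec(backend=None, index=int(text), use_gpu=True)

    # A backend-qualified name such as "CUDA0" or "Vulkan1".
    named = _NAMED.fullmatch(text)
    if named:
        return DeviceSpec(backend=named.group(1).lower(),
                          index=int(named.group(2)), use_gpu=True)

    # Anything else, including comma-separated lists, is rejected:
    # only a single device is supported.
    raise ValueError(f"unsupported --device value: {value!r}")
```

Rejecting lists such as `CUDA0,CUDA1` outright keeps the single-device restriction explicit instead of silently using the first entry.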

- CUDA: pinned via CUDA_VISIBLE_DEVICES, set before the first CUDA init. This is robust across the prefetch worker threads (a per-thread cudaSetDevice would not have been).
- Vulkan: new VulkanBackend(int deviceIndex) physical-device selector with bounds + compute-queue validation.
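
Both selection strategies can be sketched in a few lines. The Python below is a hedged illustration only: `pin_cuda_device`, `PhysicalDevice`, and `select_vulkan_device` are hypothetical names, and the device list is plain data standing in for what a real Vulkan enumeration would return. The one external fact relied on is that the CUDA runtime reads CUDA_VISIBLE_DEVICES when it initializes, so the variable must be set before that point.

```python
import os
from dataclasses import dataclass
from typing import Sequence


def pin_cuda_device(index: int, cuda_initialized: bool) -> None:
    """Restrict the whole process to one CUDA device via the environment.

    The variable is process-wide, so every worker thread sees the same
    single device; no per-thread device call is needed.
    """
    if cuda_initialized:
        # Setting the variable after CUDA has initialized has no effect.
        raise RuntimeError("CUDA_VISIBLE_DEVICES must be set before CUDA init")
    os.environ["CUDA_VISIBLE_DEVICES"] = str(index)


@dataclass(frozen=True)
class PhysicalDevice:
    """Hypothetical stand-in for an enumerated Vulkan physical device."""
    name: str
    has_compute_queue: bool


def select_vulkan_device(devices: Sequence[PhysicalDevice],
                         device_index: int) -> PhysicalDevice:
    # Bounds check: the index must refer to an enumerated device.
    if not 0 <= device_index < len(devices):
        raise IndexError(
            f"device index {device_index} out of range (found {len(devices)})")
    device = devices[device_index]
    # Compute-queue check: inference needs a compute-capable queue family.
    if not device.has_compute_queue:
        raise ValueError(f"device {device.name!r} has no compute queue")
    return device
```

Once the process is pinned this way, the chosen GPU appears to CUDA as device 0, which is why the approach does not depend on which thread issues the first CUDA call.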

Also updated

User-facing console/error strings that referenced -g, plus the CLI README, root README benchmark table, CLAUDE.md, the ToolCall sample (its own parser + docs), and bench-129-ab.ps1.

Testing

⚠️ Not compiled. The session environment's network policy blocks the .NET SDK CDN (all Microsoft download hosts return HTTP 403) and only .NET 8 is available via apt, so I could not run dotnet build / dotnet test. Changes are verified by inspection only. Please run a local dotnet build -c Release && dotnet test before merging.

https://claude.ai/code/session_01RvSxRhAddVVMd4DGMkvV4d

Generated by Claude Code

… names

- Rename the GPU-layers short flag -g to -ngl (matches llama.cpp); also accept --gpu-layers and --ngl long forms. Add -dev/--device for single-GPU device selection (index, CUDAn/Vulkann name, or 'none'); CUDA is pinned via CUDA_VISIBLE_DEVICES (robust across worker threads), Vulkan via a new VulkanBackend(deviceIndex) physical-device selector.
- Rename --rep-penalty to --repeat-penalty and --draft-model to --model-draft (keeping --draft-model as an alias).
- Spectre.Console.Cli forbids multi-char single-dash options, so add an argv shim in Program.cs translating llama's -ngl/-md/-st/-sys/-dev spellings to the registered long options; -st/-sys now reach --single-turn/--system-prompt.
- Update user-facing strings, CLI/root README, CLAUDE.md, the ToolCall sample, and bench-129-ab.ps1 to the new flags.

Backwards compatibility intentionally dropped per request.

Per review: no custom arg parsing. Register the llama.cpp names directly as long options (--ngl, --device, --model-draft, --repeat-penalty) and remove the Program.cs flag-translation shim. Single-dash llama spellings (-ngl, -dev, …) are not accepted; their double-dash equivalents are. Updated docs, the ToolCall sample parser/examples, and bench-129-ab.ps1 accordingly.

gemini-code-assist

Code Review

This pull request updates the CLI flags to match llama.cpp by renaming the GPU layers option from -g to --ngl / --n-gpu-layers and introducing a new --device option to target a specific GPU. It also updates the Vulkan backend to support explicit device selection. The review comments correctly identify two issues: first, in RunFlux, the deviceNone flag is discarded, which prevents --device none from disabling the GPU upscaler; second, the usage help string in the tool-call sample still references the deprecated -g flag.

…ma-cpp-Z9boe # Conflicts: # README.md

- RunFlux now captures deviceNone and keeps the RRDBNet upscaler on CPU when --device none/cpu is requested, instead of auto-selecting a GPU.
- ToolCall sample usage string: -g <layers> -> --ngl <layers>.

Addresses review feedback on PR #144.
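
The upscaler fix amounts to checking the "GPU disabled" flag before any automatic GPU selection. A minimal Python sketch of that decision follows; the function name and the string return values are hypothetical, not the actual RunFlux code.

```python
def choose_upscaler_backend(device_none: bool, gpu_available: bool) -> str:
    """Pick where the upscaler runs, honoring an explicit CPU request."""
    # An explicit "--device none" (or "cpu") always wins over auto-selection.
    if device_none:
        return "cpu"
    # Otherwise fall back to automatic selection.
    return "gpu" if gpu_available else "cpu"
```

Before the fix, the equivalent of `device_none` was discarded, so the second branch always ran and a GPU was picked whenever one was present.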

pekkah · 2026-06-14T08:55:58Z

Closing in favor of a fresh reimplementation on current master. This branch is 8 days behind and conflicts (DIRTY) across README.md, RunCommand.cs (now collides with the #233 -f/--file work), CudaHybridForwardPass.cs / VulkanBackend.cs (rewritten by the #215/#235/#238 perf arc), and CLAUDE.md — a rebase would be an error-prone 3-way merge through hot files. The valuable core (GpuDevice.cs --device parser + the VulkanBackend device-index selection) carries over verbatim; the reimplementation will KEEP -g and ADD the llama.cpp aliases (--ngl/--device/--model-draft) so it's non-breaking, avoiding the -g→--ngl example churn this PR had.

claude added 2 commits June 5, 2026 07:14

gemini-code-assist Bot reviewed Jun 6, 2026

Comment thread src/SharpInference.Cli/ImageCommand.cs

Comment thread samples/SharpInference.Sample.ToolCall/Program.cs

claude added 2 commits June 6, 2026 07:05

Merge remote-tracking branch 'origin/master' into claude/cli-args-lla…

b3f4409

…ma-cpp-Z9boe # Conflicts: # README.md

pekkah closed this Jun 14, 2026

pekkah deleted the claude/cli-args-llama-cpp-Z9boe branch June 14, 2026 08:56

pekkah mentioned this pull request Jun 14, 2026

feat(cli): llama.cpp-compatible --ngl/--device GPU selection flags #244

Merged

pekkah wants to merge 4 commits into master from claude/cli-args-llama-cpp-Z9boe