@ada-ggf25 commented Nov 29, 2025

Fix: Respect inference_mode when setting adapters with modules_to_save

Fixes #2928

Description

This PR fixes issue #2928, where modules_to_save parameters kept requires_grad=True even when inference_mode=True was passed to set_adapter(). This caused failures with quantized models (e.g., bitsandbytes) in inference mode, because quantized layers require their parameters to have requires_grad=False.

Problem

When calling model.set_adapter(adapter_name, inference_mode=True) on a model with modules_to_save configured (e.g., a classification head), the modules_to_save parameters still had requires_grad=True despite inference mode being requested. This happened because:

  1. _set_adapter() was calling module.enable_adapters(True) unconditionally
  2. ModulesToSaveWrapper.enable_adapters(True) sets requires_grad_(True) for all active adapters
  3. This happened regardless of the inference_mode value passed to set_adapter(), so the gradient state conflicted with the requested inference mode (see the sketch below)
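
A minimal sketch of this pre-fix flow (heavily simplified; not the actual PEFT source, and the wrapper check is an assumption):

def _set_adapter(model, adapter_name, inference_mode=False):
    for module in model.modules():
        if isinstance(module, ModulesToSaveWrapper):
            # Pre-fix: this call ran unconditionally, ignoring inference_mode, so the
            # modules_to_save parameters ended up with requires_grad=True regardless.
            module.enable_adapters(True)
            module.set_adapter(adapter_name)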

Solution

The fix ensures that enable_adapters() is only called when not in inference mode:

  1. Modified _set_adapter() in src/peft/utils/other.py:

    • Added a conditional check: enable_adapters(True) is only called when inference_mode=False (see the sketch after this list)
    • This prevents requires_grad from being flipped back to True when inference mode should keep it False
  2. Updated PeftModel.set_adapter() in src/peft/peft_model.py:

    • Added inference_mode parameter to match the API of PeftMixedModel.set_adapter()
    • Passes inference_mode to both _set_adapter() and base_model.set_adapter()
  3. Added comprehensive tests in tests/test_other.py:

    • test_modules_to_save_inference_mode_requires_grad_false: Verifies requires_grad=False in inference mode
    • test_modules_to_save_training_mode_requires_grad_true: Verifies requires_grad=True in training mode
    • test_modules_to_save_inference_mode_with_torch_inference_mode: Verifies compatibility with torch.inference_mode()
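
A hedged sketch of the fixed conditional from item 1, condensed from this description (not a verbatim copy of the code in src/peft/utils/other.py):

for module in model.modules():
    if isinstance(module, ModulesToSaveWrapper):
        if adapter_name in module._adapters:
            if not inference_mode:
                # Only re-enable gradients when not in inference mode.
                module.enable_adapters(True)
            module.set_adapter(adapter_name)
        elif not inference_mode:
            module.enable_adapters(False)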

Changes Made

Code Changes

src/peft/utils/other.py:

  • Modified _set_adapter() to conditionally call enable_adapters() based on inference_mode parameter

src/peft/peft_model.py:

  • Updated the set_adapter() method signature to accept an inference_mode parameter (sketched below)
  • Passes inference_mode through to the underlying adapter-setting functions
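
A rough sketch of the updated signature (simplified; the real method performs additional validation and bookkeeping):

def set_adapter(self, adapter_name: str, inference_mode: bool = False) -> None:
    # Forward inference_mode so tuner layers and modules_to_save wrappers agree
    # on whether parameters should remain trainable.
    self.base_model.set_adapter(adapter_name, inference_mode=inference_mode)
    _set_adapter(self, adapter_name, inference_mode=inference_mode)
    self.active_adapter = adapter_name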

tests/test_other.py:

  • Added TestModulesToSaveInferenceMode test class with 3 comprehensive tests

Testing

Test Results

All new tests pass:

  • test_modules_to_save_inference_mode_requires_grad_false - PASSED
  • test_modules_to_save_training_mode_requires_grad_true - PASSED
  • test_modules_to_save_inference_mode_with_torch_inference_mode - PASSED

  • All existing modules_to_save tests pass (11/11)
  • Related tests pass (71/74 - the 3 failures are unrelated BOFT dependency issues)
  • Code quality checks pass (make quality)

Test Coverage

The new tests verify the following (see the sketch after this list):

  • modules_to_save parameters correctly have requires_grad=False when inference_mode=True
  • modules_to_save parameters correctly have requires_grad=True when inference_mode=False (training mode)
  • Compatibility with torch.inference_mode() context manager
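
A condensed illustration of the core check (the adapter name and parameter filter are assumptions, not the exact test code):

model.set_adapter("default", inference_mode=True)
for name, param in model.named_parameters():
    # The saved copies appear under "modules_to_save" in parameter names and
    # must stay frozen in inference mode.
    if "modules_to_save" in name:
        assert not param.requires_grad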

Example Usage

Before this fix, the following code would fail with quantized models:

import torch
from peft import PeftModel

model = PeftModel.from_pretrained(base_model, adapter_path)
model = convert_to_int8(model)  # quantization step (illustrative helper)
model.eval()

with torch.inference_mode():
    model.set_adapter("my_adapter", inference_mode=True)  # modules_to_save still had requires_grad=True
    _ = model(batch)

After this fix:

model = PeftModel.from_pretrained(base_model, adapter_path)
model = convert_to_int8(model)  # Quantization
model.eval()

with torch.inference_mode():
    model.set_adapter("my_adapter", inference_mode=True)  #  modules_to_save correctly have requires_grad=False
    _ = model(batch)

Add optional inference_mode parameter to PeftModel.set_adapter() method
to allow setting adapters in frozen state (requires_grad=False) directly
without manual parameter manipulation.

Changes:
- Add inference_mode parameter with default value False to maintain
  backwards compatibility
- Update method docstring to document the new parameter and clarify
  that adapters are set to trainable unless inference_mode is True
- Remove manual example code snippet showing how to set requires_grad=False
- Pass inference_mode parameter to base_model.set_adapter() and
  _set_adapter() helper function calls

This enhancement simplifies the workflow for users who want to set
adapters in inference mode, addressing the need to manually manipulate
requires_grad flags after setting an adapter.

…n _set_adapter

Fix bug where enable_adapters() was called unconditionally in _set_adapter()
function, causing adapters to be incorrectly enabled even when inference_mode=True.

Changes:
- Conditionally call enable_adapters(True) only when inference_mode is False
  when adapter is found in module._adapters
- Conditionally call enable_adapters(False) only when inference_mode is False
  when adapter is not found in module._adapters
- Ensure that when inference_mode=True, adapters remain in their current
  enabled/disabled state and are not forcibly toggled

This fix ensures that the inference_mode parameter is properly respected
throughout the adapter setting process, preventing unintended adapter
activation during inference operations.

Add comprehensive test suite to validate that modules_to_save correctly
respect the inference_mode parameter when set_adapter is called.

This test class addresses issue huggingface#2928 where modules_to_save had
requires_grad=True even when inference_mode=True was passed to set_adapter.

Test coverage:
- test_modules_to_save_inference_mode_requires_grad_false: Verifies that
  modules_to_save parameters have requires_grad=False when inference_mode=True
  is passed to set_adapter, ensuring parameters are frozen during inference
- test_modules_to_save_training_mode_requires_grad_true: Verifies that
  modules_to_save parameters have requires_grad=True when inference_mode=False
  is passed to set_adapter, ensuring parameters are trainable during training
- test_modules_to_save_inference_mode_with_torch_inference_mode: Validates
  that modules_to_save work correctly when used with torch.inference_mode()
  context manager and that forward passes still function correctly

All tests use AutoModelForSequenceClassification with LoRA configuration
targeting query and value modules, with classifier as modules_to_save to
provide realistic test scenarios.
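
A hedged sketch of that setup (the model checkpoint and exact keyword arguments are assumptions, not copied from the test file):

from transformers import AutoModelForSequenceClassification
from peft import LoraConfig, get_peft_model

base = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
config = LoraConfig(target_modules=["query", "value"], modules_to_save=["classifier"])
model = get_peft_model(base, config)

model.set_adapter("default", inference_mode=True)
assert all(not p.requires_grad for n, p in model.named_parameters() if "modules_to_save" in n)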

Reformat the docstring comment in TestModulesToSaveInferenceMode class
to fit within line length limits by combining two lines into a single line.

This is a minor formatting change to improve code readability and
compliance with project style guidelines.

… in ModulesToSaveWrapper

Remove the lines that set requires_grad on original_module in
ModulesToSaveWrapper.enable_adapters() method. This change addresses
the maintainer's feedback that there is no reason to touch the
requires_grad of the original_module here, and it conflicts with
bitsandbytes quantization which requires gradients to be False at all
times.

The original_module's requires_grad is no longer manipulated by
enable_adapters(), only modules_to_save gradients are managed.

Updated test_requires_grad_modules_to_save_disabling to reflect this
change by removing expectations about original_module having gradients
when adapters are disabled.

Related to issue huggingface#2928 and PR huggingface#2931.
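
An illustrative consequence of this change, for a ModulesToSaveWrapper instance named wrapper (a sketch based on the description above, not a test from the PR):

wrapper.enable_adapters(False)
# The modules_to_save copy for the adapter is frozen...
assert not any(p.requires_grad for p in wrapper.modules_to_save["default"].parameters())
# ...while original_module is no longer touched by enable_adapters(), so a
# bitsandbytes-quantized original keeps requires_grad=False.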