Conversation

BenjaminBossan (Member) commented on Dec 5, 2025

This adds the possibility to convert a non-LoRA adapter into a LoRA adapter. Not all non-LoRA adapters can be converted, but many can.

Conversion is not exact, so there will be some loss of performance. The higher the rank, the smaller the loss, but also the less efficient the adapter. Also, for now, only linear layers are supported. Still, this has some advantages:

  • In PEFT, LoRA supports more features than most other methods, e.g. mixed adapter batches. Thus the converted adapter can be used with those features.
  • Some downstream packages support LoRA adapters, but not other PEFT methods, e.g. Diffusers. The conversion makes it possible to use a non-LoRA adapter with those packages.

Users can pass a fixed rank for the LoRA adapter, or a float, in which case the rank is chosen dynamically based on a threshold on the contribution of the singular values. I ran an experiment converting a LoKr model trained on MetaMath to LoRA. The LoKr model's test accuracy was 37.5%. These are the test accuracies of the converted LoRAs:

| rank / threshold | num trainable params (M) | test accuracy (%) |
|---|---|---|
| 32 | 9 | 6.9 |
| 64 | 18 | 18.3 |
| 128 | 37 | 20.5 |
| 256 | 73 | 35.0 |
| 0.3 | 20 | 15.9 |
| 0.4 | 32 | 22.2 |
| 0.5 | 47 | 27.7 |
| 0.6 | 67 | 32.5 |
| 0.7 | 95 | 35.9 |
| 0.8 | 135 | 36.7 |
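
As a rough illustration of how the dynamic rank selection could work, here is a sketch. This is not the code from this PR; the helper name `lowrank_factors_from_delta` and its signature are made up for the example. It only shows the core idea: factor the additive delta weight dW via a truncated SVD, treating a float argument as a threshold on the cumulative contribution of the singular values.

```python
import torch

def lowrank_factors_from_delta(delta_weight, rank_or_threshold):
    # Illustrative sketch only (not the PEFT implementation): factor the
    # additive delta weight dW into LoRA factors so that lora_B @ lora_A ~= dW.
    U, S, Vh = torch.linalg.svd(delta_weight.float(), full_matrices=False)
    if isinstance(rank_or_threshold, float) and rank_or_threshold < 1.0:
        # dynamic rank: keep enough singular values to cover the requested
        # fraction of the total singular value mass
        cumulative = torch.cumsum(S, dim=0) / S.sum()
        rank = int(torch.searchsorted(cumulative, rank_or_threshold)) + 1
    else:
        # fixed rank passed as an int
        rank = int(rank_or_threshold)
    # split the singular values evenly between the two factors
    sqrt_S = S[:rank].sqrt()
    lora_B = U[:, :rank] * sqrt_S             # (out_features, rank)
    lora_A = sqrt_S.unsqueeze(1) * Vh[:rank]  # (rank, in_features)
    return lora_A, lora_B
```

With this framing, the trade-off in the table is expected: a higher rank or threshold keeps more of dW, so accuracy approaches that of the original LoKr model at the cost of more trainable parameters.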

For the future

  • Convert PEFT methods that are non-trivial, like IA³; in most cases, that shouldn't be too hard
  • Convert layers other than linear layers

For PEFT methods whose update does not follow W' = W_0 + dW, we have to find workarounds. Say a method uses W' = W_0 * dW instead. We can reformulate this as W' = W_0 * dW = W_0 + dW2. Since we're actually interested in dW2, we can isolate it as dW2 = W_0 * dW - W_0. I would suggest adding a new method on the tuner layer for this:

def get_additive_delta_weight(self, adapter_name):
    # Express a multiplicative update W' = W_0 * dW as an additive one,
    # W' = W_0 + dW2, by isolating dW2 = W_0 * dW - W_0.
    delta_weight = self.get_delta_weight(adapter_name)
    base_weight = self.get_base_layer().weight
    return base_weight * delta_weight - base_weight

Then, in the LoRA conversion function:

if hasattr(module, "get_additive_delta_weigh"):
    delta_weight = module.get_additive_delta_weigh(adapter_name)
else:
    delta_weight = module.get_delta_weight(adapter_name)

This way, we can continue using get_delta_weight where applicable while leaving the door open for PEFT methods to implement a custom method specifically for LoRA conversion. LMK if you have a better idea for how to implement that.
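
Putting the two snippets together, the per-module flow could look roughly like this. Again, this is only a sketch under the assumptions above: `lowrank_factors_from_delta` is the illustrative helper from the earlier sketch, `sketch_convert_module_deltas` is a made-up name, and the loop and bookkeeping are heavily simplified compared to what a real conversion function would need.

```python
from peft.tuners.tuners_utils import BaseTunerLayer

def sketch_convert_module_deltas(model, adapter_name, rank_or_threshold):
    # Rough sketch of the per-module conversion flow, not the actual code:
    # collect an additive delta weight per tuner layer and factor it into
    # LoRA matrices (using the lowrank_factors_from_delta sketch above).
    factors = {}
    for name, module in model.named_modules():
        if not isinstance(module, BaseTunerLayer):
            continue
        if hasattr(module, "get_additive_delta_weight"):
            # method-specific additive reformulation (e.g. multiplicative updates)
            delta_weight = module.get_additive_delta_weight(adapter_name)
        else:
            # the update is already additive
            delta_weight = module.get_delta_weight(adapter_name)
        factors[name] = lowrank_factors_from_delta(delta_weight, rank_or_threshold)
    return factors
```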

Of course, there can be PEFT methods that cannot be expressed at all in terms of dW, which will be impossible to convert (most prominently the prompt learning methods).

TODO

  • correctly take care of modules_to_save
  • deal with the bias argument
  • deal with the case where there is no PEFT config
  • add support for more PEFT methods (right now, only LoKr is supported)
  • documentation

Unrelated changes

  • I noticed that the VB-LoRA layer had no __repr__, so I added one.

@HuggingFaceDocBuilderDev commented:

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

BenjaminBossan changed the title from "[WIP] FEAT Add function to convert to LoRA" to "[WIP] FEAT Add function to convert non-LoRA PEFT adapters to LoRA" on Dec 8, 2025