Granite speech - minor fixes to support training with the HF trainer #38833
avoid unused parameters that DDP does not like
trainers often pass this argument automatically
this ensures save_pretrained will not crash when saving the processor during training https://github.com/huggingface/transformers/blob/d5d007a1a0f0c11a726a54c8f00bd71825f84d02/src/transformers/feature_extraction_utils.py#L595
        **kwargs,
    ):
        super().__init__(**kwargs)
        self.sampling_rate = sampling_rate
I think this isn't used currently
Right, it's not used. I added it to stay consistent with other audio feature extractors that have this property.
avihu111 left a comment:
Added some comments on each change, giving relevant context
    audio_inputs = {}

    text_inputs = self.tokenizer(prompt_strings, padding=True, **kwargs)
    if "padding" not in kwargs:
avoids a crash when trainers pass padding=True to the processor
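A minimal sketch of why the hard-coded padding=True can crash and how the guard avoids it. The names call_tokenizer_old, call_tokenizer_new and fake_tokenizer are hypothetical stand-ins, not the processor's real code, and the assignment inside the guard is an assumption since the snippet above only shows the `if` line.

```python
def fake_tokenizer(texts, padding=False, **kwargs):
    # Hypothetical stand-in for a real tokenizer: it only echoes its arguments.
    return {"texts": list(texts), "padding": padding}


def call_tokenizer_old(tokenizer, prompt_strings, **kwargs):
    # Old pattern: padding is hard-coded, so a caller that also passes
    # padding=... triggers "got multiple values for keyword argument".
    return tokenizer(prompt_strings, padding=True, **kwargs)


def call_tokenizer_new(tokenizer, prompt_strings, **kwargs):
    # New pattern: only fill in the default when the caller did not set it.
    # The assignment below is assumed; the diff shows only the condition.
    if "padding" not in kwargs:
        kwargs["padding"] = True
    return tokenizer(prompt_strings, **kwargs)
```

With this guard, a caller that passes its own padding value keeps that value, and a caller that passes nothing still gets padded output.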
    query_output = self.qformer(
        query_embeds=self.query.data,
        query_embeds=self.query,
Bugfix. When using .data this trainable parameter did not receive gradients.
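A small PyTorch sketch of the behaviour described in this comment. TinyQueryModule and query_grad_after_backward are hypothetical names for illustration, not the model's actual projector, and the tensor shapes are arbitrary.

```python
import torch
from torch import nn


class TinyQueryModule(nn.Module):
    """Hypothetical stand-in for a module that owns a learned query tensor."""

    def __init__(self, use_data: bool):
        super().__init__()
        self.query = nn.Parameter(torch.ones(1, 3, 4))
        self.proj = nn.Linear(4, 4)
        self.use_data = use_data

    def forward(self) -> torch.Tensor:
        # .data shares storage with the parameter but is cut off from autograd,
        # so nothing computed from it can send gradients back to self.query.
        query = self.query.data if self.use_data else self.query
        return self.proj(query).sum()


def query_grad_after_backward(use_data: bool):
    # Run one forward/backward pass and report the gradient on the query.
    module = TinyQueryModule(use_data)
    module().backward()
    return module.query.grad
```

In this sketch, query_grad_after_backward(True) leaves the query without a gradient, while query_grad_after_backward(False) populates it.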
    # Currently lazily initialized
    self.melspec = None
    requires_backends(self, ["torchaudio"])
    self.mel_filters = torchaudio.transforms.MelSpectrogram(**self.melspec_kwargs)
removed the lazy init, and renamed it to mel_filters. This specific name avoids a crash when serializing the processor.
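A toy illustration of why the attribute name could matter, assuming the base serializer drops an attribute named mel_filters before writing JSON (which is what the link in the commit message appears to point at). ToyFeatureExtractor, FakeMelTransform, SKIPPED_KEYS and the sampling rate value are all hypothetical and simplified; this is not the transformers implementation.

```python
import copy
import json


class FakeMelTransform:
    """Hypothetical stand-in for a transform object that JSON cannot encode."""


class ToyFeatureExtractor:
    # Assumed for this sketch: attribute names the serializer skips.
    SKIPPED_KEYS = ("mel_filters", "window")

    def __init__(self, attr_name: str):
        self.sampling_rate = 16000  # illustrative value only
        setattr(self, attr_name, FakeMelTransform())

    def to_dict(self) -> dict:
        # Shallow-copy the attributes, then drop the ones known to be
        # non-serializable.
        output = copy.copy(self.__dict__)
        for key in self.SKIPPED_KEYS:
            output.pop(key, None)
        return output

    def to_json_string(self) -> str:
        return json.dumps(self.to_dict(), sort_keys=True)
```

Under that assumption, ToyFeatureExtractor("mel_filters").to_json_string() succeeds, while ToyFeatureExtractor("melspec").to_json_string() raises a TypeError because the transform object reaches json.dumps.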
ArthurZucker left a comment:
lgtm sorry for the delay!
Awesome! Thanks @ArthurZucker!
What does this PR do?
Minor updates to granite_speech to enable finetuning it with HF trainers:
- Avoid a crash when trainers pass padding=True to the processor
- Remove .data from a forward call, so the trainable query parameter receives gradients
- Rename melspec to mel_filters to leverage the handling in feature_extraction_utils.py (linked above), which avoids a crash on save_pretrained
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
CC: @ArthurZucker @eustlb @alex-jw-brooks @avishaiElmakies @gsaon