Skip to content

Bug #40833: Fix for kv_offset calculation for mixed padding#40877

Open
preethamyerramsetty wants to merge 2 commits into
huggingface:mainfrom
preethamyerramsetty:fix-left-padding-attention
Open

Bug #40833: Fix for kv_offset calculation for mixed padding#40877
preethamyerramsetty wants to merge 2 commits into
huggingface:mainfrom
preethamyerramsetty:fix-left-padding-attention

Conversation

@preethamyerramsetty

Copy link
Copy Markdown

What does this PR do?

This fixes the kv_offset calculation in cache_utils.py in order to handle left and mixed padding correctly. Previously in case of mixed left and right padding the model could attend to padded tokens which results in incorrect response.

This PR ensures that correct offset is used for left padding, whereas right and mixed paddingfallback to offset 0 and "test_cache_utils.py" is also added to test all padding scenarios.

Fixes #40833

Before submitting

Who can review?

Anyone in the community is free to review.

Suggested reviewer: @giulio98

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@preethamyerramsetty

Copy link
Copy Markdown
Author

Hi @giulio98, this PR fixes the kv padding issue with Bug #40833 and I have also added the tests for left, right and mixed padding. Can you please review it and get back to me sir. Thank you.

@ArthurZucker ArthurZucker left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hello, there is probably an issue because the only diff is string, not code 😅

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Generate attends to padded tokens when using left-padded (KV) batches

2 participants