Fix dk/dv autograd error on TPU flash attention #8685

zmelumian972 · 2025-02-06T14:05:39Z

Fixes dk/dv gradients not being extracted from flash attention when autograd is activated on dv.

Without this fix, the flash attention backward function on TPU returns the key and value gradients only if the key gradient is requested by torch.autograd.
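To make the failure mode concrete, here is a minimal, framework-free sketch of the gating logic described above. It is not the torch_xla code: the function names `gate_buggy` and `gate_fixed` are hypothetical, the string placeholders stand in for gradient tensors, and the `(q, k, v)` flag tuple is an assumption modeled on the per-input flags that PyTorch exposes to a custom backward as `ctx.needs_input_grad`.

```python
# Simplified model of gating gradients in a custom backward function.
# needs_input_grad mimics the per-input (q, k, v) flags from autograd;
# all names here are hypothetical and for illustration only.


def gate_buggy(needs_input_grad, grad_q, grad_k, grad_v):
    """Assumed shape of the bug: dv is gated on the key flag."""
    need_q, need_k, _need_v = needs_input_grad
    return (
        grad_q if need_q else None,
        grad_k if need_k else None,
        grad_v if need_k else None,  # wrong flag: should be the value flag
    )


def gate_fixed(needs_input_grad, grad_q, grad_k, grad_v):
    """Each gradient is gated on its own input's flag."""
    need_q, need_k, need_v = needs_input_grad
    return (
        grad_q if need_q else None,
        grad_k if need_k else None,
        grad_v if need_v else None,
    )


# Case from the description: only the values require a gradient.
flags = (False, False, True)
buggy = gate_buggy(flags, "dq", "dk", "dv")
fixed = gate_fixed(flags, "dq", "dk", "dv")
```

In this model, `buggy` drops dv whenever the key gradient is not requested, while `fixed` returns dv whenever the value gradient is requested, regardless of the key flag.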

lsy323

Thank you for the fix!

Fix dk/dv grads not extracting on flash attention if autograd activated on dv

c695bd7

lsy323 approved these changes Feb 6, 2025

qihqi approved these changes Feb 6, 2025

qihqi merged commit 0cd1fc2 into pytorch:master Feb 7, 2025
12 checks passed
