Unused cls_token in PatchEmbeddingBlock #3454

Description

@night-gale

Describe the bug
While training the ViT with torch DistributedDataParallel, torch raises an error during the backward pass and reports:

Parameters which did not receive grad for rank 0: vit.patch_embedding.cls_token

which means that cls_token did not receive a gradient during the backward pass.

I checked the implementations of ViT and PatchEmbeddingBlock and found that cls_token is defined but never used in PatchEmbeddingBlock (monai/networks/blocks/patchembedding.py).
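For illustration, here is a minimal sketch of how a parameter that is registered but never read in forward() ends up without a gradient. It uses plain PyTorch on CPU, without DistributedDataParallel or MONAI. ToyPatchEmbedding and params_without_grad are hypothetical names made up for this example and are not MONAI's actual code; the assumption is only that the real block has the same shape of problem.

```python
import torch
from torch import nn


class ToyPatchEmbedding(nn.Module):
    """Hypothetical stand-in for illustration; not MONAI's PatchEmbeddingBlock."""

    def __init__(self, in_dim: int = 8, hidden: int = 4) -> None:
        super().__init__()
        self.proj = nn.Linear(in_dim, hidden)
        # Registered as a trainable parameter but never read in forward().
        self.cls_token = nn.Parameter(torch.zeros(1, 1, hidden))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Only the projection contributes to the output.
        return self.proj(x)


def params_without_grad(model: nn.Module) -> list:
    """Return names of trainable parameters whose .grad is still None."""
    return [
        name
        for name, param in model.named_parameters()
        if param.requires_grad and param.grad is None
    ]


model = ToyPatchEmbedding()
# One forward/backward pass on random input.
model(torch.randn(2, 3, 8)).sum().backward()

unused = params_without_grad(model)
print(unused)
```

Running the same kind of check on a real model after a single backward pass is one way to list the parameters that DistributedDataParallel would report as not having received a gradient.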

To Reproduce
Steps to reproduce the behavior:

  1. Set the environment variable TORCH_DISTRIBUTED_DEBUG=INFO in the shell
  2. Train ViT with torch DistributedDataParallel

Labels: enhancement
