Unused cls_token in PatchEmbeddingBlock

**Describe the bug**
When training `ViT` with `torch` `DistributedDataParallel`, `torch` raises an error during the backward pass and reports:
```
Parameters which did not receive grad for rank 0: vit.patch_embedding.cls_token
```
This means that `cls_token` did not take part in the backward pass, so it never received a gradient.
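To illustrate what "did not receive grad" means, here is a minimal sketch that runs without any distributed setup. It is not MONAI's code: `ToyPatchEmbedding` and `params_without_grad` are hypothetical names made up for this example, and the toy module only assumes the same pattern (a parameter that is registered but never used in `forward`).

```python
from typing import List

import torch
import torch.nn as nn


class ToyPatchEmbedding(nn.Module):
    """Hypothetical stand-in for a patch embedding block (not MONAI's implementation)."""

    def __init__(self, in_dim: int = 16, hidden: int = 8) -> None:
        super().__init__()
        self.proj = nn.Linear(in_dim, hidden)
        # Registered as a trainable parameter, but never touched in forward().
        self.cls_token = nn.Parameter(torch.zeros(1, 1, hidden))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.proj(x)


def params_without_grad(model: nn.Module) -> List[str]:
    """Return names of trainable parameters whose .grad is still None after backward()."""
    return [
        name
        for name, param in model.named_parameters()
        if param.requires_grad and param.grad is None
    ]


model = ToyPatchEmbedding()
loss = model(torch.randn(2, 4, 16)).sum()
loss.backward()

unused = params_without_grad(model)
print(unused)
```

Running the same check on the real model after one forward/backward pass should list any parameter that is missing from the autograd graph, which is the condition the DDP error is reporting.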

I checked the implementations of `ViT` and `PatchEmbeddingBlock` and found that `cls_token` is defined but never used in `monai.networks.blocks.patchembedding.py: PatchEmbeddingBlock`.
![image](https://user-images.githubusercontent.com/52153974/145175732-6497f7a3-413b-4999-9843-feb0fec19b81.png)

**To Reproduce**
Steps to reproduce the behavior:
1. Set the environment variable `TORCH_DISTRIBUTED_DEBUG=INFO` in the shell.
2. Train `ViT` with `torch` `DistributedDataParallel`.


Unused cls_token in PatchEmbeddingBlock #3454
