Question About Reproducing W4A4 Results in the Paper, FP32/W8A8/W4A8 Reproduced Successfully, but W4A4 failed

**1. Results**
Hi, thanks for releasing the code. It's really nice work.

I successfully reproduced the FP32, W8A8, and W4A8 results on LSUN-Bedrooms, but I am having significant difficulty reproducing the W4A4 results.

As shown in the table below, the FP32/W8A8/W4A8 results are close to the reported values, while the W4A4 FID is about twice the reported value (11.36 vs. 5.64).

| Dataset              | Setting                | FID   |
| -------------------- | ---------------------- | ----- |
| LSUNBedrooms (LDM-4) | FP32 (my reproduction) | 3.03  |
| LSUNBedrooms (LDM-4) | FP32 (paper)           | 2.95  |
| LSUNBedrooms (LDM-4) | W8A8 (my reproduction) | 2.93  |
| LSUNBedrooms (LDM-4) | W8A8 (paper)           | 3.03  |
| LSUNBedrooms (LDM-4) | W4A8 (my reproduction) | 3.40  |
| LSUNBedrooms (LDM-4) | W4A8 (paper)           | 3.26  |
| LSUNBedrooms (LDM-4) | **W4A4 (my reproduction)** | **11.36** |
| LSUNBedrooms (LDM-4) | **W4A4 (paper)**           | **5.64**  |
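
To make the size of the gap explicit, here is a small sketch that computes the signed relative difference between my FID and the paper's FID for each setting. The numbers are copied from the table above; the helper name `relative_gap` is just my own illustration.

```python
# FID values copied from the table above: (my reproduction, paper).
fid = {
    "FP32": (3.03, 2.95),
    "W8A8": (2.93, 3.03),
    "W4A8": (3.40, 3.26),
    "W4A4": (11.36, 5.64),
}


def relative_gap(mine, paper):
    """Signed gap of the reproduction relative to the paper value, in percent."""
    return 100.0 * (mine - paper) / paper


gaps = {name: relative_gap(mine, paper) for name, (mine, paper) in fid.items()}

for name, gap in gaps.items():
    print(f"{name}: {gap:+.1f}%")
# FP32: +2.7%
# W8A8: -3.3%
# W4A8: +4.3%
# W4A4: +101.4%
```

The first three settings are within 5% of the paper in either direction, while W4A4 is off by roughly 100%.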

**2. Command**
```bash
python sample_diffusion_ldm_bedroom.py \
    -r models/ldm/lsun_beds256/model.ckpt \
    -n 50000 \
    --batch_size 20 \
    -c 200 \
    -e 1.0 \
    --seed 40 \
    --ptq \
    --weight_bit 4 \
    --quant_mode qdiff \
    --cali_st 20 \
    --cali_batch_size 32 \
    --cali_n 256 \
    --quant_act \
    --act_bit 4 \
    --a_sym \
    --a_min_max \
    --cali_data_path datasets/bedroom_sample2040_allst.pt \
    -l results/QuEST/bedroom/w4a4
```

**3. Additional Notes**

I noticed the previous issue "Strange Results Produced on W4A4 LSUN-Bedrooms Dataset Calibration".
Following the discussion there, I **did NOT use running_stat when running W4A4 quantization**, and I set `aq_params['channel_wise'] = True`.
However, the W4A4 result is still significantly worse than the one in the paper.
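
For context, this is how I understand the difference between per-tensor and channel-wise activation scales, assuming `--a_sym --a_min_max` means symmetric min-max quantization. It is a minimal sketch on synthetic data in plain Python, not the repository's implementation; the function names and the synthetic activations are my own assumptions.

```python
import random


def fake_quant_sym(values, n_bits):
    """Symmetric min-max fake quantization: quantize to signed n_bits, then dequantize."""
    qmax = 2 ** (n_bits - 1) - 1  # 7 for 4-bit, 127 for 8-bit
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return list(values)
    scale = max_abs / qmax  # one scale shared by all values passed in
    return [max(-qmax, min(qmax, round(v / scale))) * scale for v in values]


def mse(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def per_tensor_mse(channels, n_bits):
    """Quantization error when a single scale covers every channel."""
    flat = [v for ch in channels for v in ch]
    return mse(flat, fake_quant_sym(flat, n_bits))


def channel_wise_mse(channels, n_bits):
    """Quantization error when each channel gets its own scale."""
    flat = [v for ch in channels for v in ch]
    deq = [v for ch in channels for v in fake_quant_sym(ch, n_bits)]
    return mse(flat, deq)


random.seed(0)
# Synthetic activations (assumption): 4 channels with very different ranges.
channels = [
    [random.gauss(0.0, std) for _ in range(1024)]
    for std in (0.05, 0.5, 2.0, 10.0)
]

for bits in (8, 4):
    print(
        f"A{bits}: per-tensor MSE = {per_tensor_mse(channels, bits):.5f}, "
        f"channel-wise MSE = {channel_wise_mse(channels, bits):.5f}"
    )
```

On data like this, the 4-bit error is much larger than the 8-bit error, and channel-wise scales reduce it when channel ranges differ. That matches my expectation that W4A4 is far more sensitive to the activation calibration settings than W4A8.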

I also tested LSUN-Churches with W4A4: the generated samples are mostly corrupted. For LSUN-Bedrooms, although the FID is worse than the result reported in the paper, the model still generates semantically reasonable images.

I also tried to investigate the issue by reading the source code myself, but I could not identify the cause.

Could you please give some advice on whether there are any additional hyperparameters, calibration settings, or implementation details required to reproduce the reported W4A4 results?

Thanks again for the great work.

**4. Generated W4A4 samples**

W4A4_LSUN_Bedroom
<img width="256" height="256" alt="Image" src="https://github.com/user-attachments/assets/64c6724e-2fd0-4154-a019-13a74201e9ec" /> <img width="256" height="256" alt="Image" src="https://github.com/user-attachments/assets/57ce230a-cc26-4f41-bcdf-292dc989de29" />

W4A4_LSUN_Churches
<img width="256" height="256" alt="Image" src="https://github.com/user-attachments/assets/2bdf20c6-ea98-4170-a23f-a5ffd37b11e2" /> <img width="256" height="256" alt="Image" src="https://github.com/user-attachments/assets/9c9d2536-2892-4c25-badb-65d1b3f4f5e5" />

