Qualcomm AI Engine Direct - Quantizer refine for QAT #6513
cccclai merged 4 commits into pytorch:main
Conversation
🔗 Helpful Links 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/6513
Note: Links to docs will display an error until the docs builds have been completed. ✅ No failures as of commit 086cca4 with merge base 41a57e6. This comment was automatically generated by Dr. CI and updates every 15 minutes.
Hi @cccclai, we have implemented the draft PR for
We have another local branch that tests QAT MobileNet V2 on top of it, but we have encountered some problems. Thank you. :)
@navsud can you review this PR?
from torch.ao.quantization.observer import UniformQuantizationObserverBase
...
class ParamObserver(UniformQuantizationObserverBase):
This looks great and is generic enough to be added into torch/ao/quantization/observer.py. If possible, please move it there as a follow-up.
How about renaming it to PerChannelParamObserver() or PerChannelWeightObserver?
It's great to see you like it! We changed the name and added a TODO for it.
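For reference, a minimal sketch of what such a per-channel parameter observer could look like. This is an illustrative re-implementation built on `torch.ao.quantization`, not the code in this PR; the class body and the demo tensor shape are assumptions, only the `PerChannelParamObserver` name comes from the review thread.

```python
import torch
from torch.ao.quantization.observer import UniformQuantizationObserverBase


class PerChannelParamObserver(UniformQuantizationObserverBase):
    """Illustrative per-channel observer for weight/parameter tensors.

    Tracks min/max along every dimension except ``ch_axis`` and derives
    one (scale, zero_point) pair per output channel.
    """

    def __init__(self, ch_axis=0, **kwargs):
        super().__init__(**kwargs)
        self.ch_axis = ch_axis
        self.register_buffer("min_val", torch.tensor([]))
        self.register_buffer("max_val", torch.tensor([]))

    def forward(self, x_orig):
        x = x_orig.detach().float()
        # Reduce over every dimension except the channel axis.
        dims = [d for d in range(x.dim()) if d != self.ch_axis]
        self.min_val = torch.amin(x, dim=dims)
        self.max_val = torch.amax(x, dim=dims)
        return x_orig

    @torch.jit.export
    def calculate_qparams(self):
        # Delegate to the base class, which handles the per-channel qscheme.
        return self._calculate_qparams(self.min_val, self.max_val)


# Demo: a conv-like weight with 4 output channels.
obs = PerChannelParamObserver(
    dtype=torch.qint8, qscheme=torch.per_channel_symmetric
)
obs(torch.randn(4, 3, 3, 3))
scale, zero_point = obs.calculate_qparams()
```

With `qscheme=torch.per_channel_symmetric` the base class returns one scale and zero point per channel, which is what makes this observer suitable for weights rather than activations.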
qscheme=torch.per_tensor_symmetric,
ch_axis=0,
reduce_range=True,
observer=MovingAverageMinMaxObserver,
This should be PerChannelMovingAverageMinMaxObserver and qscheme=torch.per_channel_symmetric?
Sorry for the misunderstanding. This quant config is for per-tensor specifically. The per-channel one is here, and we use a condition in a quantizer member function here to assign the per-channel quant config.
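For context, the per-tensor vs. per-channel distinction discussed here can be sketched with stock `torch.ao.quantization` pieces. This is illustrative only; the actual config objects live in the quantizer's qconfig module, and the variable names below are assumptions.

```python
import torch
from torch.ao.quantization.fake_quantize import FakeQuantize
from torch.ao.quantization.observer import (
    MovingAverageMinMaxObserver,
    MovingAveragePerChannelMinMaxObserver,
)

# Per-tensor weight fake-quantize: one scale/zero-point for the whole tensor,
# matching the qscheme=torch.per_tensor_symmetric config quoted above.
per_tensor_weight_fq = FakeQuantize.with_args(
    observer=MovingAverageMinMaxObserver,
    dtype=torch.qint8,
    qscheme=torch.per_tensor_symmetric,
    reduce_range=True,
)

# Per-channel variant: one scale/zero-point per output channel (ch_axis=0),
# which pairs with qscheme=torch.per_channel_symmetric instead.
per_channel_weight_fq = FakeQuantize.with_args(
    observer=MovingAveragePerChannelMinMaxObserver,
    dtype=torch.qint8,
    qscheme=torch.per_channel_symmetric,
    ch_axis=0,
)

fq = per_tensor_weight_fq()
fq_pc = per_channel_weight_fq()
```

The key point of the reply above: the per-channel observer and qscheme go together, so a per-tensor config legitimately keeps `MovingAverageMinMaxObserver` with `torch.per_tensor_symmetric`.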
get_default_8bit_qat_proto,
get_default_8bit_qnn_ptq_config,
get_8a8w_qnn_ptq_config,
get_8a8w_qnn_qat_config,
As a follow-up PR, can we unify/simplify get_8a8w_qnn_ptq_config() and get_8a8w_qnn_qat_config() into get_8a8w_qnn_config() with an argument is_train=True/False, which defines whether we are doing PTQ vs. QAT?
No problem. We would like to keep them separate at first; once things look stable we will raise a PR to merge them.
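The suggested unification could look like the following sketch. The `get_8a8w_qnn_config` name and the `is_train` flag follow the review suggestion above; the two stub helpers merely stand in for the real PTQ/QAT config builders and are assumptions, not ExecuTorch code.

```python
from dataclasses import dataclass


@dataclass
class QuantConfig:
    """Stand-in for the quantizer's real quantization-config object."""
    mode: str  # "ptq" or "qat"
    act_bits: int = 8
    weight_bits: int = 8


def _get_8a8w_qnn_ptq_config() -> QuantConfig:
    # Stand-in for the existing PTQ helper (observer-based).
    return QuantConfig(mode="ptq")


def _get_8a8w_qnn_qat_config() -> QuantConfig:
    # Stand-in for the existing QAT helper (fake-quantize based).
    return QuantConfig(mode="qat")


def get_8a8w_qnn_config(is_train: bool = False) -> QuantConfig:
    """Unified entry point: PTQ vs. QAT selected by the is_train flag."""
    return _get_8a8w_qnn_qat_config() if is_train else _get_8a8w_qnn_ptq_config()
```

Keeping the two helpers as private functions behind one public entry point would let callers switch modes without changing imports, which is presumably the simplification the reviewer had in mind.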
Hi @navsud, thank you very much for reviewing! We just uploaded a patch based on the comments. There are just two questions we would like to ask.
Thank you for reading! If anything is unclear please feel free to let me know; I will try to describe it more. :D
- Reorganize qualcomm/quantizer: split quantizer/utils.py into qconfig, annotators, and an observers directory; change corresponding callees; rename get_default_Nbit_qnn_ptq_config to get_NaNw_qnn_ptq_config; add 16a4w conv test (not compared with the original model)
- Move and rename param_observer.py to per_channel_param_observer.py; add a TODO to merge qconfig
- Add a TODO for per_channel_param_observer.py
Force-pushed from d542309 to 0e57c97
Hi @cccclai, could you take a glance at this PR when you are free, and perhaps merge it if it looks good to you? Thanks :D
cccclai left a comment:
Looks good to me - thank you for adding the feature!
I noticed that I didn't import this to internal... In the worst case, if this diff breaks some tests, we may need to revert and reland it...
It's fine. If so, just let me know what I should fix and I will do it ASAP. :)
This reverts commit 068f43c.
@chunit-quic Unfortunately it's reverted because it breaks an internal test... Can you submit a PR again?
No problem. The new PR is #6747.