Issue with INT4 quantization #22
Description
Hello,
I am trying to quantize all the weights of a MobileNet model to INT4.
My attempt crashes with the following error:
quark/onnx/quantization/quant_utils.py", line 254, in tensor_type
raise ValueError(f"Unexpected value qtype={self!r}.")
ValueError: Unexpected value qtype=<ExtendedQuantType.QInt4: 5>.
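The error message suggests that a `tensor_type` lookup on the quantization-type enum has no entry for `QInt4`. The standalone sketch below reproduces that pattern for illustration only. It is not Quark's actual code: the class body, the `QInt8` member and its value, the mapping, and the `describe` helper are all assumptions. Only the enum name, `QInt4 = 5`, and the message format come from the traceback above.

```python
from enum import Enum


class ExtendedQuantType(Enum):
    """Hypothetical stand-in for the library enum (not the real definition)."""

    QInt8 = 1  # assumed member and value, for illustration
    QInt4 = 5  # value taken from the traceback

    @property
    def tensor_type(self) -> int:
        # Assumed mapping to ONNX TensorProto codes (3 is TensorProto.INT8).
        # QInt4 is deliberately left out to mimic the reported failure.
        known = {ExtendedQuantType.QInt8: 3}
        if self not in known:
            raise ValueError(f"Unexpected value qtype={self!r}.")
        return known[self]


def describe(qtype: ExtendedQuantType) -> str:
    """Return the mapped tensor type as text, or the error message."""
    try:
        return str(qtype.tensor_type)
    except ValueError as exc:
        return str(exc)


if __name__ == "__main__":
    print(describe(ExtendedQuantType.QInt8))
    print(describe(ExtendedQuantType.QInt4))
```

If the real lookup works this way, defining a custom spec class alone would not be enough, because the mapping itself would also need an INT4 entry.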
To do this, I had to define my own class Int4Spec(QTensorConfig) and my own INT4 type:
class Int4(BaseInt4):
    onnx_proto_dtype: TensorProto.INT4
    map_onnx_format = ExtendedQuantType.QInt4
Then I used:
config = QConfig(global_config=QLayerConfig(activation=Int8Spec(), weight=Int4Spec()))
and then I have something like this:
qk_quantizer = ModelQuantizer(config)
dr = ImageDataReader(quantization_samples=data, model_path=model_path)
print(f"[INFO] : Running ONNX quantization on {model_path}")
qk_quantizer.quantize_model(model_path, qk_quantized_model_path, dr)
Could you help?
Best regards