Issue with INT4 quantization #22

Hello, 

I am trying to quantize all the weights of a MobileNet model to INT4.
My attempt crashes with the following error:
quark/onnx/quantization/quant_utils.py", line 254, in tensor_type
    raise ValueError(f"Unexpected value qtype={self!r}.")
ValueError: Unexpected value qtype=<ExtendedQuantType.QInt4: 5>.
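
For reference, the failure pattern in this traceback can be reproduced in isolation. The sketch below is not Quark's code: `SketchQuantType`, its members, and the string dtype names are hypothetical stand-ins. It assumes that `tensor_type` resolves an enum member to an ONNX tensor type through a fixed set of known members and raises for any member that has no entry, which is what the error message suggests is happening for `QInt4`.

```python
from enum import Enum


class SketchQuantType(Enum):
    """Hypothetical stand-in for a quantization type enum (not the real one)."""

    QInt8 = 0
    QUInt8 = 1
    QInt4 = 5  # the member exists, but has no tensor type mapped below

    @property
    def tensor_type(self) -> str:
        # Assumed lookup: only some members are mapped to a tensor dtype.
        # Plain strings stand in for TensorProto dtype constants.
        known = {
            SketchQuantType.QInt8: "INT8",
            SketchQuantType.QUInt8: "UINT8",
        }
        if self in known:
            return known[self]
        # Same message shape as in the traceback above.
        raise ValueError(f"Unexpected value qtype={self!r}.")


def describe(qtype: SketchQuantType) -> str:
    """Return the mapped dtype name, or the error text if the member is unmapped."""
    try:
        return qtype.tensor_type
    except ValueError as exc:
        return f"error: {exc}"
```

Under that assumption, defining a custom `Int4` type on the user side would not be enough on its own: the code path that calls `tensor_type` would also need to know about the INT4 member.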

To do this, I had to define my own class Int4Spec(QTensorConfig) as well as my own INT4 type:
class Int4(BaseInt4):
    onnx_proto_dtype = TensorProto.INT4
    map_onnx_format = ExtendedQuantType.QInt4

Then I built the configuration:
config = QConfig(global_config=QLayerConfig(activation=Int8Spec(), weight=Int4Spec()))

and then ran the quantization roughly like this:
qk_quantizer = ModelQuantizer(config)
dr = ImageDataReader(quantization_samples=data, model_path=model_path)
print(f"[INFO] : Running ONNX quantization on {model_path}")
qk_quantizer.quantize_model(model_path, qk_quantized_model_path, dr)

Could you help?
BR

