Support MXINT4 scheme by mengniwang95 · Pull Request #1666 · intel/auto-round

mengniwang95 · 2026-04-07T09:36:19Z

Description

Support MXINT4 scheme

How to use:

Model quantization:

CUDA_VISIBLE_DEVICES=0 auto-round --model /models/Llama-3.2-3B/ --scheme MXINT4 --iters 0 --format auto_round
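To make the `--scheme MXINT4` option more concrete, here is a minimal pure-Python sketch of microscaling-style INT4 weight quantization. It is not the auto-round implementation. It assumes a group size of 32 (suggested by the `w4g32` suffix in the output directory name), one shared power-of-two scale per group, and symmetric signed 4-bit codes in [-8, 7]. The names `quantize_group`, `dequantize_group` and `quantize_weights` are hypothetical helpers introduced only for illustration.

```python
import math

GROUP_SIZE = 32        # assumed from the "g32" suffix of the checkpoint name
QMIN, QMAX = -8, 7     # signed 4-bit integer range (assumption)


def quantize_group(values):
    """Return (shared_exponent, int4_codes) for one group of weights."""
    amax = max(abs(v) for v in values)
    if amax == 0.0:
        return 0, [0] * len(values)
    # Pick the smallest power-of-two scale that keeps amax / scale <= QMAX.
    exponent = math.ceil(math.log2(amax / QMAX))
    scale = 2.0 ** exponent
    # Round to the nearest integer code and clamp to the 4-bit range.
    codes = [max(QMIN, min(QMAX, round(v / scale))) for v in values]
    return exponent, codes


def dequantize_group(exponent, codes):
    """Reconstruct approximate weights from a shared exponent and codes."""
    scale = 2.0 ** exponent
    return [c * scale for c in codes]


def quantize_weights(weights, group_size=GROUP_SIZE):
    """Quantize a flat list of weights group by group."""
    return [
        quantize_group(weights[i:i + group_size])
        for i in range(0, len(weights), group_size)
    ]


if __name__ == "__main__":
    # Toy data: 64 evenly spaced weights, i.e. two groups of 32.
    toy_weights = [(i - 16) / 16 for i in range(64)]
    for exponent, codes in quantize_weights(toy_weights):
        print("shared exponent:", exponent, "first codes:", codes[:4])
```

Because the scale is restricted to a power of two, it can be stored as a single small exponent per group, at the cost of a coarser grid than a free floating-point scale would give.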

Inference with transformers:

from transformers import AutoModelForCausalLM, AutoTokenizer

# Directory written by the quantization command above
model_name = "tmp_autoround/Llama-3.2-3B-mxint-w4g32/"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

# Tokenize the prompt and move it to the device the model was placed on
input_ids = tokenizer("Hello my name is", return_tensors="pt").input_ids.to(
    model.device
)
# Generate up to 100 new tokens and decode the full sequence
output = model.generate(input_ids, max_new_tokens=100)
print(tokenizer.decode(output[0]))

Results:

BF16
hf ({'pretrained': '/models/Llama-3.1-8B-Instruct', 'add_bos_token': True}), gen_kwargs: ({}), limit: None, num_fewshot: None, batch_size: 64

Tasks	Version	Filter	Metric		Value		Stderr
hellaswag	1	none	acc	↑	0.5977	±	0.0049
		none	acc_norm	↑	0.7954	±	0.0040
piqa	1	none	acc	↑	0.8014	±	0.0093
		none	acc_norm	↑	0.8101	±	0.0092

hf ({'pretrained': '/models/Llama-3.1-8B-Instruct', 'add_bos_token': True}), gen_kwargs: ({}), limit: None, num_fewshot: None, batch_size: 16

Tasks	Version	Filter	n-shot	Metric		Value		Stderr
gsm8k_llama	3	flexible_extract	8	exact_match	↑	0.8620	±	0.0095
		strict_match	8	exact_match	↑	0.8567	±	0.0097
mmlu_llama	1	strict_match		exact_match	↑	0.6946	±	0.0037

MXINT4
hf ({'pretrained': 'tmp_autoround/Llama-3.1-8B-Instruct-mxint-w4g32/', 'add_bos_token': True}), gen_kwargs: ({}), limit: None, num_fewshot: None, batch_size: 64

Tasks	Version	Filter	Metric		Value		Stderr
hellaswag	1	none	acc	↑	0.5460	±	0.0050
		none	acc_norm	↑	0.7396	±	0.0044
piqa	1	none	acc	↑	0.7535	±	0.0101
		none	acc_norm	↑	0.7693	±	0.0098

hf ({'pretrained': 'tmp_autoround/Llama-3.1-8B-Instruct-mxint-w4g32', 'add_bos_token': True}), gen_kwargs: ({}), limit: None, num_fewshot: None, batch_size: 16

Tasks	Version	Filter	n-shot	Metric		Value		Stderr
gsm8k_llama	3	flexible_extract	8	exact_match	↑	0.6285	±	0.0133
		strict_match	8	exact_match	↑	0.5663	±	0.0137
mmlu_llama	1	strict_match		exact_match	↑	0.5786	±	0.0040
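The two result sets above are easier to compare as per-task differences. The short script below is only a convenience sketch: the numbers are copied from the BF16 and MXINT4 tables, and the helper name `accuracy_drop` is hypothetical.

```python
# Scores copied from the BF16 and MXINT4 tables above (Llama-3.1-8B-Instruct).
bf16 = {
    "hellaswag acc": 0.5977,
    "hellaswag acc_norm": 0.7954,
    "piqa acc": 0.8014,
    "piqa acc_norm": 0.8101,
    "gsm8k_llama flexible_extract": 0.8620,
    "gsm8k_llama strict_match": 0.8567,
    "mmlu_llama strict_match": 0.6946,
}
mxint4 = {
    "hellaswag acc": 0.5460,
    "hellaswag acc_norm": 0.7396,
    "piqa acc": 0.7535,
    "piqa acc_norm": 0.7693,
    "gsm8k_llama flexible_extract": 0.6285,
    "gsm8k_llama strict_match": 0.5663,
    "mmlu_llama strict_match": 0.5786,
}


def accuracy_drop(baseline, quantized):
    """Map each metric to (absolute drop, fraction of baseline retained)."""
    return {
        name: (base - quantized[name], quantized[name] / base)
        for name, base in baseline.items()
    }


if __name__ == "__main__":
    for name, (drop, retained) in accuracy_drop(bf16, mxint4).items():
        print(f"{name:32s} drop={drop:.4f} retained={retained:.1%}")
```

On these figures the absolute drop is about 0.04 to 0.06 on hellaswag and piqa, and larger on mmlu_llama and gsm8k_llama.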


hshen14 · 2026-04-10T06:05:38Z

@lvliang-intel please review as well.

mengniwang95 · 2026-04-10T08:54:30Z

/azp run Unit-Test-CUDA-AutoRound

azure-pipelines · 2026-04-10T08:54:40Z

Azure Pipelines successfully started running 1 pipeline(s).

lkk12014402

LGTM

mengniwang95 · 2026-04-13T09:22:00Z

/azp run Unit-Test-CUDA-AutoRound

azure-pipelines · 2026-04-13T09:22:09Z

Azure Pipelines successfully started running 1 pipeline(s).

mengniwang95 and others added 5 commits April 7, 2026 17:33

add mxint4

31b1135

Signed-off-by: Mengni Wang <mengni.wang@intel.com>

refine code and add ut

1427510

Signed-off-by: Mengni Wang <mengni.wang@intel.com>

update doc

fa398e7

Signed-off-by: Mengni Wang <mengni.wang@intel.com>

add file

dec438b

Signed-off-by: Mengni Wang <mengni.wang@intel.com>

[pre-commit.ci] auto fixes from pre-commit.com hooks

b86dc11

for more information, see https://pre-commit.ci

mengniwang95 mentioned this pull request Apr 7, 2026

[Feature]: Experimental support MXINT4 scheme #1646

Open

mengniwang95 added 2 commits April 7, 2026 21:41

fix ut

3372e60

Signed-off-by: Mengni Wang <mengni.wang@intel.com>

Merge branch 'main' into mengni/mx_int4

961822d

lkk12014402 mentioned this pull request Apr 10, 2026

[Feature] Enhance hadamard transform #1569

Open

hshen14 requested a review from lvliang-intel April 10, 2026 06:05

mengniwang95 requested a review from lkk12014402 April 10, 2026 06:11

mengniwang95 added 5 commits April 10, 2026 14:16

Update int4_utils.py

ea72329

Merge branch 'main' into mengni/mx_int4

5370e66

Update qlinear_int.py

9434da3

Update qlinear_int.py

1e5a6d7

Merge branch 'main' into mengni/mx_int4

403c56f

mengniwang95 requested a review from wenhuach21 April 10, 2026 06:32

wenhuach21 reviewed Apr 10, 2026

Comment thread auto_round/experimental/qmodules/mxint4_utils.py

wenhuach21 reviewed Apr 10, 2026

Comment thread auto_round/inference/backend.py Outdated

update file name and packing format

a92b4eb

Signed-off-by: Mengni Wang <mengni.wang@intel.com>

wenhuach21 reviewed Apr 13, 2026

Comment thread docs/step_by_step_CN.md

wenhuach21 reviewed Apr 13, 2026

Comment thread auto_round/formats.py

wenhuach21 reviewed Apr 13, 2026

Comment thread auto_round/inference/backend.py Outdated

wenhuach21 reviewed Apr 13, 2026

Comment thread auto_round/__main__.py

wenhuach21 approved these changes Apr 13, 2026

lkk12014402 reviewed Apr 13, 2026

Comment thread auto_round/inference/backend.py

lkk12014402 approved these changes Apr 13, 2026

mengniwang95 added 3 commits April 13, 2026 14:58

update doc

6137436

Merge branch 'main' into mengni/mx_int4

30911e7

Update backend.py

e8e3d59

mengniwang95 merged commit c817d49 into main Apr 13, 2026
42 checks passed

mengniwang95 deleted the mengni/mx_int4 branch April 13, 2026 13:30

mengniwang95 merged 16 commits into main from mengni/mx_int4
