Add compressed-tensors format export support for W4A16 and W8A16 #1669
Merged
Conversation
Contributor
Pull request overview
Adds llm_compressor (compressed-tensors) export support for INT weight-only schemes (W4A16, W8A16), and updates docs/tests accordingly.
Changes:
- Extend the `llm_compressor` format to accept W4A16/W8A16 and route them through a new backend path.
- Update compressed-tensors scheme construction to omit activation quantization for weight-only exports (a hedged sketch follows this list).
- Add/adjust CPU export tests and document the newly supported schemes (EN + CN).
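For orientation, here is a minimal sketch of what a weight-only (A16) scheme looks like when built with the compressed-tensors quantization API. The helper name `make_wint_a16_scheme` and the default argument values are illustrative assumptions, not this PR's actual code.

```python
# A minimal sketch, assuming the compressed-tensors QuantizationScheme /
# QuantizationArgs API. The helper name and defaults are hypothetical; the
# PR's actual construction in export_to_llmcompressor may differ.
from compressed_tensors.quantization import (
    QuantizationArgs,
    QuantizationScheme,
    QuantizationStrategy,
    QuantizationType,
)

def make_wint_a16_scheme(bits: int = 4, group_size: int = 128) -> QuantizationScheme:
    """Build an INT weight-only scheme: W4A16 (bits=4) or W8A16 (bits=8)."""
    weights = QuantizationArgs(
        num_bits=bits,
        type=QuantizationType.INT,
        symmetric=True,
        strategy=QuantizationStrategy.GROUP,
        group_size=group_size,
    )
    # Leaving input/output activations unset is what makes the scheme
    # weight-only: activations stay in their original 16-bit dtype.
    return QuantizationScheme(
        targets=["Linear"],
        weights=weights,
        input_activations=None,
        output_activations=None,
    )
```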
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| auto_round/formats.py | Adds W4A16/W8A16 to llm_compressor support and introduces a WOQ backend selector (wint_a16). |
| auto_round/export/export_to_llmcompressor/export.py | Treats W*A16 as weight-only in compressed-tensors scheme creation; tightens dependency expectations around compress_module. |
| auto_round/compressors/utils.py | Adds a helper to detect integer weight-only quantization (WOQ); a hypothetical sketch follows this table. |
| test/test_cpu/export/test_export.py | Refactors the INT8_W8A8 export test and adds new W4A16/W8A16 llm_compressor export assertions. |
| README.md | Documents llm_compressor support for FP8_BLOCK, INT8_W8A8, W4A16, W8A16. |
| README_CN.md | Mirrors the README support-matrix update in Chinese. |
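As a rough illustration of the utils.py helper named in the table above, the predicate might look like the following. The function name and parameters here are hypothetical, since the PR's exact signature is not shown on this page.

```python
# Hypothetical sketch of an integer weight-only (WOQ) check; the real helper
# in auto_round/compressors/utils.py may take a config object instead.
def is_int_woq(data_type: str, bits: int, act_bits: int) -> bool:
    """True for INT weight-only schemes such as W4A16/W8A16: integer
    weights at 4 or 8 bits with activations left at 16 bits."""
    return data_type == "int" and bits in (4, 8) and act_bits >= 16
```

Per the table, formats.py would then route configs satisfying such a check through the wint_a16 backend selector.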
wenhuach21 approved these changes on Apr 9, 2026.
yiliu30 approved these changes on Apr 9, 2026.
Contributor (Author)
/azp run Unit-Test-CUDA-AutoRound

Azure Pipelines successfully started running 1 pipeline(s).
Contributor (Author)
/azp run Unit-Test-CUDA-AutoRound

Azure Pipelines successfully started running 1 pipeline(s).
xin3he approved these changes on Apr 13, 2026.
Contributor (Author)
/azp run Unit-Test-CUDA-AutoRound

Azure Pipelines successfully started running 1 pipeline(s).
Description
Added compressed-tensors format export support for W4A16 and W8A16.
Replaced the previous INT W8A8 export path, which used the internal NaiveQuantizationCompressor interface, with the new compress_module interface (requires compressed-tensors >= 0.15.0).
Updated the PR to use a BaseCompressor class method so it remains compatible with older compressed-tensors versions; a hedged sketch of the version gate follows.
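A minimal sketch of the version gate implied above. The dispatch bodies are left as comments because the exact compress_module and BaseCompressor call signatures are not shown in this conversation.

```python
# Illustrative only: gate on the installed compressed-tensors version, since
# compress_module is only expected on >= 0.15.0 per the description above.
from importlib.metadata import version
from packaging.version import Version

def has_compress_module() -> bool:
    """True when the installed compressed-tensors is new enough for compress_module."""
    return Version(version("compressed-tensors")) >= Version("0.15.0")

# if has_compress_module():
#     ... use the new compress_module interface ...
# else:
#     ... fall back to the BaseCompressor class method ...
```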
Type of Change
Related Issues
Fixes or relates to #1567
Checklist Before Submitting