This repository was archived by the owner on Feb 24, 2026. It is now read-only.

[Bugfix] Enhance LowerAsyncCopy Pass to handle INT8 dma copy with predicate #219

Merged
LeiWang1999 merged 7 commits into microsoft:main from LeiWang1999:fix-async-copy-on-bitnet on Oct 11, 2024


@LeiWang1999
Contributor

This pull request touches several files. The most important changes are a larger maximum error message length, refined handling of block_reduction_depth and reduce_k in the GPU matrix multiplication schedule, and updated integration benchmarks.

Error Handling Improvements:

  • Increased MAX_ERROR_MESSAGE_LENGTH from 200 to 500 in bitblas/common.py.
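
For context, a minimal sketch of how a length cap like this is typically applied. Only the constant name and its new value come from this PR; the helper `truncate_error_message` and the trailing `...` marker are hypothetical and may not match what bitblas/common.py actually does with the constant.

```python
# Value taken from this PR (previously 200).
MAX_ERROR_MESSAGE_LENGTH = 500


def truncate_error_message(message: str, limit: int = MAX_ERROR_MESSAGE_LENGTH) -> str:
    """Hypothetical helper: cap a message at `limit` characters.

    Messages longer than the limit are cut and marked with "...".
    """
    if len(message) <= limit:
        return message
    return message[:limit] + "..."
```

With the larger limit, a 300-character compiler error passes through unchanged, where a limit of 200 would have cut it.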

GPU Matrix Multiplication Logic Enhancements:

  • Refined the block_reduction_depth check and added a default value of 1 when block_reduction_depth is None in bitblas/gpu/matmul_mma_dequantize.py.
  • Updated the thread binding and loop splitting logic to depend on the reduce_k value in bitblas/gpu/matmul_mma_dequantize.py.
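
The defaulting behaviour can be illustrated in plain Python. This is a sketch under assumptions, not the schedule code from the PR: the function `split_k_iterations` is hypothetical, and it only models how a missing block_reduction_depth falls back to 1 and how the resulting reduce_k would divide the K loop.

```python
from typing import Optional, Tuple


def split_k_iterations(k_iters: int, block_reduction_depth: Optional[int]) -> Tuple[int, int]:
    """Hypothetical helper: split the K loop into (outer, reduce_k) parts.

    A depth of None is treated as 1, which leaves the loop unsplit.
    """
    reduce_k = 1 if block_reduction_depth is None else block_reduction_depth
    if reduce_k < 1 or k_iters % reduce_k != 0:
        raise ValueError("reduce_k must be a positive divisor of k_iters")
    return k_iters // reduce_k, reduce_k
```

For example, `split_k_iterations(8, None)` keeps all 8 iterations in the outer loop, while a depth of 2 yields 4 outer iterations with an inner extent of 2.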

Integration Benchmark Updates:

  • Updated integration benchmarks to use model.quantize() and torch.compile(model) in integration/BitNet/benchmark_inference_latency.py.
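
A rough sketch of a latency measurement loop that such a benchmark could use. The timing helper `measure_latency_ms` is hypothetical and uses only the standard library; the PR does not show how the benchmark script actually times the model, and the commented usage assumes a `model` object that exposes `quantize()` as described above.

```python
import time
from typing import Callable


def measure_latency_ms(fn: Callable[[], object], warmup: int = 3, iters: int = 10) -> float:
    """Hypothetical helper: average wall-clock latency of `fn` in milliseconds."""
    # Warm-up calls are excluded from timing; with torch.compile the first
    # calls include compilation time.
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    return (time.perf_counter() - start) * 1000.0 / iters


# Assumed usage with the calls named in this PR (not executed here):
#   model.quantize()
#   model = torch.compile(model)
#   latency = measure_latency_ms(lambda: model(input_ids))
# On a GPU, calling torch.cuda.synchronize() inside the timed callable
# keeps asynchronous kernel launches from skewing the result.
```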

Import Optimization:

  • Optimized imports in integration/pytorch/bitblas_linear.py by updating the import statement for MatmulConfig and Matmul.

Submodule Update:

  • Updated the submodule commit for 3rdparty/tvm.

Refs issue #218.

@LeiWang1999 LeiWang1999 marked this pull request as ready for review October 11, 2024 11:06