This repository was archived by the owner on Feb 24, 2026. It is now read-only.

running issues #131

@brisker

Python 3.10
CUDA 12.1

I installed BitBLAS with `pip install bitblas`. Running `python -c "import bitblas; print(bitblas.__version__)"` prints:
0.0.1.dev13

Then I ran this basic example:

import bitblas
import torch



matmul_config = bitblas.MatmulConfig(
    M=1,  # M dimension
    N=2048,  # N dimension
    K=1024,  # K dimension
    A_dtype="float16",  # activation A dtype
    W_dtype="int4",  # weight W dtype
    accum_dtype="float16",  # accumulation dtype
    out_dtype="float16",  # output dtype
    layout="nt",  # matrix layout, "nt" indicates the layout of A is non-transpose and the layout of W is transpose
    with_bias=False,  # bias
    # configs for weight only quantization
    group_size=None,  # setting for grouped quantization
    with_scaling=False,  # setting for scaling factor
    with_zeros=False,  # setting for zeros
    zeros_mode=None,  # setting for how to calculate zeros
)

matmul = bitblas.Matmul(config=matmul_config)


input_tensor = torch.rand((1, 1024), dtype=torch.float16).cuda()
weight_tensor = torch.randint(0, 7, (2048, 1024), dtype=torch.int8).cuda()


weight_tensor_int4 = matmul.transform_weight(weight_tensor)


output_tensor = matmul(input_tensor, weight_tensor_int4)


ref_result = torch.matmul(input_tensor, weight_tensor.t().to(torch.float16))

print("Ref output:", ref_result)
print("BitBLAS output:", output_tensor)
torch.testing.assert_close(output_tensor, ref_result, rtol=1e-2, atol=1e-0)

The result is unexpected: the BitBLAS output is all zeros, so the comparison against the reference fails:

Ref output: tensor([[1494., 1461., 1529.,  ..., 1508., 1525., 1446.]], device='cuda:0',
       dtype=torch.float16)
BitBLAS output: tensor([[0., 0., 0.,  ..., 0., 0., 0.]], device='cuda:0', dtype=torch.float16)
Traceback (most recent call last):
  File "/data1/speed_test/new_bitblas_test.py", line 41, in <module>
    torch.testing.assert_close(output_tensor, ref_result, rtol=1e-2, atol=1e-0)
  File "/opt/python-3.10.12/lib/python3.10/site-packages/torch/testing/_comparison.py", line 1520, in assert_close
    raise error_metas[0].to_error(msg)
AssertionError: Tensor-likes are not close!

Mismatched elements: 2048 / 2048 (100.0%)
Greatest absolute difference: 1662.0 at index (0, 235) (up to 1.0 allowed)
Greatest relative difference: 1.0 at index (0, 0) (up to 0.01 allowed)
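
For anyone reading the numbers above: as far as I understand it, `torch.testing.assert_close` accepts an element when `|actual - expected| <= atol + rtol * |expected|`. The sketch below redoes that arithmetic in plain Python on the few reference values visible in the log, paired with the all-zero BitBLAS output. The helper `is_close` and the sample lists are illustrative names of mine, not part of torch or BitBLAS.

```python
# Plain-Python sketch of the closeness rule, applied to values copied from the log.
# `is_close` is a hypothetical helper written for this illustration only.

def is_close(actual, expected, rtol=1e-2, atol=1.0):
    """Return True when |actual - expected| <= atol + rtol * |expected|."""
    return abs(actual - expected) <= atol + rtol * abs(expected)

# The six reference values printed in the log, and the matching BitBLAS outputs.
ref_samples = [1494.0, 1461.0, 1529.0, 1508.0, 1525.0, 1446.0]
bitblas_samples = [0.0] * len(ref_samples)

rel_diffs = []
for actual, expected in zip(bitblas_samples, ref_samples):
    abs_diff = abs(actual - expected)
    rel_diff = abs_diff / abs(expected)
    allowed = 1.0 + 1e-2 * abs(expected)  # e.g. about 15.94 for expected = 1494
    rel_diffs.append(rel_diff)
    print(f"expected={expected:7.1f} abs_diff={abs_diff:7.1f} "
          f"allowed={allowed:6.2f} rel_diff={rel_diff:.2f} "
          f"close={is_close(actual, expected)}")
```

An output of exactly zero against a nonzero reference always gives a relative difference of 1.0, which matches the "Greatest relative difference: 1.0" line. So the failure looks like the kernel producing no result at all, not a precision problem that looser tolerances would hide.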
### Tasks
- [ ] INT8xINT4 Fast Decoding
- [ ] warp reduce API update

Labels: bug