gguf-py : fix and simplify quantized shape round-trip by compilade · Pull Request #7483 · ggml-org/llama.cpp

compilade · 2024-05-23T04:06:50Z

#7234 has broken quantized tensor copy in gguf-new-metadata.py. (thanks @CISC for finding this! ref: #7234 (comment))

This was originally reported for IQ4_NL, but I think it affects all quantized tensor types.

(Converted models are fine, no worries. This fixes a crash of gguf-new-metadata.py)

Summary of changes

- Add quant_shape_from_byte_shape and quant_shape_to_byte_shape to convert between the element shape of a quantized tensor and the shape of its raw byte array.
- GGUFReader now reshapes the NumPy array in each ReaderTensor to the shape GGUFWriter expects to receive.
- The shape field of each ReaderTensor is left unchanged to avoid changing the behavior of gguf-dump.py.
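
For readers unfamiliar with the two shapes involved, here is a minimal sketch of what such a conversion can look like. It is not copied from the PR: the QUANT_SIZES table and the string keys are stand-ins for illustration, and only the function names come from the summary above. It assumes that quantization happens in fixed-size blocks along the last dimension, with Q8_0 packing 32 elements into 34 bytes and IQ4_NL packing 32 elements into 18 bytes.

```python
from typing import Sequence

# Hypothetical lookup table for this sketch: (elements per block, bytes per block).
QUANT_SIZES = {
    "Q8_0": (32, 34),    # 2-byte scale + 32 one-byte values
    "IQ4_NL": (32, 18),  # 2-byte scale + 32 four-bit values
}


def quant_shape_to_byte_shape(shape: Sequence[int], quant_type: str) -> tuple[int, ...]:
    """Convert an element shape to the shape of the raw byte array."""
    block_size, type_size = QUANT_SIZES[quant_type]
    if shape[-1] % block_size != 0:
        raise ValueError(
            f"last dimension {shape[-1]} is not a multiple of block size {block_size}"
        )
    # Only the last dimension changes: whole blocks become whole byte groups.
    return (*shape[:-1], shape[-1] // block_size * type_size)


def quant_shape_from_byte_shape(shape: Sequence[int], quant_type: str) -> tuple[int, ...]:
    """Convert a raw byte-array shape back to the element shape."""
    block_size, type_size = QUANT_SIZES[quant_type]
    if shape[-1] % type_size != 0:
        raise ValueError(
            f"last dimension {shape[-1]} is not a multiple of type size {type_size}"
        )
    return (*shape[:-1], shape[-1] // type_size * block_size)


byte_shape = quant_shape_to_byte_shape((4096, 4096), "Q8_0")
elem_shape = quant_shape_from_byte_shape(byte_shape, "Q8_0")
```

Because the two functions are exact inverses whenever the last dimension is a whole number of blocks, a reader-to-writer copy can pass through either representation without losing information.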

Testing

Q8_0
- @compilade I've tested a round-trip of a Q8_0 bloom model when adding general.description with gguf-new-metadata.py, then removing it, and the resulting model file has the same checksum as the original model.
IQ4_NL
- @compilade Again, a round-trip of bloom, but this time with IQ4_NL. The checksums match.

compilade · 2024-05-23T04:14:38Z

gguf-py/gguf/gguf_reader.py

            tensor_names.add(tensor_name)
            ggml_type = GGMLQuantizationType(raw_dtype[0])
            n_elems = int(np.prod(dims))
+            np_dims = tuple(reversed(dims.tolist()))


.tolist() is necessary here to avoid an error when reshaping afterwards, something about the shape being of type np.float64 for some reason.
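
A small illustration of the line above, as a sketch rather than the reader's actual code: the dims array, the Q8_0 block constants, and the zero-filled buffer are made up for the example. Calling .tolist() turns the NumPy scalars into plain Python ints before any arithmetic is done on them. One likely cause of the np.float64 error, not confirmed in this PR, is that older NumPy promotion rules turn a mix of np.uint64 and Python int into a float.

```python
import numpy as np

# Dimensions as stored in a GGUF file (hypothetical values), innermost first.
dims = np.array([64, 3], dtype=np.uint64)

# .tolist() yields plain Python ints; reversed() gives NumPy's outermost-first order.
np_dims = tuple(reversed(dims.tolist()))

# Assumed Q8_0 layout for this sketch: 32 elements per block, 34 bytes per block.
block_size, type_size = 32, 34
byte_last = np_dims[-1] // block_size * type_size

# Stand-in for the raw tensor bytes read from the file.
data = np.zeros(np_dims[0] * byte_last, dtype=np.uint8)

# With integer dimensions the reshape is accepted.
reshaped = data.reshape((*np_dims[:-1], byte_last))
```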

gguf-py : fix and simplify quantized shape round-trip

2ff601f

compilade added the bugfix, Review Complexity : Low, and python labels May 23, 2024

compilade mentioned this pull request May 23, 2024

convert-hf : support direct Q8_0 conversion #7234

Merged

13 tasks

gguf-py : remove unused import

c5fe1d6

compilade commented May 23, 2024

ggerganov approved these changes May 23, 2024

compilade added the merge ready label May 23, 2024

mofosyne merged commit b83bab1 into master May 25, 2024

compilade mentioned this pull request Jul 27, 2024

Embed files #8121

Open

4 tasks

mofosyne merged 2 commits into master from compilade/gguf-py-fix-q-shape
