
llama : fix attention layer count sanity check#6550

Merged
ggerganov merged 2 commits into master from gg/quantize-mamba-assert on Apr 8, 2024
Conversation

@ggerganov
Member

@ggerganov ggerganov commented Apr 8, 2024

There was otherwise a warning when compiling.

@ggerganov ggerganov requested a review from compilade April 8, 2024 18:18
Collaborator

@compilade compilade left a comment


Thanks! From my tests it seems to work.

The assertion will need to be changed again for Jamba (since some, but not all, of its layers are attention layers), but that can be fixed later, when it becomes relevant.

@ggerganov ggerganov merged commit cc4a954 into master Apr 8, 2024
@ggerganov ggerganov deleted the gg/quantize-mamba-assert branch April 8, 2024 19:25
