Releases: RWKV/rwkv.cpp

master-8db73b1

19 Sep 14:46
8db73b1

Update ggml (#128)

* Fix quantize.py doc

* Add Q5 format compatibility test

* Update ggml

* Add documentation about limitations of sequence mode

* Fix most compiler warnings

* Clean up CMakeLists.txt

* Assert contiguity instead of assuming it

* Update README.md

* Fix warnings

* Try to fix compilation error

* Attempt to fix Ubuntu build

* Attempt to fix Ubuntu build

* Restore all build jobs

* Allow sequence lengths of up to 64 out of the box by forking ggml
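
One way to work within the 64-token sequence limit mentioned above is to split a longer prompt into chunks and carry the state from one chunk to the next. The sketch below only illustrates that idea: `eval_in_chunks`, `fake_eval_sequence`, and the list-based state are hypothetical stand-ins, not part of the rwkv.cpp API.

```python
from typing import Callable, List, Optional

MAX_SEQUENCE_LEN = 64  # limit quoted in the release note above

State = List[float]


def eval_in_chunks(
    eval_sequence: Callable[[List[int], Optional[State]], State],
    tokens: List[int],
    state: Optional[State] = None,
    chunk_len: int = MAX_SEQUENCE_LEN,
) -> Optional[State]:
    """Feed `tokens` to `eval_sequence` in chunks of at most `chunk_len`."""
    if chunk_len < 1:
        raise ValueError("chunk_len must be positive")
    for start in range(0, len(tokens), chunk_len):
        # The state returned by one chunk becomes the input state of the next.
        state = eval_sequence(tokens[start:start + chunk_len], state)
    return state


# Hypothetical stand-in for a real model call: it records each chunk size
# and keeps a running token count as its "state".
chunk_sizes: List[int] = []


def fake_eval_sequence(chunk: List[int], state: Optional[State]) -> State:
    chunk_sizes.append(len(chunk))
    count = state[0] if state else 0.0
    return [count + len(chunk)]


final_state = eval_in_chunks(fake_eval_sequence, list(range(150)))
print(chunk_sizes)  # [64, 64, 22]
print(final_state)  # [150.0]
```

A real caller would pass a function that wraps the library's sequence evaluation in place of `fake_eval_sequence`.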

master-d6c691e

09 Sep 07:14
d6c691e

Add other language bindings (#126)

* Add other language bindings

* Update README.md

---------

Co-authored-by: Alex

master-2d3cdd7

20 Aug 06:31
2d3cdd7

Only append to CPU string if not initialized (#125)

* Only append to CPU string if not initialized

* Fix code style

---------

Co-authored-by: Alex

master-84f34c5

21 Jul 13:37
84f34c5

Implement basic CLBlast support (#110)

* Get this thing building

Unzip the OpenCL SDK and CLBlast distribution into the repo root,
then enable RWKV_CLBLAST and regenerate makefiles to pick them up.

Currently builds and runs.

* Really offload tensors to OpenCL rather than cuBLAS

* Fix CLBlast builds in CMake release mode

Somehow the path handling is different here, which requires me to
be quite a bit more annoying about it.

* Remove `brew update`

* Try building without sanitizer (maybe it would work this time?)

---------

Co-authored-by: saharNooby

master-f685aa4

19 Jul 09:36
f685aa4

Fix "'NoneType' object has no attribute 'cast'" error when model is f…

master-25ee75e

18 Jul 09:39
25ee75e

Expose n_vocab, n_embed, n_layer to the Python interface (#118)

master-84634c0

27 Jun 09:29
84634c0

Elide logits if the logits pointer parameter is NULL (#107)

* Completely skip calculation of logits if nobody cares

This speeds up sequence mode evaluations by up to 20% if you ingest
a large prompt and then only retrieve the logits at the very end.

Note that you must pass a NULL pointer to the logits parameter in
order to take advantage of this optimization.

* logits_out=NULL documentation
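
The pattern described above can be sketched with a toy model: the state update always runs, while the output projection runs only when the caller supplies a logits buffer. Everything here is a hypothetical illustration (`toy_eval`, the arithmetic, and the `calls` counter); Python's `None` stands in for the NULL pointer, and none of this is the rwkv.cpp implementation.

```python
from typing import Dict, List, Optional

calls: Dict[str, int] = {"head": 0}  # how often the output projection ran


def toy_eval(token: int, state: float, logits_out: Optional[List[float]]) -> float:
    """Advance the toy state; fill `logits_out` only if a buffer is given."""
    new_state = 0.5 * state + token  # stand-in for the recurrent update
    if logits_out is not None:
        # Stand-in for the expensive projection from hidden state to logits.
        calls["head"] += 1
        for i in range(len(logits_out)):
            logits_out[i] = new_state * (i + 1)
    return new_state


prompt = [3, 1, 4]
logits = [0.0] * 4
state = 0.0

# Ingest every token except the last without asking for logits.
for token in prompt[:-1]:
    state = toy_eval(token, state, None)

# Request logits only for the final token.
state = toy_eval(prompt[-1], state, logits)

print(state)          # 5.25
print(logits)         # [5.25, 10.5, 15.75, 21.0]
print(calls["head"])  # 1
```

The projection runs once instead of three times, which is where a saving of the kind reported above would come from when ingesting a long prompt.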

master-ffc085c

26 Jun 11:24
ffc085c

Update GGML (#103)

* Update GGML

* Fix linux build

Of course we forgot why we did this, and broke the build again, in
the exact same way, a second time.

* Fix cuBLAS

Properly set the backend and then call ggml_cuda_transform_tensor

* Rename xx to x_prev

Probably should slip this in now before we forget it's a thing.

* See how easy updates are now? (update GGML)

master-9cbb9d9

21 Jun 16:13
9cbb9d9

Various improvements (#104)

* Make rwkv_gpu_offload_layers return true only if layers were actually offloaded

* Validate device of tensors

* Offload all layers during test

* Consistently use FP16 and FP32 instead of float16/fp16/F16/etc.

* Use spaces for indentation

* Remove spaces between type name and []

* Add cuBLAS on Windows guide, refactor docs structure

* Insert replacement characters when decoding invalid UTF-8 sequences

* Fix compatibility

* Fix formatting

* Fix copy-pasted tensor validation
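
The item above about inserting replacement characters can be illustrated with Python's built-in decoder, which substitutes U+FFFD for invalid bytes when `errors="replace"` is passed. This is a minimal sketch; `decode_lossy` is a hypothetical helper, and whether the project uses this exact call is an assumption.

```python
def decode_lossy(raw: bytes) -> str:
    """Decode UTF-8, replacing invalid sequences with U+FFFD instead of raising."""
    return raw.decode("utf-8", errors="replace")


valid = "héllo".encode("utf-8")
truncated = valid[:2]  # b"h\xc3": cuts the two-byte "é" in half

print(decode_lossy(valid))            # héllo
print(repr(decode_lossy(truncated)))  # 'h\ufffd'

# Strict decoding of the same bytes raises instead of substituting.
try:
    truncated.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True
print(strict_failed)  # True
```

This matters for token-by-token generation, where a multi-byte character can be split across tokens and a partial decode would otherwise fail.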

master-6b26e0d

15 Jun 11:17
6b26e0d

Add Python support for sequence mode (#101)