Tags: prosyslab-classroom/llama.cpp

b3639

vulkan : fix build (#0)

ggml-ci

b3417

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.

convert-*.py: add general.name kv override (ggml-org#8571)

b3281

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.

convert-hf : print output file name when completed (ggml-org#8181)

* convert-hf : print output file name when completed

This commit adds the output file name to the log message when the
conversion is completed.

The motivation for this change is that when the `--outfile` option is not
specified, it might not be obvious where the output file is written.

With this change the output of running the script will be something like
the following:
```console
INFO:hf-to-gguf:Model successfully exported to models/gemma-2-9b-it.gguf.
```

Signed-off-by: Daniel Bevenius <[email protected]>

* squash! convert-hf : print output file name when completed

Updates the output to support printing the directory if the output is
split into multiple files. Also the output file name is now retrieved
from the model_instance object.

Signed-off-by: Daniel Bevenius <[email protected]>

* squash! convert-hf : print output file name when completed

Use parent attribute of Path object and string interpolation.

Signed-off-by: Daniel Bevenius <[email protected]>

* squash! convert-hf : print output file name when completed

Use os.sep instead of hardcoding the path separator.

Signed-off-by: Daniel Bevenius <[email protected]>

---------

Signed-off-by: Daniel Bevenius <[email protected]>
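
The behavior described in this commit message can be sketched roughly as follows. This is a minimal illustration and not the code from `convert_hf_to_gguf.py`: the function name `export_message` and the `is_split` flag are hypothetical, and the only details taken from the message are the use of the `parent` attribute of a `Path` object, string interpolation, and `os.sep`.

```python
import os
from pathlib import Path


def export_message(fname_out: Path, is_split: bool) -> str:
    """Build the completion message (hypothetical helper, for illustration only)."""
    if is_split:
        # Output was split into several files: report the containing
        # directory, terminated with the platform's path separator.
        return f"Model successfully exported to {fname_out.parent}{os.sep}"
    # Single output file: report the file itself.
    return f"Model successfully exported to {fname_out}"


print(export_message(Path("models") / "gemma-2-9b-it.gguf", is_split=False))
print(export_message(Path("models") / "gemma-2-9b-it.gguf", is_split=True))
```

Using `os.sep` rather than a literal `/` keeps the directory form of the message consistent with how `Path` renders paths on the current platform.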

b3218

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.

CUDA: fix matrix multiplication algorithm choice (ggml-org#8102)

b3214

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.

CUDA: optimize MMQ int8 tensor core performance (ggml-org#8062)

* CUDA: optimize MMQ int8 tensor core performance

* only a single get_mma_tile_x_k function

* simplify code, make functions constexpr

b3189

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.

[SYCL] Fix windows build and inference (ggml-org#8003)

* add sycl preset

* fix debug link error. fix windows crash

* update README

b3080

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.

Per token attributes (ggml-org#7685)

* Add per token attributes enum
* Using phi-3 for testing 'rstrip'
* Using jina-v2 for testing 'lstrip'
* Brute force test for 'lstrip' and 'rstrip'
* Implement 'rstrip' and 'lstrip'
* Update phi-3 GGUF file (obsolete since 917dc8c)
* Replace llama_token_type with llama_token_attribs
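
As a rough illustration of what per-token `lstrip` and `rstrip` attributes could mean, the sketch below models them as bit flags in Python. Everything here is an assumption for illustration: the names `TokenAttr` and `split_around`, the flag values, and the semantics (borrowed from the Hugging Face added-token convention, where `lstrip` removes whitespace to the left of a special token and `rstrip` removes whitespace to its right) are not taken from the llama.cpp source.

```python
from enum import IntFlag


class TokenAttr(IntFlag):
    """Hypothetical per-token attribute flags (values are illustrative)."""
    NONE = 0
    LSTRIP = 1  # strip whitespace immediately to the left of the token
    RSTRIP = 2  # strip whitespace immediately to the right of the token


def split_around(text: str, token: str, attr: TokenAttr) -> list[str]:
    """Split text at the first occurrence of a special token, honoring its flags."""
    left, sep, right = text.partition(token)
    if not sep:
        # Token not present: nothing to split.
        return [text]
    if attr & TokenAttr.LSTRIP:
        left = left.rstrip()
    if attr & TokenAttr.RSTRIP:
        right = right.lstrip()
    # Drop fragments that became empty after stripping.
    return [part for part in (left, token, right) if part]


print(split_around("a <s> b", "<s>", TokenAttr.NONE))
print(split_around("a <s> b", "<s>", TokenAttr.LSTRIP | TokenAttr.RSTRIP))
```

Modeling the attributes as combinable flags, rather than a single token type, lets one token carry several properties at once, which is presumably the motivation for replacing a type enum with an attribute set.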