Commit 29dce4a ("dequant")
Parent: dfc80ee
8 files changed: 386 additions, 194 deletions
README.md (13 additions, 0 deletions)

````diff
@@ -108,6 +108,19 @@ Optional CUDA/ROCm backends can be enabled with `-DUSE_CUDA=ON` / `-DUSE_ROCM=ON`
 
 `t81-convert`, `t81-gguf`, and `t81-qat` automate quantize→export→train flows with progress reporting and validation hooks. Browse [docs/references/cli-usage.md](docs/references/cli-usage.md), [docs/diagrams/cli-workflows-mermaid.md](docs/diagrams/cli-workflows-mermaid.md), and [examples/cli-examples.md](examples/cli-examples.md) for recipes.
 
+
+### Dequantizing for downstream runtimes
+
+Use the new `t81-dequant` helper (backed by `t81.dequantize_gguf_to_float`) to rewrite a TQ1_0/TQ2_0 bundle into float32 before handing it to stock llama.cpp, Ollama, or LM Studio builds that lack ternary support:
+
+```bash
+t81-dequant model-tq1.gguf model-compatible-f16.gguf
+```
+
+That command expands the quantized tensors into a new file while preserving the standard GGUF metadata, so the result works with existing loaders. Keep the original `model-tq1.gguf` around for runtimes that already understand TQ tensors, and only run `t81-dequant` when you need immediate compatibility.
+
+For a zero-disk workaround you can also dequantize on the fly (via `t81.dequantize_gguf_to_float` or a small loader patch) before instantiating `llama_cpp.Llama`; see the docs for an example monkey patch if you want to load `model-tq1.gguf` directly without producing an intermediate copy.
 
 ## GGUF v4 compliance
 
 t81’s GGUF exports already mirror the llama.cpp conventions; v4’s mandatory `gguf_header` additions are worth calling out for everybody writing their own converter:
````
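The ternary-to-float expansion the README section above describes can be illustrated with a toy sketch. This is not the real TQ1_0/TQ2_0 bit layout (those pack trits far more densely); it only shows the per-block arithmetic of decoding stored trits `{0, 1, 2}` as `scale * {-1, 0, +1}`:

```python
# Toy ternary (de)quantization: each block stores one float scale plus a list
# of trits in {0, 1, 2}. Illustrative only; not the actual TQ1_0/TQ2_0 format.

def dequantize_block(scale: float, trits: list[int]) -> list[float]:
    """Shift stored trits {0,1,2} to {-1,0,+1} and apply the block scale."""
    return [scale * (t - 1) for t in trits]

def quantize_block(values: list[float], scale: float) -> list[int]:
    """Round value/scale to the nearest trit and shift it into {0,1,2}."""
    return [max(0, min(2, round(v / scale) + 1)) for v in values]

weights = [0.5, -0.5, 0.0, 0.5]
trits = quantize_block(weights, scale=0.5)   # [2, 0, 1, 2]
restored = dequantize_block(0.5, trits)      # [0.5, -0.5, 0.0, 0.5]
```

A real converter would iterate over every tensor in the file, apply the decode per block, and rewrite the tensor type tag in the metadata, which is the work `t81-dequant` automates.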

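For context on the compliance note above: the fixed GGUF preamble that every version shares (magic bytes, version, tensor count, metadata key/value count) can be read with a few lines of Python. The v4-specific `gguf_header` additions the README alludes to are not covered here:

```python
import struct

def read_gguf_header(blob: bytes) -> dict:
    """Parse the fixed GGUF preamble: 4-byte magic, uint32 version,
    uint64 tensor count, uint64 metadata KV count (all little-endian)."""
    magic, version, n_tensors, n_kv = struct.unpack_from("<4sIQQ", blob, 0)
    if magic != b"GGUF":
        raise ValueError(f"not a GGUF file: magic={magic!r}")
    return {"version": version, "tensor_count": n_tensors, "kv_count": n_kv}

# Synthetic header for demonstration: version 3, 291 tensors, 24 KV entries.
blob = struct.pack("<4sIQQ", b"GGUF", 3, 291, 24)
hdr = read_gguf_header(blob)
```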
pyproject.toml (2 additions, 1 deletion)

```diff
@@ -33,9 +33,10 @@ torch = [
     "accelerate>=0.20",
     "datasets>=2.13",
 ]
-dev = ["pytest", "pybind11>=2.12", "cibuildwheel>=2.15"]
+dev = ["pytest>=9.0", "pybind11>=2.12", "cibuildwheel>=2.15"]
 
 [project.scripts]
 t81-convert = "t81.convert:main"
 t81-qat = "t81.scripts.t81_qat:main"
 t81-gguf = "t81.scripts.t81_gguf:main"
+t81-dequant = "t81.scripts.t81_dequant:main"
```

python/bindings.cpp (3 additions, 0 deletions)

```diff
@@ -30,6 +30,9 @@
 
 namespace py = pybind11;
 namespace core = t81::core;
+using t81::DeviceType;
+using t81::ScalarType;
+using t81::TensorMetadata;
 
 static std::string
 decimal_string(const core::bigint &value) {
```

src/llama.cpp (1 addition, 0 deletions)

```diff
@@ -0,0 +1 @@
+Subproject commit 482211438dd671224a7f176b7480b4ded424212c
```

t81/__init__.py (2 additions, 0 deletions)

```diff
@@ -18,6 +18,7 @@
     "gguf",
     "read_gguf",
     "write_gguf",
+    "dequantize_gguf_to_float",
     "convert",
     "Linear",
     "ternary",
@@ -34,6 +35,7 @@
 _LAZY_MEMBERS: dict[str, tuple[str, str]] = {
     "read_gguf": (".gguf", "read_gguf"),
     "write_gguf": (".gguf", "write_gguf"),
+    "dequantize_gguf_to_float": (".gguf", "dequantize_gguf_to_float"),
     "convert": (".convert", "convert"),
     "Linear": (".nn", "Linear"),
     "ternary": (".qat", "ternary"),
```
