|
JH-Toolkit v1.4.1
An engineering-oriented C++20 toolkit with duck-typed concepts, static design, async coroutines, and semantic containers — header-only, RTTI-free, and concurrency-friendly.
|
High-performance Huffman encoder/decoder supporting both canonical and standard tree-based algorithms, with selectable symbol ranges (128 or 256). More...
#include <cstdint>#include <iostream>#include <vector>#include <queue>#include <stack>#include <algorithm>#include "jh/metax/t_str.h"#include "jh/pods/array.h"#include "jh/pods/pair.h"#include "jh/pods/bytes_view.h"Go to the source code of this file.
Classes | |
| class | jh::serio::huffman< Signature, Algo > |
| High-performance Huffman encoder/decoder. More... | |
Namespaces | |
| namespace | jh::serio |
| Aggregated entry point for serialization and codec utilities. | |
Enumerations | |
| enum class | jh::serio::huff_algo : std::uint8_t { jh::serio::huff128 , jh::serio::huff256 , jh::serio::huff128_canonical , jh::serio::huff256_canonical } |
| Enumeration of supported Huffman algorithm variants. More... | |
High-performance Huffman encoder/decoder supporting both canonical and standard tree-based algorithms, with selectable symbol ranges (128 or 256).
The jh::serio::huffman component provides four distinct compression algorithms:
huff128 — Standard Huffman for ASCII (0-127) huff256 — Standard Huffman for full byte range (0-255) huff128_canonical — Canonical Huffman (ASCII) huff256_canonical — Canonical Huffman (full byte range) Canonical algorithms generate deterministic codewords, enabling extremely fast decoding via table lookups without tree traversal. Normal Huffman algorithms preserve legacy tree behavior for compatibility with older formats.
| Feature | huff128 | huff128_canonical | huff256 | huff256_canonical |
|---|---|---|---|---|
| Symbol range | 0-127 | 0-127 | 0-255 | 0-255 |
| Encoding speed | ✔ Fast | ✔ Fast<br>(minor setup cost) | ✔ Fast | ✔ Fastest overall |
| Decoding speed | ✘ Slow<br>(tree traversal) | ✔✔ Very fast (table-based) | ✘ Slow<br>(tree traversal) | ✔✔ Very fast (table-based) |
| Worst-case decode complexity | O(depth × N) | O(N) | O(depth × N) | O(N) |
| Codeword determinism | ✘ | ✔ Canonical | ✘ | ✔ Canonical |
| Best usage | Legacy formats / debugging | High-performance ASCII | General binary data | High-performance binary |
Apple Silicon M3, LLVM clang++ 20, input size 20k-200k bytes (2025).
| Operation | huff128 | huff128_canonical | huff256 | huff256_canonical |
|---|---|---|---|---|
| Debug (unoptimized) | ||||
| Encode | 0.45-4.3 ms | 0.44-4.6 ms | 0.54-4.0 ms | 0.48-3.8 ms |
| Decode | 1.4-15 ms | 0.39-3.8 ms | 1.5-15 ms | 0.44-4.2 ms |
| Relative decode speed | 1× (baseline) | ≈ 3.5-4× faster | 1× (baseline) | ≈ 3.5-4× faster |
| O3 optimized | ||||
| Encode | 0.155-1.51 ms | 0.154-1.49 ms | 0.156-1.52 ms | 0.156-1.48 ms |
| Decode | 0.406-2.27 ms | 0.290-2.17 ms | 0.265-2.27 ms | 0.260-2.16 ms |
| Relative decode speed | 1× (baseline) | ≈1.05-1.4× faster | 1× (baseline) | ≈1.02-1.05× faster |
1.4.x
2025