JH-Toolkit v1.4.1
An engineering-oriented C++20 toolkit with duck-typed concepts, static design, async coroutines, and semantic containers — header-only, RTTI-free, and concurrency-friendly.
Loading...
Searching...
No Matches
huffman.h File Reference

High-performance Huffman encoder/decoder supporting both canonical and standard tree-based algorithms, with selectable symbol ranges (128 or 256). More...

#include <cstdint>
#include <iostream>
#include <vector>
#include <queue>
#include <stack>
#include <algorithm>
#include "jh/metax/t_str.h"
#include "jh/pods/array.h"
#include "jh/pods/pair.h"
#include "jh/pods/bytes_view.h"

Go to the source code of this file.

Classes

class  jh::serio::huffman< Signature, Algo >
 High-performance Huffman encoder/decoder. More...

Namespaces

namespace  jh::serio
 Aggregated entry point for serialization and codec utilities.

Enumerations

enum class  jh::serio::huff_algo : std::uint8_t { jh::serio::huff128 , jh::serio::huff256 , jh::serio::huff128_canonical , jh::serio::huff256_canonical }
 Enumeration of supported Huffman algorithm variants. More...

Detailed Description

High-performance Huffman encoder/decoder supporting both canonical and standard tree-based algorithms, with selectable symbol ranges (128 or 256).

Author
JeongHan-Bae <mastropseudo@gmail.com>

The jh::serio::huffman component provides four distinct compression algorithms:

  • huff128 — Standard Huffman for ASCII (0-127)
  • huff256 — Standard Huffman for full byte range (0-255)
  • huff128_canonical — Canonical Huffman (ASCII)
  • huff256_canonical — Canonical Huffman (full byte range)

Canonical algorithms generate deterministic codewords, enabling extremely fast decoding via table lookups without tree traversal. Normal Huffman algorithms preserve legacy tree behavior for compatibility with older formats.

Feature Comparison

Featurehuff128huff128_canonicalhuff256huff256_canonical
Symbol range0-1270-1270-2550-255
Encoding speed✔ Fast✔ Fast<br>(minor setup cost)✔ Fast✔ Fastest overall
Decoding speed✘ Slow<br>(tree traversal)✔✔ Very fast
(table-based)
✘ Slow<br>(tree traversal)✔✔ Very fast
(table-based)
Worst-case decode complexityO(depth × N)O(N)O(depth × N)O(N)
Codeword determinism✔ Canonical✔ Canonical
Best usageLegacy formats / debuggingHigh-performance ASCIIGeneral binary dataHigh-performance binary

Measured Performance

Apple Silicon M3, LLVM clang++ 20, input size 20k-200k bytes (2025).

Operationhuff128huff128_canonicalhuff256huff256_canonical
Debug (unoptimized)
Encode0.45-4.3 ms0.44-4.6 ms0.54-4.0 ms0.48-3.8 ms
Decode1.4-15 ms0.39-3.8 ms1.5-15 ms0.44-4.2 ms
Relative decode speed1× (baseline)≈ 3.5-4× faster1× (baseline)≈ 3.5-4× faster
O3 optimized
Encode0.155-1.51 ms0.154-1.49 ms0.156-1.52 ms0.156-1.48 ms
Decode0.406-2.27 ms0.290-2.17 ms0.265-2.27 ms0.260-2.16 ms
Relative decode speed1× (baseline)≈1.05-1.4× faster1× (baseline)≈1.02-1.05× faster

Usage Example

using HUF = jh::serio::huffman<"demo", huff_algo::huff256_canonical>;
std::ostringstream out_stream(std::ios::binary);
HUF::compress(out_stream, "hello world");
const std::string binary = out_stream.str();
std::istringstream in_stream(binary, std::ios::binary);
std::string recovered = HUF::decompress(in_stream);
std::cout << recovered << std::endl;
High-performance Huffman encoder/decoder.
Definition huffman.h:249
huff_algo
Enumeration of supported Huffman algorithm variants.
Definition huffman.h:218
Version
1.4.x
Date
2025