JH-Toolkit v1.4.1
An engineering-oriented C++20 toolkit with duck-typed concepts, static design, async coroutines, and semantic containers — header-only, RTTI-free, and concurrency-friendly.
ordered_map.h File Reference

Contiguous AVL-based ordered container with fragmentation-free semantics. More...

#include <concepts>
#include <cstddef>
#include <utility>
#include <cstdint>
#include <iostream>
#include <type_traits>
#include <iterator>
#include <stdexcept>
#include <ranges>
#include <vector>
#include "jh/typing/monostate.h"
#include "jh/conceptual/tuple_like.h"

Go to the source code of this file.

Classes

struct  jh::avl::avl_node< K, V >
 Node element for the contiguous AVL tree. More...
struct  jh::avl::avl_node< K, jh::typed::monostate >
 Node element for set-style contiguous AVL trees. More...
class  jh::avl::tree_map< K, V, Alloc >
 Contiguous-array AVL tree used by jh::ordered_map and jh::ordered_set. More...
struct  jh::avl::tree_map< K, V, Alloc >::iterator
 Bidirectional iterator providing in-order traversal. More...
struct  jh::avl::tree_map< K, V, Alloc >::const_iterator
 Const bidirectional iterator for in-order traversal. More...

Namespaces

namespace  jh::avl
 Internal AVL-tree implementation namespace for jh::ordered_set and jh::ordered_map.

Typedefs

template<typename K, typename Alloc = std::allocator<K>>
using jh::ordered_set
 Ordered associative set based on a contiguous-array AVL tree.
template<typename K, typename V, typename Alloc = std::allocator<std::pair<const K, V>>>
using jh::ordered_map
 Ordered associative map based on a contiguous-array AVL tree.

Detailed Description

Contiguous AVL-based ordered container with fragmentation-free semantics.

Author
JeongHan-Bae <mastropseudo@gmail.com>

Overview

jh::ordered_set<K> and jh::ordered_map<K,V> implement a contiguous AVL tree stored inside a std::vector (or PMR-aware vector), eliminating pointer chasing and reducing allocator fragmentation. Nodes are referred to by indices instead of pointers, enabling stable, cache-friendly traversal and compact memory layout.

These containers are not intended as a full replacement for std::set or std::map, but serve as a predictable, fragmentation-free, and locality-optimized alternative for workloads where:

  • memory fragmentation must be controlled,
  • allocator churn is expensive or unstable,
  • iterators need predictable traversal cost,
  • large-scale workloads favor cache locality,
  • PMR-based pooling is desired for O(1) mass-clear operations.

Design Goals

  • Minimize fragmentation by storing all nodes in a contiguous vector.
  • Provide stable, predictable erasure via compactification.
  • Offer STL-like API semantics (find, lower_bound, iteration, erase).
  • Exploit cache locality by avoiding pointer-heavy RB-tree structures.
  • Ensure deterministic clear() behavior under PMR.

Internal Storage Model

Nodes are stored as AVL nodes in a contiguous vector:

struct node {
    store_t stored_;             // std::pair<K, V> for map, or K for set
    size_t parent, left, right;  // index-based links instead of pointers
    uint16_t height;             // AVL height
};
  • No node is heap-allocated individually.
  • Index references remain valid except for the erased node itself.
  • Erase compacts the last node into the erased node's slot.
  • AVL rotation works on indices instead of pointers.
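The compacting erase described above can be sketched in a few lines. This is a simplified, self-contained illustration, not the library's implementation: the `int`-keyed node, `npos`, and `compact_erase` are illustrative names, the victim is assumed to be already detached from the tree, and AVL rebalancing is omitted.

```cpp
#include <cstddef>
#include <cstdint>
#include <initializer_list>
#include <vector>

// Simplified set-style node mirroring the layout above.
struct node {
    int key;                          // stands in for store_t
    std::size_t parent, left, right;  // index-based links
    std::uint16_t height;
};

constexpr std::size_t npos = static_cast<std::size_t>(-1);

// Swap-and-pop compaction: overwrite the erased slot with the last node,
// then re-link the moved node's parent and children to its new index.
// No memory is freed; the vector simply shrinks by one element.
void compact_erase(std::vector<node>& nodes, std::size_t victim) {
    const std::size_t last = nodes.size() - 1;
    if (victim != last) {
        nodes[victim] = nodes[last];  // move the last node into the hole
        for (std::size_t child : {nodes[victim].left, nodes[victim].right})
            if (child != npos) nodes[child].parent = victim;
        const std::size_t p = nodes[victim].parent;
        if (p != npos) {
            if (nodes[p].left == last)  nodes[p].left  = victim;
            if (nodes[p].right == last) nodes[p].right = victim;
        }
    }
    nodes.pop_back();
}
```

Because only the moved node changes index, every other index reference stays valid, which is exactly the "stable except erased node" guarantee described in the comparison below.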

Comparison vs std::set / std::map

Aspect             | std::set / std::map           | jh::ordered_set / jh::ordered_map
-------------------|-------------------------------|----------------------------------
Node layout        | pointer-linked RB-tree        | contiguous AVL (vector)
Fragmentation      | high (many small allocations) | minimal (one vector buffer)
Iterator stability | stable                        | stable except erased node
Erase cost         | O(log N)                      | O(log N) + compactification
Traversal locality | poor (pointer chasing)        | excellent
PMR clear()        | deep node destruction         | O(1), vector reset
For >5k elements   | stable but noisy              | predictable, low jitter

Rationale — Why It Exists

Modern allocators suffer from fragmentation under workloads that frequently construct and destroy many tree nodes (e.g., dynamic indexing, routing tables, message subscription graphs). The standard containers rely on one allocation per node, causing:

  • allocator churn,
  • TLB pressure and cache misses,
  • unpredictable latency spikes,
  • free-list poisoning in long-running systems.

The contiguous AVL model eliminates these issues by placing all nodes into a single dynamic buffer. Erasing a node does not free any memory; it simply moves the last node into the removed slot. Combined with a monotonic-buffer resource, clear() becomes nearly O(1).
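The PMR interaction can be sketched with the standard library alone. In this illustration a `std::pmr::vector<int>` stands in for the node buffer (the function name `fill_then_discard` is illustrative); because the elements live in a monotonic pool, tearing everything down degenerates into resetting the pool rather than N individual frees:

```cpp
#include <cstddef>
#include <memory_resource>
#include <vector>

// Fill a pool-backed buffer, then drop its contents. The per-element work
// disappears: clear() on trivially destructible elements is trivial, and
// the pool's memory is later reclaimed wholesale by release().
std::size_t fill_then_discard(std::pmr::monotonic_buffer_resource& pool) {
    std::pmr::vector<int> nodes{&pool};  // stand-in for the node array
    for (int i = 0; i < 1000; ++i) nodes.push_back(i);
    const std::size_t n = nodes.size();
    nodes.clear();  // no per-node destruction, no deallocation
    return n;
}
```

After the buffer goes out of scope, `pool.release()` discards every allocation at once; deallocation through a monotonic resource is a no-op, which is why the library's `clear()` can be nearly O(1) under PMR.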

About Performance

Benchmarks (Apple M3, LLVM clang++20, 2025) show stable behavior across 5,000 to 1,000,000 elements:

Operation      | std::set / std::map          | jh::ordered_set / jh::ordered_map  | Notes
---------------|------------------------------|------------------------------------|------
Random insert  | fast start, large jitter     | ≈ 10-40% overhead but small jitter | AVL maintenance, but contiguous memory
Ordered insert | degenerates (right-heavy RB) | consistently faster                | vector locality dominates
Random lookup  | stable                       | comparable or slightly faster      | branchless traversal & locality
Iteration      | pointer chasing              | ≈ 15-30% faster                    | sequential memory
Erase          | stable                       | slightly slower worst-case         | compacting cost
Clear (PMR)    | O(N) destruct                | O(1)                               | pool discard

Observed Behavior in Large Datasets

  • For 100k-1M string keys, performance gap tightens to within ~10%.
  • For fully ordered input, ordered_set often surpasses std::set.
  • Lookup variance is consistently lower due to contiguous cachelines.
  • Iteration is measurably faster at all scales.

Memory & Fragmentation Notes

  • No per-node allocation → extremely low fragmentation.
  • erase() never frees memory.
  • clear() under PMR is almost zero-cost.
  • Ideal for systems where pointers must not be invalidated by allocators.
  • Much more stable than pointer-based trees under long uptimes.

Limitations

  • Iterators are invalidated by erase() except the returned one.
  • Does not provide node-hint insertion APIs (unlike std::map::insert(hint)).
  • Erase requires compactification (copy/move of last node).
  • Not designed for persistent node references.

Use Cases

  • Memory-fragmentation-sensitive systems (game engines, GUI trees, routing).
  • Real-time components requiring predictable latency.
  • Systems with massive clear/repopulate cycles using PMR.
  • Large ordered indexes requiring sequential iteration.

Complexity Summary

  • Insert: O(log N)
  • Erase: O(log N) + O(1) compact
  • Find: O(log N)
  • Traversal: O(N), cache-friendly
Version
1.4.x
Date
2025