JH-Toolkit v1.4.1
An engineering-oriented C++20 toolkit with duck-typed concepts, static design, async coroutines, and semantic containers — header-only, RTTI-free, and concurrency-friendly.
ordered_map.h File Reference

Contiguous AVL-based ordered container with fragmentation-free semantics. More...

#include <concepts>
#include <cstddef>
#include <utility>
#include <cstdint>
#include <iostream>
#include <type_traits>
#include <iterator>
#include <stdexcept>
#include <ranges>
#include <vector>
#include "jh/typing/monostate.h"
#include "jh/conceptual/tuple_like.h"

Go to the source code of this file.

Classes

struct  jh::avl::avl_node< K, V >
 Node element for the contiguous AVL tree. More...
struct  jh::avl::avl_node< K, jh::typed::monostate >
 Node element for set-style contiguous AVL trees. More...
class  jh::avl::tree_map< K, V, Alloc >
 Contiguous-array AVL tree used by jh::ordered_map and jh::ordered_set. More...
struct  jh::avl::tree_map< K, V, Alloc >::iterator
 Bidirectional iterator providing in-order traversal. More...
struct  jh::avl::tree_map< K, V, Alloc >::const_iterator
 Const bidirectional iterator for in-order traversal. More...

Namespaces

namespace  jh::avl
 Internal AVL-tree implementation namespace for jh::ordered_set and jh::ordered_map.

Typedefs

template<typename K, typename Alloc = std::allocator<K>>
using jh::ordered_set
 Ordered associative set based on a contiguous-array AVL tree.
template<typename K, typename V, typename Alloc = std::allocator<std::pair<const K, V>>>
using jh::ordered_map
 Ordered associative map based on a contiguous-array AVL tree.

Detailed Description

Contiguous AVL-based ordered container with fragmentation-free semantics.

Author
JeongHan-Bae <mastropseudo@gmail.com>

Overview

jh::ordered_set<K> and jh::ordered_map<K,V> implement a contiguous AVL tree stored inside a std::vector (or PMR-aware vector), eliminating pointer chasing and reducing allocator fragmentation. Nodes are referred to by indices instead of pointers, enabling stable, cache-friendly traversal and compact memory layout.

These containers are not intended as a full replacement for std::set or std::map, but serve as a predictable, fragmentation-free, and locality-optimized alternative for workloads where:

  • memory fragmentation must be controlled,
  • allocator churn is expensive or unstable,
  • iterators need predictable traversal cost,
  • large-scale workloads favor cache locality,
  • PMR-based pooling is desired for O(1) mass-clear operations.

Design Goals

  • Minimize fragmentation by storing all nodes in a contiguous vector.
  • Provide stable, predictable erasure via compactification.
  • Offer STL-like API semantics (find, lower_bound, iteration, erase).
  • Exploit cache locality by avoiding pointer-heavy RB-tree structures.
  • Ensure deterministic clear() behavior under PMR.

Internal Storage Model

Nodes are stored as AVL nodes in a contiguous vector:

struct node {
    store_t stored_;             // std::pair<K, V> for map, or K for set
    size_t parent, left, right;  // index-based links instead of pointers
    uint16_t height;             // AVL height
};
  • No node is heap-allocated individually.
  • Index references remain valid except for the erased node itself.
  • Erase compacts the last node into the erased node's slot.
  • AVL rotation works on indices instead of pointers.
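The compacting erase described above can be sketched in a few lines. This is a simplified, self-contained illustration, not the library's implementation: the `int`-keyed node, `npos`, and `compact_erase` are illustrative names, the victim is assumed to be already detached from the tree, and AVL rebalancing is omitted.

```cpp
#include <cstddef>
#include <cstdint>
#include <initializer_list>
#include <vector>

// Simplified set-style node mirroring the layout above.
struct node {
    int key;                          // stands in for store_t
    std::size_t parent, left, right;  // index-based links
    std::uint16_t height;
};

constexpr std::size_t npos = static_cast<std::size_t>(-1);

// Swap-and-pop compaction: overwrite the erased slot with the last node,
// then re-link the moved node's parent and children to its new index.
// No memory is freed; the vector simply shrinks by one element.
void compact_erase(std::vector<node>& nodes, std::size_t victim) {
    const std::size_t last = nodes.size() - 1;
    if (victim != last) {
        nodes[victim] = nodes[last];  // move the last node into the hole
        for (std::size_t child : {nodes[victim].left, nodes[victim].right})
            if (child != npos) nodes[child].parent = victim;
        const std::size_t p = nodes[victim].parent;
        if (p != npos) {
            if (nodes[p].left == last)  nodes[p].left  = victim;
            if (nodes[p].right == last) nodes[p].right = victim;
        }
    }
    nodes.pop_back();
}
```

Because only the moved node changes index, every other index reference stays valid, which is exactly the "stable except erased node" guarantee described in the comparison below.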

Comparison vs std::set / std::map

Aspect             | std::set / std::map           | jh::ordered_set / jh::ordered_map
-------------------|-------------------------------|----------------------------------
Node layout        | pointer-linked RB-tree        | contiguous AVL (vector)
Fragmentation      | high (many small allocations) | minimal (one vector buffer)
Iterator stability | stable                        | stable except erased node
Erase cost         | O(log N)                      | O(log N) + compactification
Traversal locality | poor (pointer chasing)        | excellent
PMR clear()        | deep node destruction         | O(1), vector reset
For >5k elements   | stable but noisy              | predictable, low jitter

Rationale — Why It Exists

Modern allocators suffer from fragmentation under workloads that frequently construct and destroy many tree nodes (e.g., dynamic indexing, routing tables, message subscription graphs). The standard containers rely on one allocation per node, causing:

  • allocator churn,
  • TLB pressure and cache misses,
  • unpredictable latency spikes,
  • free-list poisoning in long-running systems.

The contiguous AVL model eliminates these issues by placing all nodes into a single dynamic buffer. Erasing a node does not free any memory; it simply moves the last node into the removed slot. Combined with a monotonic-buffer resource, clear() becomes nearly O(1).
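The PMR interaction can be sketched with the standard library alone. In this illustration a `std::pmr::vector<int>` stands in for the node buffer (the function name `fill_then_discard` is illustrative); because the elements live in a monotonic pool, tearing everything down degenerates into resetting the pool rather than N individual frees:

```cpp
#include <cstddef>
#include <memory_resource>
#include <vector>

// Fill a pool-backed buffer, then drop its contents. The per-element work
// disappears: clear() on trivially destructible elements is trivial, and
// the pool's memory is later reclaimed wholesale by release().
std::size_t fill_then_discard(std::pmr::monotonic_buffer_resource& pool) {
    std::pmr::vector<int> nodes{&pool};  // stand-in for the node array
    for (int i = 0; i < 1000; ++i) nodes.push_back(i);
    const std::size_t n = nodes.size();
    nodes.clear();  // no per-node destruction, no deallocation
    return n;
}
```

After the buffer goes out of scope, `pool.release()` discards every allocation at once; deallocation through a monotonic resource is a no-op, which is why the library's `clear()` can be nearly O(1) under PMR.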

About Performance

Benchmarks (Apple M3, LLVM clang++20, 2025) show stable behavior across 5,000 to 1,000,000 elements:

Operation      | std::set / std::map          | jh::ordered_set / jh::ordered_map  | Notes
---------------|------------------------------|------------------------------------|------
Random insert  | fast start, large jitter     | ≈ 10-40% overhead but small jitter | AVL maintenance, but contiguous memory
Ordered insert | degenerates (right-heavy RB) | consistently faster                | vector locality dominates
Random lookup  | stable                       | comparable or slightly faster      | branchless traversal & locality
Iteration      | pointer chasing              | ≈ 15-30% faster                    | sequential memory
Erase          | stable                       | slightly slower worst-case         | compacting cost
Clear (PMR)    | O(N) destruct                | O(1)                               | pool discard

Observed Behavior in Large Datasets

  • For 100k-1M string keys, performance gap tightens to within ~10%.
  • For fully ordered input, ordered_set often surpasses std::set.
  • Lookup variance is consistently lower due to contiguous cachelines.
  • Iteration is measurably faster at all scales.

Memory & Fragmentation Notes

  • No per-node allocation → extremely low fragmentation.
  • erase() never frees memory.
  • clear() under PMR is almost zero-cost.
  • Ideal for systems where pointers must not be invalidated by allocators.
  • Much more stable than pointer-based trees under long uptimes.

Limitations

  • Iterators are invalidated by erase() except the returned one.
  • Does not provide node-hint insertion APIs (unlike std::map::insert(hint)).
  • Erase requires compactification (copy/move of last node).
  • Not designed for persistent node references.

Use Cases

  • Memory-fragmentation-sensitive systems (game engines, GUI trees, routing).
  • Real-time components requiring predictable latency.
  • Systems with massive clear/repopulate cycles using PMR.
  • Large ordered indexes requiring sequential iteration.

Complexity Summary

  • Insert: O(log N)
  • Erase: O(log N) + O(1) compact
  • Find: O(log N)
  • Traversal: O(N), cache-friendly
Version
1.4.x
Date
2025