pixie is a succinct data structures library.
- BitVector
  - Data structure with 3.61% overhead supporting rank and select for 1 bits.
  - Supports:
    - rank(i): number of set bits (1s) up to position i.
    - select(k): position of the k-th set bit.
    - Similar operations rank0/select0 for 0.
  - Implementation mainly follows [1], with SIMD optimizations similar to [2].
  - Optimized via AVX-512/AVX2; for large binary sequences, performance is I/O-bound.
- RmMTree
  - Implementation of a range min-max tree. It supports rank, select, and excess-related operations, allowing fast navigation in DFUDS/BP trees.
- C++20
- CMake ≥ 3.18.
git clone https://github.com/Malkovsky/pixie.git
cd pixie
cmake --preset release
cmake --build --preset release

Manual alternative:
mkdir -p build/release
cmake -B build/release -DCMAKE_BUILD_TYPE=Release
cmake --build build/release -j

Tests are enabled by default (PIXIE_TESTS=ON). Benchmarks are opt-in; enable them with -DPIXIE_BENCHMARKS=ON or configure with the benchmarks-all preset. For performance diagnostics, use the benchmarks-diagnostic preset (Release with debug info and performance counter support).
After building with presets, binaries are located in build/release.
./build/release/unittests
./build/release/test_rmm

Configure a coverage build with GCC (benchmarks disabled):
cmake --preset coverage
cmake --build --preset coverage

Run tests and generate the gcov text report:
./scripts/coverage_report.sh

Before running benchmarks, configure with presets:
cmake --preset benchmarks-all
cmake --build --preset release

For a RelWithDebInfo diagnostic build, use:
cmake --preset benchmarks-diagnostic
cmake --build --preset release

Benchmarks are random 50/50 0-1 bitvectors up to
./build/release/benchmarks
./build/release/bench_rmm

For comparison with the range min-max tree implementation from sdsl-lite (Release build required; use the release preset or -DCMAKE_BUILD_TYPE=Release):
sudo cpupower frequency-set --governor performance
./build/release/bench_rmm_sdsl --benchmark_out=rmm_bench_sdsl.json

For visualization, write the JSON output to a file using --benchmark_out=<file> (e.g. ./build/release/bench_rmm --benchmark_out=rmm_bench.json) and plot it with scripts/plot_rmm.py (add --sdsl-json rmm_bench_sdsl.json for comparison).
#include <pixie/bitvector.h>
#include <cstdint>
#include <iostream>
#include <vector>
using namespace pixie;
int main() {
  std::vector<uint64_t> bits = {0b101101};  // 6 bits
  BitVector bv(bits, 6);
  std::cout << "bv: " << bv.to_string() << "\n";            // "101101"
  std::cout << "rank(4): " << bv.rank(4) << "\n";           // number of ones in first 4 bits
  std::cout << "select(2): " << bv.select(2) << "\n";       // position of 2nd one-bit
}

#include <pixie/rmm_tree.h>
#include <string>
#include <iostream>
using namespace pixie;
int main() {
  // root
  // ├─ A
  // │  ├─ a1
  // │  └─ a2
  // ├─ B
  // └─ C
  //    └─ c1
  std::string bits = "11101001011000";
  RmMTree t(bits);
  std::cout << "close(1): " << t.close(1) << "\n";      // expected 6 (A)
  std::cout << "open(3): " << t.open(3) << "\n";        // expected 2 (a1)
  std::cout << "enclose(1): " << t.enclose(1) << "\n";  // expected 0 (root)
}
- [1] Laws et al., SPIDER: Improved Succinct Rank and Select Performance (SPIDER)
- [2] Kurpicz, Engineering compact data structures for rank and select queries on bit vectors (pasta-toolbox/bit_vector)
