CPB

Cocoa Powder Bottle
Best Ratio (JSON): 0.0%
Encode: 12,435 MB/s
Decode: 22,687 MB/s
L5 LEARN 2nd pass: 0.4%

What is CPB?

A multi-layer archive format that compresses, protects, and disguises your data.

Structure Dict

L3 Genre DSL — auto-detects data type, replaces patterns with compact references.

String Dict

L5 GenDict caches exact matches. A second pass over the same input compresses to ~270 bytes.

Error Recovery

Reed-Solomon codes. Auto-repair corruption without re-download.

Video Container

Output as AVI / MP4 / ZIP / PDF / PNG. Standard tools work natively.

4D–16D Shuffle

Multi-dimensional byte rearrangement. Up to 16 independent axes.

Full-Text Search

FIDX index — search inside archives without extraction.
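
The Structure Dict idea above (replace repeated patterns with compact references) can be illustrated with a minimal Python sketch. This is not CPB's L3 Genre DSL, whose format is not described here. It only shows the general technique on JSON records, and the names `build_key_dict`, `encode`, and `decode` are hypothetical.

```python
import json

def build_key_dict(records):
    """Collect every distinct key once, in first-seen order (hypothetical helper)."""
    keys = []
    for rec in records:
        for k in rec:
            if k not in keys:
                keys.append(k)
    return keys

def encode(records):
    """Replace each repeated key with its short index into a shared key table."""
    keys = build_key_dict(records)
    index = {k: i for i, k in enumerate(keys)}
    rows = [{str(index[k]): v for k, v in rec.items()} for rec in records]
    return json.dumps({"keys": keys, "rows": rows}, separators=(",", ":"))

def decode(blob):
    """Restore the original records from the key table and the indexed rows."""
    doc = json.loads(blob)
    keys = doc["keys"]
    return [{keys[int(i)]: v for i, v in row.items()} for row in doc["rows"]]

records = [
    {"timestamp": f"2026-01-01T00:00:{i:02d}Z", "level": "INFO", "status": 200}
    for i in range(60)
]
plain = json.dumps(records, separators=(",", ":"))
packed = encode(records)
print(len(plain), "->", len(packed), "bytes before any general-purpose compression")
```

The key table costs a few bytes up front, so the saving only appears once the same keys repeat across many records.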


CPB vs Existing Tools

64 KB JSON — single-threaded benchmark

Tool | Ratio | Encode | Decode | Note
gzip | 34% | 150 MB/s | 400 MB/s | baseline
zstd | 28% | 380 MB/s | 1,100 MB/s | fastest modern
LZMA/7z | 22% | 18 MB/s | 120 MB/s | best ratio
zlib | 38% | 140 MB/s | 350 MB/s | ubiquitous
CPB STD | 0.8% | 4.2 MB/s | 891 MB/s | learn+dict
CPB L5×2 | 0.4% | 147 MB/s | 1,857 MB/s | cache hit!

Compression Ratio — lower is better (JSON 64 KB)

gzip: 34%
zlib: 38%
LZMA/7z: 22%
zstd: 28%
CPB STANDARD: 0.8%
CPB L5×2: 0.4%

Real-World Example

2,000-entry JSON access log, 401 KB input, round-trip verified ✓

{ "timestamp": "2026-01-01T00:00:00Z", "level": "INFO", "service": "api", "request_id": "req-100000", "user_id": 5506, "path": "/api/v1/orders", "status": 200, "latency_ms": 72, "message": "Request processed successfully" } // × 2,000 entries
Method | Output | Ratio | Notes
gzip (≈L2) | 65 KB | 15.9% | baseline
CPB STANDARD | 74 KB | 18.2% | full pipeline, no dict
CPB + L3 dict | 65 KB | 15.9% | phrase dict on same data
CPB LEARN pass 1 | 65 KB | 15.9% | first time, cache built
CPB LEARN pass 2 | 270 B | 0.07% | cache hit
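
The jump from pass 1 to pass 2 is what an exact-match cache produces: the first encode stores the data, and a later encode of identical bytes only needs a reference. The Python sketch below illustrates that behavior under stated assumptions. It is not CPB's L5 GenDict; the class `ExactMatchCache`, the one-byte `M`/`H` tags, and the use of SHA-256 and zlib are all assumptions made for illustration.

```python
import hashlib
import zlib

class ExactMatchCache:
    """Hypothetical exact-match cache keyed by the SHA-256 of the input."""

    def __init__(self):
        self._store = {}  # digest -> original bytes

    def encode(self, data: bytes) -> bytes:
        digest = hashlib.sha256(data).digest()
        if digest in self._store:
            # Cache hit: emit only a tag plus the 32-byte digest.
            return b"H" + digest
        # Cache miss: remember the data and emit a normally compressed payload.
        self._store[digest] = data
        return b"M" + zlib.compress(data)

    def decode(self, blob: bytes) -> bytes:
        tag, body = blob[:1], blob[1:]
        if tag == b"H":
            return self._store[body]  # requires the same cache on the decoder side
        return zlib.decompress(body)

cache = ExactMatchCache()
log = b'{"level": "INFO", "status": 200}\n' * 2000
pass1 = cache.encode(log)
pass2 = cache.encode(log)
print(f"pass 1: {len(pass1)} B, pass 2: {len(pass2)} B")
```

A reference like this can only be decoded where the cache is available, which matches the point under "When CPB loses" that a cold cache gives no advantage on one-shot compression.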

When CPB wins

Repeated log archival — L5 cache → near-zero size
Structured telemetry — L3 dict kills repeated keys
Sensitive data — RS + L4 shuffle for protection

When CPB loses

Random / encrypted data — no redundancy (~115%)
One-shot compression — L5 cache cold, no edge
Tiny files (<1 KB) — header overhead dominates

Pipeline

Up to 6 layers. Each independently toggleable. 7 preset profiles.

Input Data: any file or folder
↓
L5 GenDict: exact-match cache (0.07% on 2nd pass)
↓
L3 Genre DSL: structure-aware phrase replacement
↓
L2 Compress: LZMS / MSZIP / XPRESS (0.6% ratio)
↓
L1 Protect: Reed-Solomon error correction
↓
L4 Shuffle: 4D–16D multi-dimensional rearrangement
↓
Container: .cpb / .zip / .mp4 / .pdf / .png
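
A pipeline whose layers can be toggled independently has one structural requirement: decoding must undo the enabled layers in reverse order. The Python sketch below shows that shape with two stand-in layers. The layer names, the `LAYERS` table, and the `xor_mask` placeholder are hypothetical and are not CPB's actual layers or API.

```python
import zlib

def xor_mask(data: bytes) -> bytes:
    """Self-inverse placeholder transform standing in for a real layer."""
    return bytes(b ^ 0x5A for b in data)

# Each entry: (name, encode function, decode function), listed in encode order.
LAYERS = [
    ("compress", zlib.compress, zlib.decompress),
    ("mask", xor_mask, xor_mask),
]

def encode(data: bytes, enabled: set) -> bytes:
    """Apply every enabled layer in pipeline order."""
    for name, enc, _ in LAYERS:
        if name in enabled:
            data = enc(data)
    return data

def decode(data: bytes, enabled: set) -> bytes:
    """Undo the enabled layers in reverse pipeline order."""
    for name, _, dec in reversed(LAYERS):
        if name in enabled:
            data = dec(data)
    return data

payload = b"structured telemetry " * 100
profile = {"compress", "mask"}  # a preset profile is just a named set of layers
assert decode(encode(payload, profile), profile) == payload
```

The decoder needs to know which layers were enabled, so a real format would record the profile in its header.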

Detailed Benchmarks

Linux x86-64 / g++ 13 / -O2 / C++17 / single-threaded / best of 5 runs

L2 Algorithm — Encode MB/s

AUTO: 631 MB/s
LZMS: 895 MB/s
MSZIP: 895 MB/s
XPRESS_HUFF: 892 MB/s
XPRESS: 892 MB/s
RLE: 240 MB/s
DELTA: 293 MB/s
NONE: 12,435 MB/s
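
The figures in this section are stated as single-threaded, best of 5 runs. A harness of that shape might look like the Python sketch below. It is not the project's benchmark code: the function `best_throughput_mb_s` is hypothetical, zlib stands in for the codec under test, and 1 MB is assumed to mean 1,000,000 bytes.

```python
import time
import zlib

def best_throughput_mb_s(func, data: bytes, runs: int = 5) -> float:
    """Run func(data) several times and report throughput for the fastest run."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        func(data)
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    best = max(best, 1e-9)  # guard against a zero reading on coarse timers
    return len(data) / best / 1_000_000

payload = b'{"level": "INFO", "status": 200}\n' * 2048
compressed = zlib.compress(payload)
print(f"encode: {best_throughput_mb_s(zlib.compress, payload):.1f} MB/s")
print(f"decode: {best_throughput_mb_s(zlib.decompress, compressed):.1f} MB/s")
```

Taking the best run rather than the mean filters out scheduler noise, but it also reports warm-cache behavior, which matters when comparing against cold-start numbers.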

L1 RS Protection — Encode MB/s

STANDARD: 17.6 MB/s
MAX: 4.5 MB/s
STEGO: 1.1 MB/s
LIGHT: 17.7 MB/s
NONE: 14,282 MB/s

L4 Dimensions — Encode Speed

4D: 24 MB/s
8D: 12.3 MB/s
12D: 8.1 MB/s
16D: 6.1 MB/s
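
One way to picture a multi-dimensional byte rearrangement is to view the buffer as an N-dimensional array and permute its axes. The Python sketch below does exactly that and nothing more. CPB's actual L4 scheme is not specified here, so axis transposition is a swapped-in stand-in, and `shuffle`, `unshuffle`, `dims`, and `perm` are hypothetical names.

```python
from itertools import product
from math import prod

def _strides(dims):
    """Row-major strides: how far one step along each axis moves in the flat buffer."""
    strides = [1] * len(dims)
    for i in range(len(dims) - 2, -1, -1):
        strides[i] = strides[i + 1] * dims[i + 1]
    return strides

def shuffle(data: bytes, dims, perm) -> bytes:
    """View data as a row-major array of shape dims and reorder its axes by perm."""
    assert prod(dims) == len(data), "dims must cover the buffer exactly"
    strides = _strides(dims)
    out_dims = [dims[p] for p in perm]
    out = bytearray()
    for idx in product(*(range(d) for d in out_dims)):
        # idx[j] is the coordinate along original axis perm[j].
        src = sum(idx[j] * strides[perm[j]] for j in range(len(perm)))
        out.append(data[src])
    return bytes(out)

def unshuffle(data: bytes, dims, perm) -> bytes:
    """Invert shuffle() by applying the inverse axis permutation."""
    out_dims = [dims[p] for p in perm]
    inverse = [list(perm).index(i) for i in range(len(perm))]
    return shuffle(data, out_dims, inverse)

block = bytes(range(16))
dims, perm = (2, 2, 2, 2), (3, 2, 1, 0)  # a 4D view with its axes reversed
mixed = shuffle(block, dims, perm)
assert unshuffle(mixed, dims, perm) == block
```

Each extra axis adds another term to the per-byte index computation, which is consistent with encode speed falling as the dimension count rises.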

Pipeline — Compression Ratio (64 KB JSON)

Configuration | Ratio | Note
L2 only | 0.6% | fastest
L3 + L2 | 0.8% |
STANDARD (4D) | 0.8% | default
STANDARD + L3 | 1.2% |
STANDARD + 16D | 0.8% |
STANDARD + L3 + 16D | 1.2% |
ARCHIVE | 114.4% | expands (protection only)

Data Type Sensitivity (STANDARD pipeline)

Data | Ratio
JSON 1 KB | 26.4%
JSON 64 KB | 0.8%
JSON 1 MB | 0.7%
Random 64 KB | 115.2%
Random 1 MB | 114.8%
All-zeros 1 MB | 0.0%
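
The same sensitivity to input type shows up with any general-purpose compressor: repetitive data shrinks dramatically, while random data has no redundancy to remove and grows slightly from framing overhead. The Python sketch below reproduces the pattern with zlib as a stand-in. The numbers it prints are zlib's, not CPB's, and the helper `ratio` is hypothetical.

```python
import json
import os
import zlib

def ratio(data: bytes) -> float:
    """Compressed size as a percentage of the original (lower is better)."""
    return 100.0 * len(zlib.compress(data, 9)) / len(data)

entry = {"timestamp": "2026-01-01T00:00:00Z", "level": "INFO", "status": 200}
repeated_json = ((json.dumps(entry) + "\n").encode() * 1000)[:65536]

samples = {
    "JSON 64 KB (repetitive)": repeated_json,
    "Random 64 KB": os.urandom(65536),
    "All-zeros 64 KB": bytes(65536),
}
for name, data in samples.items():
    print(f"{name}: {ratio(data):.1f}%")
```

Layers that add protection bytes, such as Reed-Solomon parity, push the random-data case further above 100%, since nothing was saved to offset them.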

L5 LEARN Cache + Carrier Overhead

L5 Pass | Ratio | Decode MB/s
pass 1 (cold) | 1.2% | 855
pass 2 (hit) | 0.4% | 1,857
pass 3 | 0.4% | 2,531

Carrier | Overhead
CPB (.cpb) | +0 B
MP4 (.mp4) | +56 B
PNG (.png) | +93 B
ZIP (.zip) | +224 B
PDF (.pdf) | +883 B
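
Carrier overhead is the fixed cost of the wrapper format's own headers around an already-encoded payload. The Python sketch below measures it for a ZIP wrapper built with the standard `zipfile` module, storing the payload uncompressed. It is not how CPB builds its carriers, and the byte count it prints will differ from the table above; `wrap_in_zip`, `unwrap_from_zip`, and the entry name `data.cpb` are hypothetical.

```python
import io
import zipfile

def wrap_in_zip(payload: bytes, name: str = "data.cpb") -> bytes:
    """Store the payload, uncompressed, as a single entry in a ZIP archive."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
        zf.writestr(name, payload)
    return buf.getvalue()

def unwrap_from_zip(carrier: bytes, name: str = "data.cpb") -> bytes:
    """Read the payload back with a standard ZIP reader."""
    with zipfile.ZipFile(io.BytesIO(carrier)) as zf:
        return zf.read(name)

payload = bytes(range(256)) * 4
carrier = wrap_in_zip(payload)
overhead = len(carrier) - len(payload)
print(f"ZIP carrier overhead: +{overhead} B")
assert unwrap_from_zip(carrier) == payload
```

Because the payload is stored rather than recompressed, the overhead stays constant as the payload grows, so it matters mainly for small archives.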

Roadmap

v0.1 — Core Pipeline

L2 compression, L1 Reed-Solomon, L4 shuffle, 5 carrier formats. 654 tests.

v0.2 — Dictionary System

L3 Genre DSL, L5 GenDict cache, dictionary training, FIDX search.

v1.0 — Public Release

Open source, documentation, benchmarks, GitHub Pages.

v1.1 — SDK

Embeddable library API. Client-to-Client data foundation.

v2.0 — Advanced

Hilbert curve, differential backup, multi-frame random access.