NNCP: Lossless Data Compression with Neural Networks

NNCP is an experiment to build a practical lossless data compressor with neural networks. The latest version uses a Transformer model.

The papers nncp_v2.1.pdf and nncp.pdf describe the algorithms and results of previous releases of NNCP.

The current release of NNCP is implemented in C and uses LibNC to get better performance than PyTorch.

Compression ratio

Result for enwik8:

Result for enwik9:

Program	Compressed size (bytes)	Ratio (bpb)	Program size (zip, bytes)	Total (bytes)
gzip	322 591 995	2.58	38 801	322 630 796
xz	197 331 816	1.58	36 752	197 368 568
CMIX (v19)	111 470 932	0.892	223 485	111 694 417
NNCP (2023-10-21)	106 632 363	0.853	628 955	107 261 318

* The results for the other programs are from the Large Text Compression Benchmark.
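
The columns of the table can be cross-checked with a little arithmetic. The sketch below assumes that "bpb" means bits of compressed output per byte of input, that enwik9 is 10^9 bytes long, and that the "Total" column is the compressed size plus the zipped program size. These readings agree with the figures shown. The function names are illustrative and are not part of NNCP.

```python
# Assumption: enwik9 is the first 10^9 bytes of an English Wikipedia dump.
ENWIK9_SIZE = 10**9


def bits_per_byte(compressed_size, original_size=ENWIK9_SIZE):
    """Compressed bits per input byte (assumed meaning of the 'bpb' column)."""
    return 8 * compressed_size / original_size


def total_size(compressed_size, program_size):
    """Compressed size plus zipped program size (assumed meaning of 'Total')."""
    return compressed_size + program_size


# (compressed size, zipped program size) in bytes, copied from the table above.
rows = {
    "gzip": (322_591_995, 38_801),
    "xz": (197_331_816, 36_752),
    "CMIX (v19)": (111_470_932, 223_485),
    "NNCP (2023-10-21)": (106_632_363, 628_955),
}

if __name__ == "__main__":
    for name, (compressed, program) in rows.items():
        ratio = bits_per_byte(compressed)
        total = total_size(compressed, program)
        print(f"{name}: {ratio:.3f} bpb, total {total} bytes")
```

Counting the program size in the total matters because a neural compressor could otherwise hide information about the data in its own code or weights.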

NNCP v3.3: Linux version (including CUDA support): nncp-2024-06-05.tar.gz (Changelog, readme.txt).
NNCP v3.3: Precompiled Windows version (including CUDA support): nncp-2024-06-05-win64.zip.
NNCP v2 (Python+PyTorch, GPU required): nncp_v2-2021-02-06-1.tar.gz