NNCP: Lossless Data Compression with Neural Networks

NNCP is an experiment to build a practical lossless data compressor with neural networks. The latest version uses a Transformer model.

The papers nncp_v2.1.pdf and nncp.pdf describe the algorithms and results of previous releases of NNCP.

The current release of NNCP is implemented in C and uses LibNC to get better performance than PyTorch.

Compression ratio

Result for enwik8:

Program              Compr. size (bytes)   Ratio (bpb)
gzip                          36 445 248          2.92
xz                            24 865 244          1.99
NNCP (2021-02-06)             15 020 691          1.20
CMIX (v18)                    14 838 332          1.19
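
The bpb (bits per byte) figures can be reproduced from the compressed sizes. The following is a minimal sketch, assuming enwik8 is exactly 100 000 000 bytes long; the function and variable names are illustrative and are not part of NNCP.

```python
# Sketch: bits per byte (bpb) = 8 * compressed bytes / original bytes.
# Assumption: enwik8 is exactly 10^8 bytes long.
ENWIK8_SIZE = 100_000_000


def bits_per_byte(compressed_bytes, original_bytes):
    """Return the compression ratio in bits per input byte."""
    return 8 * compressed_bytes / original_bytes


# Compressed sizes (bytes) taken from the enwik8 table.
enwik8_results = {
    "gzip": 36_445_248,
    "xz": 24_865_244,
    "NNCP (2021-02-06)": 15_020_691,
    "CMIX (v18)": 14_838_332,
}

for program, size in enwik8_results.items():
    print(f"{program:<20}{bits_per_byte(size, ENWIK8_SIZE):.2f}")
```

A lower bpb means better compression; 8.00 bpb would correspond to no compression at all.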

Result for enwik9:

Program              Compr. size (bytes)   Ratio (bpb)   Program size (zip, bytes)   Total (bytes)
gzip                         322 591 995          2.58                      38 801     322 630 796
xz                           197 331 816          1.58                      36 752     197 368 568
CMIX (v18)                   115 714 367         0.926                     208 961     115 923 328
NNCP (2021-04-24)            110 034 293         0.880                     197 491     110 231 784

* The results for the other programs are from the Large Text Compression Benchmark.
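
In the enwik9 results, each Total value equals the compressed size plus the zipped program size. The following is a minimal sketch that recomputes the totals and orders the programs by them; the helper name and the tuple layout are illustrative only.

```python
# Sketch: total = compressed size + zipped program size, both in bytes.
def total_size(compressed_bytes, program_zip_bytes):
    """Return the combined size of the compressed file and the program."""
    return compressed_bytes + program_zip_bytes


# (program, compressed size, zipped program size) from the enwik9 results.
enwik9_results = [
    ("gzip", 322_591_995, 38_801),
    ("xz", 197_331_816, 36_752),
    ("CMIX (v18)", 115_714_367, 208_961),
    ("NNCP (2021-04-24)", 110_034_293, 197_491),
]

# Order the programs from smallest to largest total.
ranked = sorted(enwik9_results, key=lambda entry: total_size(entry[1], entry[2]))

for program, compressed, program_zip in ranked:
    print(f"{program:<20}{total_size(compressed, program_zip):>12,}")
```

Counting the program size keeps a compressor from gaining an advantage by embedding parts of the test data in its own executable.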

Fabrice Bellard - https://bellard.org/