NNCP: Lossless Data Compression with Neural Networks

NNCP is an experiment to build a practical lossless data compressor with neural networks. The best performer uses an LSTM model. A model based on self-attention (Transformer) is also evaluated.

The algorithms and results are described in this paper.

NNCP is based on the LibNC library, which allows fast and deterministic evaluation and training of neural networks on x86 CPUs. It is optimized for small batch sizes and low latency. LibNC has no dependencies on other libraries and exposes a C API.

Compression ratio

Result for enwik8:

Program       Compr. size (bytes)   Ratio (bpb)
gzip          36 445 248            2.92
xz            24 865 244            1.99
NNCP          16 924 569            1.35
CMIX (v17)    14 877 373            1.19

Result for enwik9:

Program       Compr. size (bytes)   Ratio (bpb)
gzip          322 591 995           2.58
xz            197 331 816           1.58
NNCP          128 292 351           1.03
CMIX (v17)    116 394 271           0.93
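
The ratio column can be recomputed from the compressed sizes. The sketch below assumes that "bpb" means bits of compressed output per byte of input, and that enwik9 is exactly 10^9 bytes long. Neither point is stated on this page, although both are standard for this benchmark. The function and variable names are illustrative only and are not part of NNCP.

```python
# Assumption: enwik9 is the first 10^9 bytes of an English Wikipedia dump.
ENWIK9_SIZE = 10**9


def bits_per_byte(compressed_bytes: int, original_bytes: int) -> float:
    """Return the number of compressed bits spent per input byte."""
    return 8 * compressed_bytes / original_bytes


# Compressed sizes in bytes, copied from the enwik9 table above.
enwik9_sizes = {
    "gzip": 322_591_995,
    "xz": 197_331_816,
    "NNCP": 128_292_351,
    "CMIX (v17)": 116_394_271,
}

for program, size in enwik9_sizes.items():
    print(f"{program:<12}{bits_per_byte(size, ENWIK9_SIZE):.2f}")
```

Rounded to two decimals, the printed values match the ratio column of the enwik9 table. Lower is better: a ratio of 8.00 bpb would mean no compression at all.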

Download

Linux version: nncp-2019-04-13.tar.gz. LibNC is currently only provided as object code.

Precompiled Windows version: nncp-2019-04-13-win64.zip.

Fabrice Bellard - https://bellard.org/