NNCP is an experiment to build a practical lossless data compressor
with neural networks. The latest version uses a Transformer
model.

The papers nncp_v2.1.pdf
and nncp.pdf describe the algorithms and
results of previous releases of NNCP.

The current release of NNCP is implemented in C and
uses LibNC to get better performance than
PyTorch.

Compression ratio

Result for enwik8:

Program              Compr. size (bytes)   Ratio (bpb)
gzip                         36 445 248          2.92
xz                           24 865 244          1.99
NNCP (2021-06-01)            14 969 569          1.20
CMIX (v18)                   14 838 332          1.19
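
The ratio column is expressed in bits per byte (bpb): eight times the compressed size divided by the original size. The following is a minimal sketch that recomputes the ratios from the sizes listed for enwik8. It assumes the standard enwik8 size of 10^8 bytes; the function and variable names are illustrative and are not part of NNCP.

```python
# Assumption: enwik8 is 10**8 bytes long (the standard size of that test file).
ENWIK8_SIZE = 10**8

def bits_per_byte(compressed_bytes: int, original_bytes: int) -> float:
    """Return the compression ratio in bits per input byte."""
    return 8 * compressed_bytes / original_bytes

# Compressed sizes in bytes, copied from the enwik8 results.
ENWIK8_RESULTS = {
    "gzip": 36_445_248,
    "xz": 24_865_244,
    "NNCP (2021-06-01)": 14_969_569,
    "CMIX (v18)": 14_838_332,
}

if __name__ == "__main__":
    for program, size in ENWIK8_RESULTS.items():
        print(f"{program:<20} {bits_per_byte(size, ENWIK8_SIZE):.2f} bpb")
```

The same function applies to enwik9 by passing 10**9 as the original size.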



Result for enwik9:

Program              Compr. size (bytes)   Ratio (bpb)   Program size (zip, bytes)   Total (bytes)
gzip                        322 591 995          2.58                      38 801     322 630 796
xz                          197 331 816          1.58                      36 752     197 368 568
CMIX (v18)                  115 714 367         0.926                     208 961     115 923 328
NNCP (2021-06-01)           108 378 032         0.867                     201 620     108 579 652

* The results for the other programs are from the Large Text Compression Benchmark.
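
In the enwik9 results, the total is the compressed size plus the size of the zipped program, so a large decompressor counts against a compressor. The sketch below recomputes each total from the listed figures and orders the programs by it. The helper names are hypothetical and only serve the illustration.

```python
# Rows copied from the enwik9 results:
# (program, compressed bytes, zipped program bytes, published total bytes)
ENWIK9_ROWS = [
    ("gzip", 322_591_995, 38_801, 322_630_796),
    ("xz", 197_331_816, 36_752, 197_368_568),
    ("CMIX (v18)", 115_714_367, 208_961, 115_923_328),
    ("NNCP (2021-06-01)", 108_378_032, 201_620, 108_579_652),
]

def total_size(compressed_bytes: int, program_bytes: int) -> int:
    """Total size: compressed data plus the zipped program."""
    return compressed_bytes + program_bytes

def rank_by_total(rows):
    """Return the rows sorted from smallest to largest total size."""
    return sorted(rows, key=lambda row: total_size(row[1], row[2]))

if __name__ == "__main__":
    for program, compressed, zipped, published in rank_by_total(ENWIK9_ROWS):
        total = total_size(compressed, zipped)
        status = "matches" if total == published else "differs from"
        print(f"{program:<20} {total:>12,d} bytes ({status} the published total)")
```

Note that the program size is small relative to the compressed data for every entry, so the ordering by total matches the ordering by compressed size here.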

Download

Related Links


Fabrice Bellard – https://bellard.org/
