TachusPI Documentation

TPI - Pi computation program

Copyright (c) 2009-2010 Fabrice Bellard

0) Introduction

This program computes decimal or binary digits of Pi. For really large
number of digits, the hard disk can be used to store temporary data. 

It can be a good stress test for the CPU, the RAM and the hard
disk(s). Do not use this program if your computer has a bad cooling !

This version works for Linux 64 bit. There is no plan to release a 32
bit version because it would be much slower than the 64 bit
version. The executable was tested on a Fedora 10 distribution but
should work on most recent distributions.

A Windows 64 bit version is available too. Currently it does not
support asynchronous and raw I/Os, so its performance is lower than
the Linux version for disk based computations.

The CPU must support the SSE3 instruction set, which is the case for
all recent CPUs as the year of 2010.

TPI is a short name for "Tachus PI". "Tachus" (pronounced "tacus")
means "fast" in ancient Greek.

1) Quick Start

1.1) Memory based computations


 ./tpi -o pi.txt 10M

to compute 10^7 digits of Pi. A single core is used. 

If your CPU has 4 physical cores, type:

./tpi -T 4 -o pi.txt 10M

It should go faster.

TPI needs about 6.4 times the size of the result as RAM. It
corresponds to about 2.7 times the number of digits as bytes of
RAM. Hence for 1G digits, you need at least 2.7 GB of RAM. Having more
RAM than this minimum yields a faster computation.

1.2) Disk based computations


./tpi -T 4 -m 1Gi -d 10G -p /tmp/pi -o /tmp/pi/pi_base10.txt 1G

to compute 10^9 digits of pi using at most 1GiB of RAM and 10GB of
disk space. The temporary files are created in /tmp/pi. The result is
in /tmp/pi/pi_base10.txt. 4 cores are used. Warning: it can be long.

TPI needs at least 6.4 times the size of the result as disk space. It
corresponds to about 2.7 times the number of digits as bytes of
storage (same constraint as in the RAM only case).

The more RAM you have, the faster is the computation.

If the computation is interrupted, you can restart it with the
"restart" command by using the latest checkpoint:

./tpi -T 4 -m 1Gi -d 10G -p /tmp/pi -o /tmp/pi/pi_base10.txt restart /tmp/pi/1234-state45

where "/tmp/pi/1234-state45" is the latest checkpoint written to disk.

2) Usage

usage: tpi [options] [n_digits|command] [params]

Options are:

-T n       

Set number of threads to 'n' (default=1)

-o file    

Set output file (default = none)

-m n

Set max memory size to 'n' bytes. 

-p path

Set disk path (default = no disk use)

-d n       

Set max disk size to 'n' bytes. By default all the disk is used.

-f format  

Set output format to 'format': txt or bin (default = txt). 

The binary format is more compact and much faster than the text
format. It is the native format of the internal arbitrary precision
arithmetic library. Digits stored using the binary format can be
displayed using the tpidump utility. Take care to specify a display
base similar as the one used for the computation. Source code is
included to document the file format.

A limitation of this version is that the binary format can only be used
if a disk path is specified with the -p option. The binary file must
be in the directory specified with -p.

-a formula 

Use 'formula':
chud    Chudnovsky formula (default and fastest)
rama    Ramanujan formula (slower, useful for verifications)


Output binary result (default = base 10 result)


Enable final multiplication and base conversion check. Only useful for
very large computation where hardware errors are likely.


Display memory usage and statistics


Use raw disk I/Os (Linux only). Raw disk I/Os can be faster for huge
computations, but you need a suitable OS and filesystem to support it
(Linux 2.6.x and ext4 are good choices).

Commands are:

calc n_digits        

Pi computation. If no command is specified, this is the default command.

restart file1 [...]  

Restart computation using a checkpoint. Also used to merge several
checkpoints coming from the "split" command.

split n_digits n     

Generate checkpoint files to split among n processes (usually run on
different machines). The "restart" command is used on each machine to
start the computations using the generated checkpoint files. Then the
results are combined with another "restart" on a single machine to
finish the computation. This feature can be useful to distribute the
computation on several PCs. Only the "binary splitting" phase can be
parallelized this way.


./tpi -p /tmp/pi -o /tmp/pi_base10 split 10M 2

./tpi -p /tmp/pi -o /tmp/pi_base10 restart /tmp/pi/x-state1
./tpi -p /tmp/pi -o /tmp/pi_base10 restart /tmp/pi/x-state2

./tpi -p /tmp/pi -o /tmp/pi_base10 restart /tmp/pi/x-stateN1 /tmp/pi/x-stateN2

SI or binary prefixes can be used when entering numbers (e.g. 1k=1000,
1Ki = 1024, 1M = 10^6, 1Mi = 2^20, ....). Scientific notation is
allowed too (e.g. 1000000=1e6=1M).

The maximum number of Pi digits that can be computed with this program
is 2.7*10^12. People wishing to compute more digits should contact the
author !

3) More information

More information on the algorithms is available at:


It should be noted that although this program implements a library
having an API similar to the GNU Multiprecision Library, it has no
common code with it.

4) License

TPI is available free of charge, but its redistribution on any medium
needs written consent of the author.

TPI is available without any express or implied warranty. In no event
will the author be held liable for any damages arising from the use of
this software.

Fabrice Bellard.