Frequently Asked Questions
Summary
I am not especially interested in the digits of Pi, but in the various
algorithms involved in arbitrary-precision arithmetic. Optimizing these
algorithms to get good performance is a difficult programming
challenge.
Arbitrary-precision arithmetic with huge numbers has little practical
use, but some of the algorithms involved are interesting for other
purposes. In particular:
- The Discrete Fourier Transform. This transform is widely used in
many algorithms, and most modern electronic appliances (such as
digital televisions, cell phones and music players) include at least
one instance of it.
- The reliable management of a very large amount of disk storage, at
least for a single computer. Specific methods were developed to
ensure high reliability and high disk I/O bandwidth. The same ideas
can be applied to other fields such as video streaming or database
access.
- The whole computation is an extensive test for a computer, including
its CPU, RAM, disk storage and cooling system. A single bit error
during the computation yields an incorrect result. Inadequate cooling
results in hardware failure.
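To illustrate the first point: the FFT is what makes huge multiplications feasible, because multiplying two big numbers amounts to convolving their digit sequences, which an FFT computes in O(n log n) instead of O(n^2). The following is only a toy floating-point sketch in Python; real bignum libraries use number-theoretic transforms or carefully error-bounded FFTs.

```python
import cmath

# Toy FFT-based big-number multiplication: the digit sequences of the
# two operands are convolved with a complex FFT in O(n log n).  The
# plain float version below is only safe for small inputs.

def fft(a, invert=False):
    """Recursive radix-2 FFT (len(a) must be a power of two)."""
    n = len(a)
    if n == 1:
        return a[:]
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n)
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out

def bigmul(x, y):
    """Multiply two non-negative integers via FFT convolution of digits."""
    xd = [int(c) for c in str(x)[::-1]]   # little-endian decimal digits
    yd = [int(c) for c in str(y)[::-1]]
    n = 1
    while n < len(xd) + len(yd):          # pad to a power of two
        n *= 2
    fx = fft([complex(d) for d in xd] + [0j] * (n - len(xd)))
    fy = fft([complex(d) for d in yd] + [0j] * (n - len(yd)))
    prod = fft([a * b for a, b in zip(fx, fy)], invert=True)
    digits = [round(v.real / n) for v in prod]  # un-normalize and round
    carry, out = 0, []                          # propagate decimal carries
    for d in digits:
        carry += d
        out.append(carry % 10)
        carry //= 10
    while carry:
        out.append(carry % 10)
        carry //= 10
    while len(out) > 1 and out[-1] == 0:        # strip leading zeros
        out.pop()
    return int("".join(map(str, reversed(out))))
```

The rounding step is why production code needs either rigorous float error bounds or an exact number-theoretic transform: once rounding errors reach 0.5, the result is silently wrong.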
What was the previous record?
The previous Pi computation record of
about 2577
billion decimal digits was published by Daisuke Takahashi on
August 17th 2009. The main computation lasted 29 hours and used 640
nodes of a T2K
Open Supercomputer (Appro Xtreme-X3 Server). Each node contains 4
Opteron Quad Core CPUs at 2.3 GHz, giving a peak processing power of
94.2 Tflops (trillion floating point operations per second).
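Both quoted peak figures work out if each core retires 4 floating-point operations per clock cycle; that per-core factor is an assumption here (it depends on the SIMD units of each CPU), but it is consistent with both numbers:

```python
# Sanity check of the quoted peak figures, assuming 4 floating-point
# operations per core per clock cycle (this factor is an assumption).
FLOPS_PER_CYCLE = 4

# 640 T2K nodes x 4 CPUs/node x 4 cores/CPU at 2.3 GHz
super_peak = 640 * 4 * 4 * 2.3e9 * FLOPS_PER_CYCLE
print(super_peak / 1e12)   # -> about 94.2 Tflops

# single quad-core Core i7 at 2.93 GHz
desktop_peak = 4 * 2.93e9 * FLOPS_PER_CYCLE
print(desktop_peak / 1e9)  # -> about 46.9 Gflops
```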
My computation used a single Core i7 Quad Core CPU at 2.93 GHz giving
a peak processing power of 46.9 Gflops. So the supercomputer is about 2000
times faster than my computer. However, my computation lasted 116
days, which is 96 times slower than the supercomputer for about the
same number of digits. So my computation is roughly 20 times
more efficient (2000 / 96 is about 21). This can be explained by the
following facts:
- The Pi computation is I/O bound, so it needs a very high
communication speed between the nodes of a parallel supercomputer. As
a result, the full power of the supercomputer cannot really be used.
- The algorithm I used (Chudnovsky series evaluated using the
binary splitting algorithm) is asymptotically slower than the
Arithmetic-Geometric Mean algorithm used by Daisuke Takahashi, but it
makes a more efficient use of the various CPU caches, so in practice
it can be faster. Moreover, some mathematical tricks were used to
speed up the binary splitting.
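The binary-splitting evaluation of the Chudnovsky series can be sketched in a few lines of Python. This is only a toy illustration of the recursion (the record computation needed far more machinery: FFT-based multiplication, out-of-core storage, checkpointing, and the mathematical tricks mentioned above, which are not shown here):

```python
from math import isqrt  # integer square root (Python >= 3.8)

# Toy binary-splitting evaluation of the Chudnovsky series, which
# gains roughly 14.18 decimal digits of pi per term.

def bs(a, b):
    """Return partial products (P, Q) and sum numerator T for terms [a, b)."""
    if b - a == 1:
        if a == 0:
            P = Q = 1
        else:
            P = (6*a - 5) * (2*a - 1) * (6*a - 1)
            Q = a * a * a * 10939058860032000   # 640320**3 // 24
        T = P * (13591409 + 545140134 * a)
        if a & 1:
            T = -T                              # alternating sign
        return P, Q, T
    m = (a + b) // 2                            # split and recombine
    P1, Q1, T1 = bs(a, m)
    P2, Q2, T2 = bs(m, b)
    return P1 * P2, Q1 * Q2, T1 * Q2 + P1 * T2

def chudnovsky_pi(digits):
    """Return pi scaled by 10**(digits + 10) as an integer."""
    prec = digits + 10                  # guard digits
    terms = digits // 14 + 2            # ~14.18 digits per term
    _, Q, T = bs(0, terms)
    one = 10**prec
    sqrt_c = isqrt(10005 * one * one)   # floor(sqrt(10005) * 10**prec)
    return (426880 * sqrt_c * Q) // T
```

The key property exploited here is that binary splitting keeps the operands of each multiplication balanced in size, so nearly all the work happens in a few huge multiplications, exactly where FFT-based arithmetic is fastest.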
Is the result correct?
Yes, the result matches the decimal digits of Pi published by Daisuke
Takahashi and the hexadecimal and decimal digits published by Kanada.
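A classical way to spot-check hexadecimal digits at an arbitrary position, without recomputing all the earlier ones, is the Bailey-Borwein-Plouffe (BBP) formula. This is a standard technique, not necessarily the exact check used for this record; the plain-float sketch below is only reliable for a handful of digits at moderate positions, and serious checks use higher precision.

```python
# Digit extraction with the BBP formula:
# pi = sum_k 16**-k * (4/(8k+1) - 2/(8k+4) - 1/(8k+5) - 1/(8k+6)).

def bbp_hex_digits(n, count=6):
    """Hex digits of pi starting at position n after the hexadecimal point."""
    def series(j):
        # fractional part of sum_k 16**(n-k) / (8k + j)
        s = 0.0
        for k in range(n + 1):
            # modular exponentiation keeps only the fractional contribution
            s = (s + pow(16, n - k, 8 * k + j) / (8 * k + j)) % 1.0
        k = n + 1
        while True:                      # rapidly shrinking tail, k > n
            term = 16.0 ** (n - k) / (8 * k + j)
            if term < 1e-17:
                return s % 1.0
            s += term
            k += 1

    x = (4 * series(1) - 2 * series(4) - series(5) - series(6)) % 1.0
    digits = ""
    for _ in range(count):               # peel off hex digits one by one
        x *= 16
        d = int(x)
        digits += "0123456789abcdef"[d]
        x -= d
    return digits
```

For example, bbp_hex_digits(0) gives "243f6a", the first hexadecimal digits of Pi after the point (Pi = 3.243F6A88... in hexadecimal).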
Can I download the digits?
The files have a size of more than 1 TB, so it would take a while to
download them, even with a fast Internet connection (e.g. about 10 days
at a 10 Mb/s download speed). Instead, I may add a web page where the
user can request the digits at a specific position. Extracts of the
decimal and hexadecimal digits are available here.
Do you plan to compute more digits?
It will depend on my motivation and on the availability of new
hardware with larger and faster storage. Interested sponsors can
contact me.
Is your program available?
Yes, it is available here for Linux and Windows
(64 bit only).
What is your next project?
I have not decided yet.
Why didn't you use a GPU?
The Pi computation algorithm I used is I/O bound, so the extra
processing power of a GPU would not help much. What really matters
is the speed of the hard disks. The question would be different if the
computation could be done entirely in RAM, but that is currently too
expensive, at least for me!
Which books do you suggest to learn about the algorithms involved?
I suggest:
- Modern Computer Arithmetic by Richard Brent and Paul
Zimmermann, version 0.4, November 2009. Full text available here.
- The Art of Computer Programming, volume 2: Seminumerical
Algorithms by Donald E. Knuth, Addison-Wesley, third edition,
1998. More information here.
Who are you?
I am a software engineer currently working in the field of Digital
Television. I live in Paris, France.
Back to Home