a-periodic
randomness of pi is confirmed by the modern DIEHARD en ENT test suites. Also by
7 classical tests programmed in PASCAL from Knuth The Art Of Computer Programming.
It passes 36 out of 38 of these tests. 228 Kb random bits from a quantum
mechanical source (HotBits) passes 16 out of 21 of these tests. It is a pitty
that the amount of random bits does not allow to do the DIEHARD test suite. The
conclusion is that the randomness of pi and HotBits are comparable.
SATOCONOR.COM
J.G. van der Galiën ‘Testing the A-periodic Randomness
of PI’ 3.2. (2004)
Communications to the editor
SATOCONOR.COM Journal of RANDOMICS
By
Johan G. van der Galiën
For comments: johan.van.der.galien@satoconor.com
Version 1.1. January 9, 2005
22 Mb of the first hexadecimal digits of π as ASCI files were
produced (one digit per byte) with the three methods APTEST has in store
(Chudnovsky’s bin split, Gauss-Legendre, Borwein’s quartic). APTEST is very
convenient for this purposes because it can calculate the digits in base 2 –
36. These *.TXT files were transformed to *.DAT files with a self-written
PASCAL program (2 hexadecimal digits stored in a byte). The size of these *.DAT
files (11 Mb) made it possible to do the DIEHARD
Also the ENT test suite was done on all three files. Four tests
developed by my self were applied to complete the test series. From the results
one can only deduce that the digits of π are indeed truly random. (Passes
36 out of 38 tests) I convinced now that the digits of π belongs to the
best sources of randomness deterministic processes can offer. This randomness
is equally good as from a Quantum Mechanical source (HotBits). (Passes 16 out
of 21 tests.)
The fact that a true random source will not always pass a
probability-based test is discussed. To put it stronger occasionally failure is
a requirement for true randomness of a source.
But how is it possible that with deterministic methods (π is a
Pseudo-RNG) one can simulate Quantum Randomness (True-RNG's, which in all its
forms can also be considered as a-periodic)? To such an extent that both
score more or less equally good at random tests. In my eyes this constitutes a
paradox and it can only mean that our view of Heisenberg's Uncertainty
Principle has to be reviewed.
π is an irrational, non-algebraic and
transcendental number.1a To my knowledge there are only 21 numbers
known in this class including sin(1), ln(2), π and e. A property that all
the transcendental and irrational numbers have is that the digit sequence of
the fraction never repeats (a-periodic). This in contrast with the rational
numbers which with long divisions either ends in all zeroes or an infinite repeating
sequence (period). It is an interesting question, but beyond the scope of this
article, if the randomness arises from the non-algebraic property or that the
trivial irrational numbers are also an excellent source of randomness. May be
in a subsequent article I will go in to this question.
I found very little information
about randomness tests and the digits of π on the internet. Only two
citations from Bailey (1988) and Chudnovsky and Chudnovsky (1991),1b
which say as much as that in that era only all classical randomness tests
like frequency, chi square, poker, arctan law, ... etc were done
by Chudnovsky and Chudnovsky,
on the billion plus range, and that π passes them all with flying colours!
In the research presented in this paper the number of tests applied on π
is expanded with the 17 tests of DIEHARD, 12 of ENT and 4 tests developed by my
self. (All of these tests can be called modern this in contrast of the
classical tests done by Chudnovsky and Chudnovsky.) Additionaly
7 classical randomness tests were done to complete the test series.
|
|
Chudnovsky |
Gauss-Legendre |
Borwein |
|
Size
*.TXT file byte |
23,068,670
(+2) |
23,068,663
(+2) |
23,068,656
(+2) |
|
Size
*.DAT file byte |
11,534,335 |
11,534,331 |
11,534,328 |
Table 1: The sizes of the final *.DAT file and of the
intermediate *.TXT files all made by C:\>APTEST 11534336 x 16 > *.TXT (x
= 0: Chudnovsky, x = 1: Gauss-Legendre or x = 2: Borwein)
First a C-implementation (PIQP.C) of the remarkable
Bailey-Borwein-Plouffe formula was tried if it could produce the needed 11 Mb
of hexadecimal digits.2 But an analysis of the execution times of
producing 500 and 50,000 bytes revealed that the algorithm is O(n2).
And that the 11 Mb would require 39 years on a 700 MHz AMD Athlon.
For the production of the hexadecimal digits I
finally used APTEST.EXE, which can be downloaded from the internet.3
It is number
APTEST produced the 22 Mb file with writing to the
standard out in 3283.060 seconds (700 MHz AMD). This comes down to 156.9 Kbits
per second (4.0 GHz AMD) when you use 1 hexadecimal digit per byte = 4 bits.
This is 17 times slower than FRAGPAS1.EXE.12 The maximum APTEST can
produce is 226 million digits. This comes down to about 113 Mb random bits.
FRAGPAS1.EXE already can produce 66 Gb and hopefully in the future this will be
near infinite!
The three files made by APTEST, according to the
different methods, are not all equal in size as you can see from Table 1. I
remind you that I had to remove the preceding 3. from the 3.243……….. sequence
in the *.TXT files because of the dot. Also some data about the running of the
program at the end of the *.TXT had to be removed.
The results of the DIEHARD battery are given in Table
2. DIEHARD is developed by Professor Dr. George Marsaglia from research at the
Department of Statistics and Supercomputer Computations Research Institute of
the
|
229
p-values |
Chudnovsky |
Gauss-Legendre |
Borwein |
|
KS-test
p-value |
0.221918 |
0.221918 |
0.221918 |
|
Number
of p = 0 |
0 |
0 |
0 |
|
Number
of p = 1 |
0 |
0 |
0 |
Table 2: A
very brief summary from the plethora of data the DIEHARD battery produces.
The results from RANTESTS are given in Table 3.
RANTESTS is programmed by my self according to the algorithms found in the
excellent book series of Knuth.6
|
|
Chudnovsky |
Gauss-Legendre |
Borwein |
|
|
|
||
|
Entropy
bits/bit |
0.99999999728 |
0.99999999727 |
0.99999999729 |
|
|
|
||
|
Chi-bits |
0.348 |
0.350 |
0.347 |
|
5%
1 degree |
0.00 |
||
|
95%
1 degree |
3.84 |
||
|
|
|
||
|
Chi-hexadecimal |
6.30 |
6.30 |
6.30 |
|
5%
15 degrees |
7.26 |
||
|
95%
15 degrees |
25.0 |
||
|
|
|
||
|
KS-analysis |
|
||
|
KnPlusMax |
0.3247 |
0.3245 |
0.3238 |
|
KnMinusMax |
0.3249 |
0.3247 |
0.3240 |
|
Kn
at 5% |
0.1601 |
||
|
Kn
at 95% |
1.2238 |
||
|
|
|
||
|
Chi-serial |
229 |
229 |
229 |
|
5%
255 degrees |
219 |
||
|
95%
255 degrees |
293 |
||
|
|
|
||
|
Chi-differential |
19.1 |
19.1 |
19.1 |
|
5%
30 degrees |
18.5 |
||
|
95%
30 degrees |
43.8 |
||
|
|
|
||
|
Chi-gap |
41.8 |
41.8 |
41.8 |
|
5%
50 degrees |
34.8 |
||
|
95%
50 degrees |
67.5 |
||
Table 3: De
results from RANTESTS.
The results from the ENT test battery are given in
Table 4. ENT can be downloaded from the Fourmilab in
|
|
Chudnovsky |
Gauss-Legendre |
Borwein |
|
Entropy bits/bit |
1.000000 |
1.000000 |
1.000000 |
|
Entropy bits/byte |
7.999986 |
7.999986 |
7.999986 |
|
Compression |
0% |
0% |
0% |
|
Chi-bits 1 degree |
0.35 |
0.35 |
0.35 |
|
Probability |
50% |
50% |
50% |
|
Chi-bytes 255 degrees |
229.01 |
229.02 |
229.04 |
|
Probability |
75% |
75% |
75% |
|
Mean value bits |
0.5000 |
0.5000 |
0.5000 |
|
Mean value bytes |
127.5041 |
127.5041 |
127.5041 |
|
|
3.140573526 |
3.140575160 |
3.140575160 |
|
Error π |
0.03% |
0.03% |
0.03% |
|
Serial Correlation bits |
-0.000052 |
-0.000052 |
-0.000052 |
|
Serial Correlation bytes |
-0.000213 |
-0.000213 |
-0.000213 |
Table 4: The
results from the ENT test suite. Because ENT can run in bit and byte mode some of
the tests are double.
I also developed four randomness tests my self and
the methods I used are described on my website.8 This test is called
RABENZI1 and uses Benford fit and Zipf correlation of the First Digit
Distribution from random bits turned in to 64 bit Double and 48 bit Real48
floating point numbers. I applied these tests to only one of the files from
APTEST (CHUDNOVSKY.DAT), because it is now more or less clear, from the
previous results, that the three binary files from APTEST results are equal.
Results are given in Fig. 1.
----------------------------
Benford and Zipf randomness tests
with Double and Real48 numbers for CHUDNOVSKY.DAT
-------------DOUBLE---------------
Total amount 64 bit chunks read =
1441791
Total amount of Double numbers between
-10^308 and 10^308 = 1440430
Total amount of Denormals = 741
----------------------------
Test for the fit with the Law of
Benford (Real48)
CHI-Benford Double = 1.12454763731785E+0001
CHI 8 degrees of freedom 5% = 2.73
CHI 8 degrees of freedom 95% =
15.51
----------------------------
Test for correlation with Zipf
(Double)
Slope = -8.63920658569172E-0001
Intersection =
-5.03657601092755E-0001
Correlation coefficient =
-9.99225425068603E-0001
-------------REAL48---------------
Total amount of 48 bit chunks read
= 1922389
Total amount of Real48 numbers
between -10^38 and 10^38 = 1908766
Total amount of Denormals and
Zeroes = 7440
----------------------------
Test for the fit with the Law of
Benford (Real48)
CHI-Benford Real48 = 1.09145860158364E+0002
CHI 8 degrees of freedom 5% = 2.73
CHI 8 degrees of freedom 95% =
15.51
----------------------------
Test for correlation with Zipf
(Real48)
Slope = -8.55375515760612E-0001
Intersection =
-5.07213284742048E-0001
Correlation coefficient =
-9.99175338471230E-0001
----------------------------
Fig. 1: The
RABENZI1 tests applied on CHUDNOVSKY.DAT
In order to determine how the three files of APTEST
differ I used the DOS FC /B command (File Compare), results are given in Fig.
2.
Busy
comparing the files CHUDNOVSKY.DAT and GAUSS-LEGENDRE.DAT
00AFFFF9:
2D 1D
00AFFFFA:
91 FF
FC:
CHUDNOVSKY.DAT longer than GAUSS-LEGENDRE.DAT
Busy
comparing the files CHUDNOVSKY.DAT and BORWEIN.DAT
00AFFFF5:
E4 F4
00AFFFF6:
EE 10
00AFFFF7:
99 11
FC:
CHUDNOVSKY.DAT longer than BORWEIN.DAT
Busy
comparing the files GAUSS-LEGENDRE.DAT and BORWEIN.DAT
00AFFFF5:
E4 F4
00AFFFF6:
EE 10
00AFFFF7:
99 11
FC:
GAUSS-LEGENDRE.DAT longer than BORWEIN.DAT
Fig. 2: The
final *.DAT files compared with the DOS FC /B command.
From Table 1 can be concluded that APTEST does not
produce exactly the amount of bytes specified on the DOS-prompt. But I must say
here that I removed the preceding 3., may be the dot is counted as a digit to
by APTEST. Than I can add 2 bytes to the sizes of the *.TXT files. For
Chudnovsky the figure is than exactly the amount of specified bytes.
Gauss-Legendre is than still to small and stays also odd. Borwein is also
smaller. May be it would have been better to cut and resize the *.TXT files so
that they all have an even number of bytes by either removing only the dot or
removing the dot and the last digit. It also seems to be that the *.TXT files
contains a line feed and carriage return (invisible) characters at the end of
the hexadecimal digits making the printing of the data from the running of the
program possible at (a new line) the bottom lines. The last two bytes will then
be the same after conversion as the third last byte with really hexadecimal
digits of π. I am not certain of this since the line feed and carriage
return are invisible characters. The need for removing the dot comes from the
fact that TEXTDATCONV.EXE makes an extra 3 out of it. And if you take a look at
the source code of this conversion program you will see that with an odd amount
of bytes in the *.TXT file one byte is lost. This is also in agreement with the
data from Table 1 (Gauss-Legendre).
From the DIEHARD battery (Table 2) is clear that
according to these tests the files are equal (in other words the hexadecimal
digits are the same) despite the small difference in file size. This new
version of DIEHARD (0.2) also implements some difficult-to-pass tests (GCD,
Gorilla and Birthday Spacings)
The first 11 Mb hexadecimal digits of π only do
not pass the Chi-hexadecimal test of RANTESTS suite as you can see from Table
3. The Chi-hexadecimal of 6.30, which is found for all three *.DAT files, has a
probability of 2.58%. That is real close to the 5% criterion. 2.58% actually
means that a truly random source would produce such a Chi only once in 39
experiments. This also means that a truly random source will not always pass a
Chi-based test. In other words if a source with entropy near 1 would always
pass all Chi-based tests than there would be something odd and really
interesting going on. But such a source CANNOT be considered random! The small
differences in Entropy, Chi-bits and KS-analysis can be explained by the
different sizes of the *.DAT files. To summarise this 6 of the 7 tests from the
RANTESTS suite are passed. And I believe that if you make larger files or take
for instance the 11 Mb – 22 Mb digits of π the picture of tests passed and
failed could be entirely different. May be the Chi-hexadecimal test is then
passed while others are failed.
------------------------------
Size of file hb228kb.dat = 233472 byte
------------------------------
CHI-Bits = 2.33418996710679E+0000
CHI 1 degree of freedom 5% = 0.00
CHI 1 degree of freedom 95% = 3.84
Entropy = 9.99999098519766E-0001 bits per bit
------------------------------
Number of 4 bit chunks read = 466944
CHI-Hexadecimal= 3.10106907894660E+0001
CHI 15 degrees of freedom 5% = 7.261
CHI 15 degrees of freedom 95% = 25.00
------------------------------
KS-analysis
KnPlusMax = 1.68292763018508E+0000
KnMinusMax = 1.68146421485471E+0000
Kn/Probability Distribution at 1% = 7.06466578772051E-0002
Kn/Probability Distribution at 5% = 1.59903945165297E-0001
Kn/Probability Distribution at 25% = 3.79022047249691E-0001
Kn/Probability Distribution at 50% = 5.88463250286622E-0001
Kn/Probability Distribution at 75% = 8.32312850187009E-0001
Kn/Probability Distribution at 95% = 1.22363165436946E+0000
Kn/Probability Distribution at 99% = 1.51718536841508E+0000
------------------------------
CHI-Serial = 2.70743421051186E+0002
CHI 255 degrees of freedom 5% = 219.0
CHI 255 degrees of freedom 95% = 293.3
----------------------------
CHI-Delta = 2.68134236818878E+0001
CHI 30 degrees of freedom 5% = 18.49
CHI 30 degrees of freedom 95% = 43.77
------------------------------
CHI-Gap = 6.32252588947886E+0001
CHI 50 degrees of freedom 5% = 34.8
CHI 50 degrees of freedom 95% = 67.5
------------------------------
Fig. 3: RANTESTS results on a 228 Kb
Hotbits file.
From Fig. 3 can be concluded that the “Holy Grail” of
randomness: Quantum Mechanics (in this case HotBits from β-decay of 85Krypton),
also does not always pass probability-based tests like Chi-hexadecimal and
Kolmogorof-Smirnof (KS) Analysis. Only 5 out 7 tests from the RANTESTS suite
are passed. Lets also compare the ENT suite results on HotBits, results given
in Fig. 4.
Entropy = 0.999999 bits per bit.
Optimum compression would reduce
the size
of this 1867776 bit file by 0
percent.
Chi square distribution for 1867776
samples is 2.33, and randomly
would exceed this value 25.00
percent of the times.
Arithmetic mean value of data bits
is 0.4994 (0.5 = random).
Serial correlation coefficient is
-0.000128 (totally uncorrelated = 0.0).
Entropy = 7.999164 bits per byte.
Optimum compression would reduce
the size
of this 233472 byte file by 0
percent.
Chi square distribution for 233472
samples is 270.74, and randomly
would exceed this value 25.00
percent of the times.
Arithmetic mean value of data bytes
is 127.2071 (127.5 = random).
Serial correlation coefficient is
0.000039 (totally uncorrelated = 0.0).
Fig. 4: ENT
in bit and byte mode applied to a 228 Kb HotBits file.
From Fig. 4 can be concluded that HotBits scores less
on Mean value bits and bytes, on the
From Table 4 you can see that ENT confirms the
Entropy in bits/bit from RANTESTS. This differs only in rounding and number of
significant digits. The Entropy in bits/byte is also not bad at all. The only
criteria for truly random Entropy are to my knowledge around 1 bits/bit or 8
bits/byte. But of course Entropy is related to Chi-bits or Chi-bytes thus it
should be possible to calculate the 5% and 95% criteria for the Entropy. But
you already have Chi-bits and Chi-bytes so why bother? This brings to a little
problem not all tests of ENT have strict criteria for randomness. Of course
compression should be always 0%. But for instance the Mean value bits should be
0.5 and Mean value bytes should be 127.5. But where are the upper and lower
limits? This also true for the error in the Monte Carlo calculations of π,
and the Serial Correlation Coefficient (Should these be exactly 0 for an
infinite true random file?). Here again, no upper and lower limits! But none
the less: common sense and experience with ENT on other truly random sources
tells you that all *.DAT files pass the ENT test suite.
From Fig. 1 one can see that the Chi-based Benford
fit test for Real48 is not passed. The Chi-Benford is 109.1 while the criteria
of 5% and 95% probability are 2.73 and 15.51 respectively. But as I said for
Chi-hexadecimal this can be a matter of the fluctuations in probability. The
Zipf correlation tests are passed if one knows that the (preliminary, not fully
researched and published) criterion for the Correlation Coefficient is <
-0.999. I have no criteria for the Intersection and Slope yet. But I do know
that for an infinite random file the Slope should be around –0.8636655870, the
Intersection around -0.5037512926 and the Correlation Coefficient around
–0.9992296195. Because these are the results on the linear regression of
Benford’s formula against Zipf’s Law.10
----------------------------
Benford and Zipf randomness tests
with Double and Real48 numbers for hb228Kb.dat
-------------DOUBLE---------------
Total amount 64 bit chunks read =
29184
Total amount of Double numbers
between -10^308 and 10^308 = 29160
Total amount of Denormals = 15
----------------------------
Test for the fit with the Law of
Benford (Real48)
CHI-Benford Double = 1.22455742347036E+0001
CHI 8 degrees of freedom 5% = 2.73
CHI 8 degrees of freedom 95% =
15.51
----------------------------
Test for correlation with Zipf
(Double)
Slope = -8.66572987353122E-0001
Intersection =
-5.02362168514781E-0001
Correlation coefficient =
-9.98055735736724E-0001
-------------REAL48---------------
Total amount of 48 bit chunks read
= 38912
Total amount of Real48 numbers
between -10^38 and 10^38 = 38607
Total amount of Denormals and
Zeroes = 179
----------------------------
Test for the fit with the Law of
Benford (Real48)
CHI-Benford Real48 = 2.38680796493998E+0001
CHI 8 degrees of freedom 5% = 2.73
CHI 8 degrees of freedom 95% =
15.51
----------------------------
Test for correlation with Zipf
(Real48)
Slope = -8.50949947955034E-0001
Intersection =
-5.09592282222058E-0001
Correlation coefficient =
-9.98270353715081E-0001
----------------------------
Fig. 5: The
RABENZI1 tests applied to the 228 Kb HotBits file.
From Fig. 5 you can see that HotBits scores worse
than CHUDNOVSKY.DAT with RABENZI1 tests. Only the Chi-Benford Double is passed!
What we learn from this is that not only deterministic methods do not pass
certain tests (from fluctuations caused by probability?)
From Fig. 2 it is clear that the differences of the
*.DAT files is in the tail. With CHUDNOVSKY.DAT and GAUSS-LEGENDRE.DAT the
differences are in the 11,534,329th and 11,534,330th byte
position. As an indication the Chudnovsky, Gauss-Legendre and Borwein *.DAT
files are 11,534,335, 11,534,331 and 11,534,328 bytes long. It seems to be that
the byte counter of FC starts with a zero for the first byte. So then the last
4 digits of GAUSS-LEGENDRE.DAT are certainly wrong. CHUDNOVSKY.DAT and BORWEIN.DAT
differ from the 11,534,325th to the 11,534,327th byte
position. (Last 6 digits certainly wrong of BORWEIN.DAT.) GAUSS-LEGENDRE.DAT
and BORWEIN.DAT also differ from the 11,534,325th to the 11,534,327th
byte position. Anyone who has experience with π calculating programs knows
that the last few digits are always not significant (uncertain) because of
rounding errors. APTEST does not score badly in this: only 6, at the most, are
different at the end of about 23,068,670 hexadecimal digits.
APTEST produces genuine hexadecimal digits of π,
only the last 4 – 6 digits are wrong, for all three methods, because the
randomness is as expected for an irrational and transcendent number. These
digits of π pass 36 of 38 randomness tests. HotBits from Quantum Mechanics
scores 16 out of 21. The results are highly sensitive to probability variations
because it is of course random data! The results of π are comparable to
this Quantum Mechanical source taken in to account that the results differ
slightly by probability and the fact that the used HotBits file is 50 times
smaller. Both are a-periodic random sources, which I did not demonstrate but
these facts have almost become an axiom in present day science, so there is no
need to.
How can it be that the digits of π are equally
(a-periodic) random as sources from Quantum Mechanics? Producing the digits of
π is a deterministic process but yet Quantum Randomness is simulated! In
my eyes this constitutes a paradox that needs to be explained. How can a
deterministic process simulate properties that belong to the realm of the
Uncertainty Principle of Heisenberg? One can only speculate that Quantum
Mechanics to must have a deterministic basis. Which is also the underlying
thought of the famous EPR-paradox article of Einstein, Rosen and Podolsky.11
That inspired Einstein to its famous saying: “Der lieber Gott wurfelt nicht.”
Of course there still is a difference: with brute force methods one can always
predict the, lets say, millionth bit of a sample from π, no matter where
you start in the whole sequence. Every bit from Quantum Mechanical sources is
called intrinsically uncertain. But yet both have the same properties with
respect to randomness tests! Shigeru Kondo reports that Kanada and
Hitachi, Ltd., have achieved a run of 1,241,100,000,000 digits of π (world
record November 2002). So the digits above this figure are still uncertain and
this is the same kind of uncertainty as random numbers between 0 and 9 from
Quantum Mechanics. You only know both kind of digits until you can calculate or
measure them. Maybe you do not agree? You will certainly agree with me that the
(near) infinite digits of π are uncertain and that this is the same kind
of uncertainty as from Heisenberg's Uncertainty Principle. That both random
sources are a-periodic is the origin for this same kind of uncertainty.
-o0o-
Notes
& References:
1a) Anonymous Mathworld ‘Trancendental numbers’
http://mathworld.wolfram.com/TranscendentalNumber.html
1b) P. Vanouplines, Free University
Brussels, ‘Rescaled range analysis and the fractal dimension of pi’
http://homepages.vub.ac.be/~pvouplin/pi/newresea.htm
2) Anonymous Mathworld ‘BBP-formula’
http://mathworld.wolfram.com/BBPFormula.html
3) Tommila M. ‘ApFloat home page’ (I downloaded APTESTC5.ZIP)
4) Anonymous Stu’s Pi Page ‘The fastest PI programs’
http://home.istar.ca/~lyster/chart.html
5) Anonymous ‘DIEHARD battery of tests of randomness v0.2 beta’
http://www.cs.hku.hk/~diehard/
6)
Knuth, D.E 'The art of computer programming, Volume 2 / Seminumerical
algorithms'
7) Walker J. ‘Fourmilab table of contents’
8) Van der Galiën J.G. ‘Proposal for a new randomness test (RABENZI)’ Scientia
Araneae Totius Orbis 3.3. (2004)
9) Marsaglia G., Wai Wan Tsang, ‘Some difficult to pass tests’
http://www.jstatsoft.org/v07/i03/tuftests.pdf
10) Van der Galiën J.G. ‘Factorial randomness’ Scientia Araneae
Totius Orbis 2.3. (2003)
11) Einstein A., Podolsky B., Rosen N. ‘Can Quantum-Mechanical
Description of Physical Reality Be Considered Complete? Physical Reviews
47 777 – 780 (1935)
Abstract: http://prola.aps.org/abstract/PR/v47/i10/p777_1
12) Van der Galiën J.G. ‘A Factorial Randomness Generator (FRAG PRNG)’ Scientia
Araneae Totius Orbis 3.1. (2004)