Battery of statistical tests by David Sexton
1. Tests descriptions
2. Summary of results for VMPC Stream Cipher
3. Author's comment on the result

1. Tests descriptions

The P value is a percentile; it is the probability of a statistic equal to or greater than the one reported. P values for a given test should be more-or-less evenly distributed between 0 and 1. One asterisk (*) appears to the right of a P value less than 0.1, two to the right of a P value less than 0.01, three to the right of a P value less than 0.001, and four to the right of a P value less than 0.0001. One caret (^) appears to the right of a P value greater than 0.9, two to the right of a P value greater than 0.99, three to the right of a P value greater than 0.999, and four to the right of a P value greater than 0.9999.

All the statistics calculated are chi-squared statistics. An effort was made to keep all the expected counts in each chi-squared calculation high. Expected counts in the following tests (with a minimum test sequence length of one megabyte) are never lower than 2048.

The "cumulative sum" chi-squared statistic for each test is calculated by summing the statistics for all the keys (or sequences) tested. The degrees-of-freedom for the resulting chi-squared statistic is the degrees-of-freedom for the test multiplied by the number of statistics summed.

The tests are mostly byte-oriented because the PRNGs I was most interested in testing produce their output as bytes. The tests devised by Marsaglia and Tsang, which are often used to test stream cipher PRNGs, are mostly oriented toward 32-bit words.

Bit Runs Test

A run is one or more consecutive bits with the same value (either 0 or 1). Whenever a bit has a value different from the preceeding bit, a new run is begun. For example, 01001011010100 consists of eleven runs: 0 1 00 1 0 11 0 1 0 1 00. In a random sequence, approximately half the runs should be one bit long, a fourth should be two bits long, an eighth should be three bits long, etc. Runs of 1 to 10 bits are counted. Runs of 11 bits and longer are lumped together. A chi-squared statistic is calculated. The degrees-of-freedom is 10.

Alternate Bit Runs Test

A run is one or more consecutive bits that alternate in value between 0 and 1 in a regular pattern. The pattern is specified by a parameter passed to the function. If the parameter is 1, the pattern is 010101... or its complement 101010...; if the parameter is 2, the pattern is 001100110011... or its complement 110011001100...; if the parameter is 3, the pattern is 000111000111... or its complement 111000111000...; and so on. Whenever the pattern is not followed, a new run is begun. For example, if the parameter is 1, the sequence 01001011010100 consists of the following four runs: 010, 0101, 101010, 0. If the parameter is 2, the same sequence consists of the following nine runs: 0, 1, 001, 0, 110, 1, 0, 1, 00. In a random sequence, approximately half the runs should be one bit long, a fourth should be two bits long, an eighth should be three bits long, etc. Runs of 1 to 10 bits are counted. Runs of 11 bits and longer are lumped together. A chi-squared statistic is calculated. The degrees-of-freedom is 10.

Block Bit Frequency Test

The sequence is divided into 256 blocks of equal length. The total number of bits of each possible value in each block is counted. In a random sequence the probability of each possible bit value is 1/2. A chi-squared statistic is calculated. The degrees-of-freedom is 256.

Bit Frequency Test

The total number of bits of each possible value is counted. In a random sequence the probability of each possible bit value is 1/2. A chi-squared statistic is calculated. The degrees-of-freedom is 1.

Tidbit Frequency Test

The total number of tidbits of each possible value is counted. In a random sequence the probability of each possible tidbit value is 1/4. A chi-squared statistic is calculated. The degrees-of-freedom is 3.

Nibble Frequency Test

The total number of nibbles of each possible value is counted. In a random sequence the probability of each possible nibble value is 1/16. A chi-squared statistic is calculated. The degrees-of-freedom is 15.

Byte Frequency Test

The total number of bytes of each possible value is counted. In a random sequence the probability of each possible byte value is 1/256. A chi-squared statistic is calculated. The degrees-of-freedom is 255.

Bit Test

One bit of each byte of the sequence is evaluated in this test. The bit is specified by a parameter passed to the function. This parameter will be a number from 0 to 7: where 0 specifies the lowest-order bit, and 7 specifies the highest-order bit. The value of the specified bit of each byte in the sequence will be either 0 or 1. In a random sequence the probability of each possible bit value is 1/2. A chi-squared statistic is calculated. The degrees-of-freedom is 1.

Overall Bit Test

The bit test descibed above is performed for each of the 8 bit positions. The resulting chi-squared statistics for each of the 8 tests are totaled; the total is an overall chi-squared statistic. The degrees-of-freedom is 8.

8-Bit-Sum Test

The bits of each byte of the sequence are summed. The bit sum will be either 0, 1, 2, 3, 4, 5, 6, 7, or 8. In a random sequence, the probabilities of 0, 1, 2, 3, 4, 5, 6, 7, and 8 are 1/256, 8/256, 28/256, 56/256, 70/256, 56/256, 28/256, 8/256, and 1/256 respectively. The total number of bytes with each possible bit sum is counted. A chi-squared statistic is calculated. The degrees-of-freedom is 8.

16-Bit-Sum Test

The sequence is divided into segments of 16 bits. The bits of each segment are summed. The bit sum will fall within one of three ranges: less than 7, greater than 9, or greater than 6 and less than 10. In a random sequence, the probabilities of less than 7 and greater than 9 are both 14,893/65,536; and the probability of greater than 6 and less than 10 is 35,750/65,536. The total number of segments with bit sums in each of the 3 ranges is counted. A chi-squared statistic is calculated. The degrees-of-freedom is 2.

32-Bit-Sum Test

The sequence is divided into segments of 32 bits. The bits of each segment are summed. The bit sum will fall within one of three ranges: less than 15, greater than 17, or greater than 14 and less than 18. In a random sequence, the probabilities of less than 15 and greater than 17 are both 1,281,220,733/4,294,967,296; and the probability of greater than 14 and less than 18 is 1,732,525,830/4,294,967,296. The total number of segments with bit sums in each of the 3 ranges is counted. A chi-squared statistic is calculated. The degrees-of-freedom is 2.

64-bit Matrix Test

The sequence is divided into 64-bit (8-byte) segments. Eight bytes are built from the 64 bits. The first byte built is made of the lowest-order bits of each byte in the segment, the second is made of the next-lowest-order bits, etc. The total number of resulting bytes of each possible value is totaled over the whole sequence. In a random sequence the probability of each possible byte value is 1/256. A chi-squared statistic is calculated. The degrees-of-freedom is 255.

256-bit Matrix Test

The sequence is divided into segments of eight 32-bit words. Eight bytes are built from these 256 bits. The first byte built is made of the lowest-order bits of each 32-bit word in the segment, the second is made of the next-lowest-order bits, etc. The total number of resulting bytes of each possible value is totaled over the whole sequence. In a random sequence the probability of each possible byte value is 1/256. A chi-squared statistic is calculated. The degrees-of-freedom is 255.

Bit Prediction A Test

An algorithm is used to predict the value of each bit of the sequence from the beginning of the sequence to the end. In a random sequence the probability of success of any such algorithm is 1/2. The number of successes is counted. A chi-squared statistic is calculated. The degrees-of-freedom is 1. The algorithm is as follows: the numbers of zeros and ones in all the previous bits are counted. If the ones outnumber the zeros, a zero is predicted; if the zeros outnumber the ones, a one is predicted. Otherwise the prediction is the same as for the previous bit. The interesting part of this test is the prediction made when the numbers of zeros and ones are equal.

Bit Prediction B Test

An algorithm is used to predict the value of each bit of the sequence from the beginning of the sequence to the end. In a random sequence the probability of success of any such algorithm is 1/2. The number of successes is counted. A chi-squared statistic is calculated. The degrees-of-freedom is 1. The algorithm is as follows: the numbers of zeros and ones in the previous 9 bits are counted. If the ones outnumber the zeros, a zero is predicted; if the zeros outnumber the ones, a one is predicted.

Bit Prediction C Test

An algorithm is used to predict the value of each bit of the sequence from the beginning of the sequence to the end. In a random sequence the probability of success of any such algorithm is 1/2. The number of successes is counted. A chi-squared statistic is calculated. The degrees-of-freedom is 1. The algorithm is as follows: the numbers of zeros and ones in the previous 17 bits are counted. If the ones outnumber the zeros, a zero is predicted; if the zeros outnumber the ones, a one is predicted.

Bit Prediction D Test

An algorithm is used to predict the value of each bit of the sequence from the beginning of the sequence to the end. In a random sequence the probability of success of any such algorithm is 1/2. The number of successes is counted. A chi-squared statistic is calculated. The degrees-of-freedom is 1. The algorithm is as follows: the numbers of zeros and ones in the previous 33 bits are counted. If the ones outnumber the zeros, a zero is predicted; if the zeros outnumber the ones, a one is predicted.

Byte Prediction A Test

An algorithm is used to predict the value of each byte of the sequence from the beginning of the sequence to the end. In a random sequence the probability of success of any such algorithm is 1/256. The number of successes is counted. A chi-squared statistic is calculated. The degrees-of-freedom is 1. The algorithm is as follows: the next byte is predicted to be equal to all the previous bytes bitwise XORed together.

Byte Prediction B Test

An algorithm is used to predict the value of each byte of the sequence from the beginning of the sequence to the end. In a random sequence the probability of success of any such algorithm is 1/256. The number of successes is counted. A chi-squared statistic is calculated. The degrees-of-freedom is 1. The algorithm is as follows: the next byte is predicted to be equal to the sum of all the previous bytes, modulo 256.

Byte Prediction C Test

An algorithm is used to predict the value of each byte of the sequence from the beginning of the sequence to the end. In a random sequence the probability of success of any such algorithm is 1/256. The number of successes is counted. A chi-squared statistic is calculated. The degrees-of-freedom is 1. The algorithm is as follows: the next byte value is predicted to be zero until the first zero is found. From that point on, the next byte value is predicted to be the byte value whose last appearance was furthest back in the sequence.

Byte Repetition Test

Each byte in the sequence either is or is not identical to its preceding byte. In a random sequence the probability that a byte will be identical to its preceding byte is 1/256. The total number of bytes identical to their preceding bytes is counted. A chi-squared statistic is calculated. The degrees-of-freedom is 1. This test is equivalent to a byte prediction test where each byte is predicted to be equal to its preceding byte.

Byte Up/Down Test

The sequence is divided into two-byte segments. The second byte in each segment is either equal to, greater than, or less than the first. The total number of segments in each of these three categories is counted. In a random sequence the probability that the second byte will be equal to the first is 1/256. The probabilities that second byte will be greater than or less than the first are both 255/512. A chi-squared statistic is calculated. The degrees-of-freedom is 2.

Two-Byte AND Test

The sequence is divided into two-byte segments. The two bytes in each segment bitwise ANDed together. The result will fall into one of 7 ranges: it will be less than 37, greater than or equal to 37 and less than 74, greater than or equal to 74 and less than 111, greater than or equal to 111 and less than 148, greater than or equal to 148 and less than 185, greater than or equal to 185 and less than 222, or greater than or equal to 222. In a random sequence the probabilities of these ranges are 32265/65536, 10755/65536, 5355/65536, 8985/65536, 3969/65536, 3171/65536, and 1036/65536; respectively. The total number of segments in each of these 7 ranges is counted. A chi-squared statistic is calculated. The degrees-of-freedom is 6.

Four-Byte-Run-Up Test

The sequence is divided into four-byte segments. If the first byte of the segment is less than the second, and the second byte is less than the third, and the third is less than the fourth, a four-byte-run-up is said to have occurred. In a random sequence the probability of a four-byte-run-up is 2,731,135/67,108,864. The total number of four-byte-run-ups is counted. A chi-squared statistic is calculated. The degrees-of-freedom is 1.

Byte Collision Test

The sequence is divided into segments. The length of the segments, in bytes, is a parameter passed to the function. If any byte value occurs more than once in a segment, a collision is said to have occurred. In a random sequence, the probability that no collisions will occur in a segment is (255/256) × (254/256) × (253/256) × ... × ((257 - x)/256), where x is the segment length in bytes. The total number of segments with no collisions is counted. A chi-squared statistic is calculated. The degrees-of-freedom is 1.

Offset Byte XOR Test

The sequence is divided into pairs of bytes. The offset in the sequence between the bytes in each pair is a parameter passed to the function. The two bytes in each pair are bitwise XORed together. The total number of resulting bytes of each possible value is counted. In a random sequence the probability of each possible byte value is 1/256. A chi-squared statistic is calculated. The degrees-of-freedom is 255.

Multiple Test

For a given positive integer (x), each byte in the sequence either is or is not a multiple of x. For the purposes of this test zero is considered to be a multiple of all positive integers. In a random sequence the probability that a byte is a multiple of x is (256/x rounded up to the next integer)/256. The value of x is a parameter passed to the function. The total number of byte values that are and are not multiples is counted. A chi-squared statistic is calculated. The degrees-of-freedom is 1.

2. Summary of results for VMPC Stream Cipher

the key-generating-LCG seed = 1578566452
the test sequence length = 32 MB
--------------------------------------------------------------------
VMPC cumulative sums, 256 (64-byte) keys
speed = 30744.17 KB/second

 the bit runs statistic =  2570.25461 [P = 0.4394] the alt. 1 bit runs statistic =  2590.11146 [P = 0.3341] the alt. 2 bit runs statistic =  2515.10664 [P = 0.7330] the alt. 3 bit runs statistic =  2562.81784 [P = 0.4806] the alt. 4 bit runs statistic =  2517.39078 [P = 0.7222] the alt. 5 bit runs statistic =  2599.78142 [P = 0.2869] the block bit freq. statistic = 65524.07658 [P = 0.5124] the bit frequency statistic =   253.70815 [P = 0.5287] the tidbit frequency statistic =   771.62027 [P = 0.4565] the nibble frequency statistic =  3835.27413 [P = 0.5185] the byte frequency statistic = 66049.24387 [P = 0.0169]* the bit 0 statistic =   261.16729 [P = 0.3989] the bit 1 statistic =   248.15910 [P = 0.6257] the bit 2 statistic =   257.46484 [P = 0.4625] the bit 3 statistic =   274.18469 [P = 0.2076] the bit 4 statistic =   243.05384 [P = 0.7097] the bit 5 statistic =   227.38298 [P = 0.9007]^ the bit 6 statistic =   216.80470 [P = 0.9640]^ the bit 7 statistic =   256.78343 [P = 0.4745] the overall bit statistic =  1985.00087 [P = 0.8375] the 8-bit-sum statistic =  2046.82052 [P = 0.5032] the 16-bit-sum statistic =   517.85981 [P = 0.4195] the 32-bit-sum statistic =   520.30721 [P = 0.3901] the 64-bit matrix statistic = 65484.50275 [P = 0.2853] the 256-bit matrix statistic = 65736.20866 [P = 0.1036] the bit prediction A statistic =   273.38506 [P = 0.2174] the bit prediction B statistic =   268.72645 [P = 0.2800] the bit prediction C statistic =   218.62813 [P = 0.9563]^ the bit prediction D statistic =   254.31397 [P = 0.5180] the byte prediction A statistic =   259.14681 [P = 0.4333] the byte prediction B statistic =   240.20226 [P = 0.7528] the byte prediction C statistic =   258.71543 [P = 0.4408] the byte repetition statistic =   252.90786 [P = 0.5429] the byte up/down statistic =   531.91144 [P = 0.2627] the 4-byte collision statistic =   276.40638 [P = 0.1819] the 8-byte collision statistic =   242.65747 [P = 0.7159] the 16-byte collision statistic =   236.45307 [P = 0.8043] the 32-byte collision statistic =   269.46726 [P = 0.2695] the 1-byte off. xor statistic = 64933.46564 [P = 0.8312] the 2-byte off. xor statistic = 65265.87329 [P = 0.5149] the 6-byte off. xor statistic = 65294.45798 [P = 0.4833] the 24-byte off. xor statistic = 65265.68607 [P = 0.5151] the 120-byte off. xor statistic = 65261.43640 [P = 0.5198] the 720-byte off. xor statistic = 65361.79626 [P = 0.4098] the 5040-byte off. xor stat. = 65217.32770 [P = 0.5681] the 40320-byte off. xor stat. = 65621.86508 [P = 0.1720] the 362880-byte off. xor stat. = 65737.99878 [P = 0.1027] the multiple of 3 statistic =   229.41963 [P = 0.8826] the multiple of 5 statistic =   240.02398 [P = 0.7554] the multiple of 7 statistic =   230.95702 [P = 0.8676] the multiple of 11 statistic =   281.74466 [P = 0.1290] the multiple of 13 statistic =   254.97440 [P = 0.5063] the multiple of 17 statistic =   239.15484 [P = 0.7679] the multiple of 19 statistic =   243.21746 [P = 0.7071]
--------------------------------------------------------------------

3. Author's comment on the results

The test results on the VMPC pseudo-random generator do NOT suggest that the sequences produced by the generator are non-random.

To find out more or contact David Sexton, visit www.geocities.com/da5id65536

by David Sexton

Copyright © 1999-2018 by Bartosz Zoltak