At work on Friday I came across an interesting math problem. I needed a pseudo random bit stream, and I needed the generator function to be fast. The typical random number function is a Linear congruential generator, most often f(x) = (1103515245 * x + 12345) mod 2^{32}-1. The modulus remainder is implied—the arithmetic is simply allowed to overflow. Multiplies are usually expensive in terms of CPU cycles. A faster generator is a linear feedback shift register (LFSR), which are comprised of shifts and XORes. I've used LFSR a number of times when I needed random data, without looking too much at the "random" stream of data it produced.

Using a 32-bit Galois LFSR with the polynomial x^{32} + x^{31} + x^{29} + x + 1, and a seed of 0x12345678 produced this output:

2B3C091A 159E048D 8ACF0246 4566D123

A2B36891 D158E448 68AC7224 34563912

1A2B1C89 8D14DE44 468A6F22 23453791

91A3CBC8 48D0B5E4 24685AF2 12342D79

891A16BC 448D0B5E 2247D5AF 9122BAD7

C8915D6B E449FEB5 F225AF5A 791387AD

BC8893D6 5E4449EB AF2224F5 D791127A

6BC9D93D B5E5BC9E 5AF38E4F AD789727

D6BC4B93 EB5E25C9 F5AF12E4 7AD6D972

3D6B6CB9 9EB4E65C 4F5A732E 27AD3997

93D7CCCB C9EAB665 E4F55B32 727BFD99

B93CAECC 5C9E5766 2E4F2BB3 9726C5D9

CB9362EC 65C8E176 32E470BB 9972385D

CCB91C2E 665DDE17 B32FBF0B D9968F85

ECCB47C2 7664F3E1 BB3279F0 5D993CF8

2ECDCE7C 1767B73E 0BB28B9F 85D945CF

C2EDF2E7 E177A973 F0BA84B9 F85D425C

7C2FF12E 3E16A897 9F0B544B CF84FA25

E7C27D12 73E13E89 B9F1CF44 5CF9B7A2

2E7D8BD1 973F95E8 4B9E9AF4 25CF4D7A

12E6F6BD 89737B5E 44B8EDAF A25C76D7

D12E3B6B E8971DB5 F44ADEDA 7A256F6D

BD13E7B6 5E88A3DB AF4451ED D7A228F6

6BD1147B B5E9DA3D DAF5BD1E 6D7B8E8F

B6BC9747 DB5E4BA3 EDAF25D1 F6D6C2E8

If you haven't noticed, the pattern "scrolls" digits to the right. This is what one would expect—this is a "shift" register after all. Each number is different, and the period between any two number (except for zero) is 2^{32}. However, the pattern between each successive number is fairly obvious. This is because of the polynomial. The shift register only introduces bits 31, 29, 1 and 0 when clocked. Otherwise, output is simply shifted to the right by one. I've arranged the data in 4 columns because each digit (nibble) is 4 bits, so after 4 bits, the nibble is completely shifted right by one.

While this pattern probably would work fine for what I was trying to do, I thought there must be a better solution. A polynomial with more taps should produce a better random stream. But how exactly does one go about generating a polynomial for a shift register?

What is need is a irreducible primitive polynomial. One can put any polynomial (i.e. any number as the polynomial term) into a LFSR they like, but it may not (probably won't) produce a maximum length shift register. That is, a 32-bit word has 2^{32} possibilities. With the correct polynomial, the output of a 32-bit LFSR will produce 2^{32} unique outputs before repeating. Without the correct polynomial, one of two things will happen. The output will happen after less then 2^{32} iterations. Or the output will never repeat and some smaller sub-sequence will repeat.

The polynomial of a LFSR is XORed in with the shifted data. My thought was, if I wanted a better random pattern, we'd better XOR more bits. And not just more bits, but spread out more or less evenly. This should change the stream of data quite a bit.

So how does one create an irreducible primitive polynomial? Good question. I still don't have a complete solution, but I do have part of the solution. I found a program written by a Scott Duplichan that generates primitive polynomials—although not necessarily irreducible. It's designed to generate huge primitive polynomials—on the order of thousands of bits. However, it also would do 32-bit. What was nice about this program was that I could tell it the "weight" I wanted. That is, how many bits I wanted set. The number has to be odd, so I chose to generate polynomials with 15 bits.

There are a lot of primitive polynomials with a weight of 15—millions. And most of them did not have bit patterns that were uniformly distributed. For example, one of the first polynomials produced was 0x00041FFF. While there are 15-bits set, there are pretty much all on one side. The program didn't have a way to tell it to generate polynomials that had uniform distribution, but it did have the ability to find a set number of polynomials. That was good enough for me. I set the program off to find ten million primitive polynomials, and went off to do something else. Just over an hour latter (4152 seconds actually), I had 10 million primitive polynomials with 15-bits set.

The next task was to find polynomials with a more uniform bit distribution. To do this, I wiped up a filter program to only save polynomials that had, at most, 3-bits of the same bits in a row. This significantly reduced the number of polynomials. But I still did not have a list of irreducible primitive polynomials. I started reading about how to test for this, but it was taking too much time. It would be easier to brute force this problem. I wanted a polynomial that was maximum length—that is, produced 2^{32} unique values before repeating. Any polynomial that did this was what I looking for.

It only takes a few seconds to brute-force check all 2^{32} possibilities for a LFSR. With the bit distribution restrictions, there were only 9,441 of the original ten million polynomials to check. So, I launched a brute-force attack, and went home for the weekend.

When I came back this morning, I had a list of 40 irreducible primitive polynomials. I'm not sure how long it took to complete this search, but that was irrelevant—I had my data.

Forty was still more polynomials then I needed. So I decided to reduce the list a little further. I just wanted prime polynomials. This is an other brute-force test. Any number is prime if all none of the prime number up to the square root of the number divide evenly into it. My list was reduced from 40 to 10. I picked the first number from the list, but any would have worked.

Again, my 32-bit Galois LFSR, this time with the polynomial 0x19253292B, and a seed of 0x12345678 produced this output:

2B3C091A 159E048D 8ACF0246 6C4C9370

362649B8 1B1324DC 24A2803D 9251401E

6003B25C 192ACB7D 8C9565BE 6F61A08C

1E9BC215 A666F359 D33379AC 40B2AE85

A0595742 7907B9F2 15A8CEAA 0AD46755

856A33AA 42B519D5 88719EB9 ED13DD0F

DFA2FCD4 6FD17E6A 1EC3AD66 264AC4E0

13256270 20B9A36B B977C3E6 7590F3A0

3AC879D0 1D643CE8 0EB21E74 07590F3A

2A8795CE 3C68D8B4 1E346C5A 0F1A362D

878D1B16 6AED9FD8 1C5DDDBF A705FC8C

7AA9EC15 947FE459 E314E07F F18A703F

F8C5381F D5498E5C 438FD57D 88ECF8ED

C4767C76 623B3E3B 98368D4E 4C1B46A7

8F26B100 47935880 0AE2BE13 85715F09

EB93BDD7 DCE2CCB8 6E71665C 1E13A17D

A622C2ED D3116176 40A3A2E8 097AC327

84BD6193 EB75A29A 5C91C31E 0763F3DC

2A9AEBBD 954D75DE 638DA8BC 18EDC60D

A55DF155 FB85EAF9 D4E9E72F C35FE1C4

4884E2B1 A4427158 522138AC 003B8E05

A936D551 D49B6AA8 4366A707 A1B35383

F9F2BB92 7CF95DC9 9757BCB7 E280CC08

71406604 38A03302 1C501981 8E280CC0

47140660 238A0330 11C50198 21C9929F

And this looks much better. But just because you can't see a pattern doesn't mean one doesn't exist. In fact, one obviously exists—the data is from an equation. But without the equation, how good does the random data look? The quickest way to find out if data is random is to try and compress it. If it compresses, it isn't very random. For example, with 256 MBs of data from the original polynomial, we gets a stream that 7-zip could compress 54% of it's original size. With the new polynomial, 7-zip was only able to compress it to 99.964% of it's original size.

Compression isn't the best judge of a file's randomness. I knew there were better algorithms out for generating an entropy number. Doing a quick search, I found this program originally written by John Walker—one of the co-founders of AutoCAD. It dates back to 1985, but the nice thing about math (unlike computer hardware) is it still works fine after 25 years.

The program produces the chi-square distribution, which is given as percentage. When this percentage is close (or near) to 0% or %100, the data stream is said to not be very random—even 95% or 5% are suspect. The chi-square distribution of the first data set is 0.01%, or not even close to random. The chi-square distribution for the new polynomial is 38.58%. For comparison, the chi-square distribution of some output of the XTEA encryption algorithm in a feedback loop had chi-square distribution of 26.5%. Encryption algorithms desire to produce highly random output.

So our new polynomial of 0x9253292B produces a good psudo-random data stream when used in a 32-bit LFSR.