Programming and general geekiness.

Archive for November, 2011

Generating primes with the Sieve of Atkin in C++

Those that read my blogs regularly will know that I’ve had a bit of a recent obsession with prime numbers. I am really not sure why, but I have been doing a lot of programming with them. My original method had been to use modulus and check if a number was divisible by any already known prime numbers. I wrote this in Java and it it took an hour to generate all the primes up to 10,000,000. My next approach was to write a small Python program that used the Sieve of Eratosthenes where by you loop through the multiples of each number marking them as non-prime. This method works well but it is still quite slow.

My next approach, therefore, is to use the Sieve of Atkin. This is a relatively modern method that is highly efficient. Here I have implemented it in C++:

#include <iostream>
#include <cmath>
#include <fstream>
using namespace std;
int main (int argc, char* argv[])
{
 //Create the various different variables required
 int limit = 1000000;
 int root = ceil(sqrt(limit));
 bool sieve[limit];
 int primes[(limit/2)+1];
 int insert = 2;
 primes[0] = 2;
 primes[1] = 3;
 for (int z = 0; z < limit; z++) sieve[z] = false; //Not all compilers have false as the default boolean value
 for (int x = 1; x <= root; x++)
 {
 for (int y = 1; y <= root; y++)
 {
 //Main part of Sieve of Atkin
 int n = (4*x*x)+(y*y);
 if (n <= limit && (n % 12 == 1 || n % 12 == 5)) sieve[n] ^= true;
 n = (3*x*x)+(y*y);
 if (n <= limit && n % 12 == 7) sieve[n] ^= true;
 n = (3*x*x)-(y*y);
 if (x > y && n <= limit && n % 12 == 11) sieve[n] ^= true;
 }
 }
 //Mark all multiples of squares as non-prime
 for (int r = 5; r <= root; r++) if (sieve[r]) for (int i = r*r; i < limit; i += r*r) sieve[i] = false;
 //Add into prime array
 for (int a = 5; a < limit; a++)
 {
 if (sieve[a])
 {
 primes[insert] = a;
 insert++;
 }
 }
 //The following code just writes the array to a file
 ofstream file;
 char filename[100];
 sprintf(filename, "primes_%d.txt", limit);
 file.open(filename);
 for (int a = 0; a < insert; a++) file << primes[a] << ((a == insert-1) ? "" : "\n");
 file.close();
 cout << "Written to file.\n";
 return 0;
}

This method works incredibly well and is capable of producing primes in fractions of a seconds. Click here to download the source code.

Advertisements

The family tree of programming languages

The above map shows how programming languages have evolved and influenced each other. Try as I might, everything seemed to come back to either Fortran or Assembly. Most languages are, however, influenced by C, even if it is indirectly.

Sieving primes

When programming to calculate prime numbers you have a lot of options. The obvious (and slow) option is to make an array called primes and then loop up to a number, each time checking if the number in the loop (say x) can be divided by any of the primes that have already been calculated and if not add it to the array of primes. This method does work well, it is just very, very slow. I generated all the prime numbers up to 4,000,000 with a Java program running like this in just over an hour on a relatively slow computer. Using the sieve method on the same computer I was able to do it in ten seconds.

So how does sieving work? Say that you wanted to calculate all the prime numbers up to 10. You would mark 0-9 as True to state that they are prime. You would then take 2 and loop through all multiples of 2 marking them as false (not prime). This would then tell us that 4, 6 and 8 weren’t prime. We would then take 3 and mark all of its multiples as not prime (6 and 9). We could take four but we might as well check that it hasn’t been marked already so we can skip it. We don’t need to do five because clearly any multiples will be outside the array.

Here’s how we would do it in Python:

def primes(max):
	primes = [True] * max
	primes[1] = False
	for i in xrange(2, int(max/2)+1):
		if primes[i]:
			for x in xrange(2*i, max, i):
				primes[x] = False
	final = []
	for i in xrange(1, max):
		if primes[i]: final.append(i)
	return final

Let me go through that step by step:

  1. Create an array where all the numbers are marked as prime
  2. Set 1 as not prime
  3. Loop through all the values in the array – we only go half way because anything above max/2 will be greater than max
  4. We check that the algorithm currently thinks that it is prime
  5. We then loop through all multiples of that number marking them as not prime
  6. We create a new array
  7. We loop through the original array checking for primes and adding them to the new array
  8. Return the new array

Overall sieving primes is incredibly efficient if it is done properly and is far more effective than other techniques.