I have been working on [[http://projecteuler.net][Project Euler]]
problems for a while now and many of them have centered around prime
numbers. I've referenced my work with the sieve in
[[http://jamesaimonetti.com/2009/12/16/erlang-primes-and-the-sieve-of-eratosthenes/][other]]
[[http://jamesaimonetti.com/2009/12/17/erlang-and-primes-again/][posts]]
but found with a particular
[[http://projecteuler.net/index.php?section=problems&id=47][problem]]
that some of my functions could benefit from some state being saved
(namely the sieve being saved and not re-computed each time). The
problem called to count prime factors of numbers and find consecutive
numbers that had the same count of prime factors. My
[[http://github.com/jamesaimonetti/ProjectEuler/blob/master/primes.erl][primes
module]] had a prime\_factors/1 function that would compute the prime
factors and the exponents of those factors (so 644 = 2^{2} * 7 * 23, and
primes:prime\_factors(644) would return [{2,2},{7,1},{23,1}]. The
prime\_factors/1 looked something like this:

#+BEGIN_EXAMPLE
    prime_factors(N) ->
        CandidatePrimes = prime_factors(N, primes:queue(N div 2)),
        PrimeFactors = [ X || X < - CandidatePrimes, N rem X =:= 0 ],
        find_factors(PrimeFactors, N, 0, []).
#+END_EXAMPLE

The call to find\_factors/4 takes the factors and finds the exponents
and can be ignored for now. The time sink, then, is in generating the
CandidatePrimes list. I think my primes:queue/1 function is pretty fast
at generating the sieve, and dividing N by two eliminates a lot of
unnecessary computation, but when you're calling prime\_factors/1
thousands of times, the call to queue/1 begins to add up. This is where
I needed to save some state (the sieve) in between calls. Erlang,
fortunately enough, has a module behavior called
[[http://www.erlang.org/doc/man/gen_server.html][gen\_server]] that
abstracts away a lot of the server internals and lets you focus on the
business bits of the server. I won't discuss it much here as I'm not an
authority on it, but
[[http://www.pragprog.com/titles/jaerlang/programming-erlang][Joe
Armstrong's book]] and the Erlang docs have been a great help in
understanding what's happening behind the scene. You can view the
[[http://github.com/jamesaimonetti/ProjectEuler/blob/master/prime_server.erl][prime\_server
module]] to see what its current state is code-wise. To speed up
prime\_factors/1, I split it into two functions prime\_factors/1 and
prime\_factors/2. The functions look like this:

#+BEGIN_EXAMPLE
    prime_factors(N, CandidatePrimes) ->
        PrimeFactors = [ X || X < - CandidatePrimes, N rem X =:= 0 ],
        find_factors(PrimeFactors, N, 0, []).
     
    prime_factors(N) ->
        prime_factors(N, primes:queue(N div 2)).
#+END_EXAMPLE

Now, if we don't need to save the queue between calls you can still call
prime\_factors/1 as usual. The prime\_server module utilizes the
prime\_factors/2 function because it initializes its state to contain a
primes sieve (either all primes under 1,000,000 if no arg is passed to
start\_link, or all primes =< N when using start\_link/1) and the
current upper bound. Now when the server handles the call for getting
the factors, we pass a pared down list of primes to prime\_factors/2 and
get a nice speed boost. Well, the heavy lifting is front-loaded in the
initialization of the server (generating the sieve) and in calls that
increase the sieve's size. One improvement there might be to save the
Table generated during the initial sieve creation and start the loop
back up from where it left off (when N > UpTo) but that is for another
time. If you choose your initial value for start\_link right,
regenerating the sieve should be unnecessary. The last speed boost was
noticing that calculating the exponents was an unnecessary step so I
wrote a count\_factors/1 and count\_factors/2 that skips the call to
find\_factors/4 and returns the length of the list comprehension. With
these changes complete,
[[http://github.com/jamesaimonetti/ProjectEuler/blob/master/p47.erl][problem
47]] went from taking well over 5 minutes to just under 20 seconds to
solve brute force.