Erlang, Euler, Primes, and gen_server

I have been working on Project Euler problems for a while now and many of them have centered around prime numbers. I've referenced my work with the sieve in other posts but found with a particular problem that some of my functions could benefit from some state being saved (namely the sieve being saved and not re-computed each time). The problem called to count prime factors of numbers and find consecutive numbers that had the same count of prime factors. My primes module had a prime\_factors/1 function that would compute the prime factors and the exponents of those factors (so 644 = 2² * 7 * 23, and primes:prime\_factors(644) would return [{2,2},{7,1},{23,1}]. The prime\_factors/1 looked something like this:

prime_factors(N) ->
    CandidatePrimes = prime_factors(N, primes:queue(N div 2)),
    PrimeFactors = [ X || X < - CandidatePrimes, N rem X =:= 0 ],
    find_factors(PrimeFactors, N, 0, []).

The call to find\_factors/4 takes the factors and finds the exponents and can be ignored for now. The time sink, then, is in generating the CandidatePrimes list. I think my primes:queue/1 function is pretty fast at generating the sieve, and dividing N by two eliminates a lot of unnecessary computation, but when you're calling prime\_factors/1 thousands of times, the call to queue/1 begins to add up. This is where I needed to save some state (the sieve) in between calls. Erlang, fortunately enough, has a module behavior called gen\_server that abstracts away a lot of the server internals and lets you focus on the business bits of the server. I won't discuss it much here as I'm not an authority on it, but Joe Armstrong's book and the Erlang docs have been a great help in understanding what's happening behind the scene. You can view the prime\_server module to see what its current state is code-wise. To speed up prime\_factors/1, I split it into two functions prime\_factors/1 and prime\_factors/2. The functions look like this:

prime_factors(N, CandidatePrimes) ->
    PrimeFactors = [ X || X < - CandidatePrimes, N rem X =:= 0 ],
    find_factors(PrimeFactors, N, 0, []).

prime_factors(N) ->
    prime_factors(N, primes:queue(N div 2)).

Now, if we don't need to save the queue between calls you can still call prime\_factors/1 as usual. The prime\_server module utilizes the prime\_factors/2 function because it initializes its state to contain a primes sieve (either all primes under 1,000,000 if no arg is passed to start\_link, or all primes =< N when using start\_link/1) and the current upper bound. Now when the server handles the call for getting the factors, we pass a pared down list of primes to prime\_factors/2 and get a nice speed boost. Well, the heavy lifting is front-loaded in the initialization of the server (generating the sieve) and in calls that increase the sieve's size. One improvement there might be to save the Table generated during the initial sieve creation and start the loop back up from where it left off (when N > UpTo) but that is for another time. If you choose your initial value for start\_link right, regenerating the sieve should be unnecessary. The last speed boost was noticing that calculating the exponents was an unnecessary step so I wrote a count\_factors/1 and count\_factors/2 that skips the call to find\_factors/4 and returns the length of the list comprehension. With these changes complete, problem 47 went from taking well over 5 minutes to just under 20 seconds to solve brute force.

Chewing the Cud

Erlang, Euler, Primes, and gen_server