I have been working on Project Euler
problems for a while now and many of them have centered around prime
numbers. I've referenced my work with the sieve in
other
posts
but found with a particular
problem
that some of my functions could benefit from some state being saved
(namely the sieve being saved and not re-computed each time). The
problem called to count prime factors of numbers and find consecutive
numbers that had the same count of prime factors. My
primes
module had a prime\factors/1 function that would compute the prime
factors and the exponents of those factors (so 644 = 22 * 7 * 23, and
primes:prime\factors(644) would return [{2,2},{7,1},{23,1}]. The
prime\factors/1 looked something like this:
prime_factors(N) ->
CandidatePrimes = prime_factors(N, primes:queue(N div 2)),
PrimeFactors = [ X || X < - CandidatePrimes, N rem X =:= 0 ],
find_factors(PrimeFactors, N, 0, []).
The call to find\factors/4 takes the factors and finds the exponents
and can be ignored for now. The time sink, then, is in generating the
CandidatePrimes list. I think my primes:queue/1 function is pretty fast
at generating the sieve, and dividing N by two eliminates a lot of
unnecessary computation, but when you're calling prime\factors/1
thousands of times, the call to queue/1 begins to add up. This is where
I needed to save some state (the sieve) in between calls. Erlang,
fortunately enough, has a module behavior called
gen\server that
abstracts away a lot of the server internals and lets you focus on the
business bits of the server. I won't discuss it much here as I'm not an
authority on it, but
Joe
Armstrong's book and the Erlang docs have been a great help in
understanding what's happening behind the scene. You can view the
prime\server
module to see what its current state is code-wise. To speed up
prime\factors/1, I split it into two functions prime\factors/1 and
prime\factors/2. The functions look like this:
prime_factors(N, CandidatePrimes) ->
PrimeFactors = [ X || X < - CandidatePrimes, N rem X =:= 0 ],
find_factors(PrimeFactors, N, 0, []).
prime_factors(N) ->
prime_factors(N, primes:queue(N div 2)).
Now, if we don't need to save the queue between calls you can still call
prime\factors/1 as usual. The prime\server module utilizes the
prime\factors/2 function because it initializes its state to contain a
primes sieve (either all primes under 1,000,000 if no arg is passed to
start\link, or all primes =< N when using start\link/1) and the
current upper bound. Now when the server handles the call for getting
the factors, we pass a pared down list of primes to prime\factors/2 and
get a nice speed boost. Well, the heavy lifting is front-loaded in the
initialization of the server (generating the sieve) and in calls that
increase the sieve's size. One improvement there might be to save the
Table generated during the initial sieve creation and start the loop
back up from where it left off (when N > UpTo) but that is for another
time. If you choose your initial value for start\link right,
regenerating the sieve should be unnecessary. The last speed boost was
noticing that calculating the exponents was an unnecessary step so I
wrote a count\factors/1 and count\factors/2 that skips the call to
find\factors/4 and returns the length of the list comprehension. With
these changes complete,
problem
47 went from taking well over 5 minutes to just under 20 seconds to
solve brute force.