Erlang and Webmachine

I'm currently working on a small startup project, partly to meet a need of some acquaintances but, more importantly, to learn me some Erlang with regard to the web. While I'm further along than I expected to be, I thought I'd begin documenting the steps I've taken towards building this app. The current nerdities I'm using:

Installation of all of these on a GNU/Linux system is pretty straightforward, so I won't cover that here. Defaults were used for Erlang. I installed the other libraries/applications in ~/dev/erlang/lib and pointed $ERL_LIBS there in my .bashrc. I did follow this guide for setting up Tsung. The BeeBole site has several other pages worth reading for developing web applications in Erlang. Once everything is installed, create, build, and start the webmachine project:

    $WEBMACHINE_HOME/scripts/new_webmachine.erl wm_app /path/to/root
    cd /path/to/root/wm_app
    make
    ./start.sh

You now have a working project! Of course, I like to have my Erlang shell inside of Emacs while I'm developing, so I added a comment to the start.sh script containing the flags to pass to the shell. My start.sh looks like this:

    #!/bin/sh

    # for emacs C-c C-z flags:
    # -pa ./ebin -pa ./priv/templates/ebin -boot start_sasl -s wm_app

    cd `dirname $0`
    exec erl -pa $PWD/ebin $PWD/deps/*/ebin $PWD/deps/*/deps/*/ebin $PWD/priv/templates/ebin -boot start_sasl -s wm_app

I currently have all of my dependencies in $ERL_LIBS; when I deploy this to production, I'll add the libs to wm_app/deps, either symlinked or copied into the directory.

Having the custom shell means you need the .emacs code to start an Erlang shell with custom flags. Important note: if you need to specify multiple code paths this way, you have to use a separate -pa for each path, unlike on the command line, where every path after a -pa (or -pz) is added.

Another caveat: when starting the Erlang shell within Emacs, if you're currently in an Erlang-related buffer (.erl, .hrl, etc.), the default shell is started without the option to set flags. I typically have start.sh open anyway to copy the flags, so I don't run into this much anymore; I'm documenting it here just in case anyone stumbles on it.

Now you have a shell within which to execute commands against your webmachine app, load updated modules, etc. Coming up, I'll talk about how I'm using ErlyDTL to create templates and CouchDB/Couchbeam for the document store.
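
Once the shell is up, a quick way to confirm that the -pa flags took effect is to ask the code server where it would load each module from. The following is a minimal sketch, not part of the generated project: the module name loadable_sketch and the function unresolved/1 are made up for illustration, and it relies only on code:which/1 from the standard library.

```erlang
-module(loadable_sketch).
-export([unresolved/1]).

%% Hypothetical helper: return the modules from Modules that the code
%% server cannot find, either already loaded or as a .beam file
%% somewhere on the code path.
unresolved(Modules) ->
    [M || M <- Modules, code:which(M) =:= non_existing].
```

For example, loadable_sketch:unresolved([wm_app, webmachine, mochiweb]) should return [] when every ebin directory made it onto the path; any module it does return points at a missing -pa. The module names in that call are only the ones I'd expect for this project.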

Erlang, Euler, Primes, and gen_server

I have been working on Project Euler problems for a while now, and many of them have centered around prime numbers. I've referenced my work with the sieve in other posts, but found with one particular problem that some of my functions could benefit from saved state (namely, keeping the sieve around rather than recomputing it each time). The problem called for counting the prime factors of numbers and finding consecutive numbers that had the same count of prime factors. My primes module had a prime_factors/1 function that would compute the prime factors and the exponents of those factors (so 644 = 2^2 * 7 * 23, and primes:prime_factors(644) would return [{2,2},{7,1},{23,1}]). The prime_factors/1 looked something like this:

    prime_factors(N) ->
        CandidatePrimes = primes:queue(N div 2),
        PrimeFactors = [ X || X <- CandidatePrimes, N rem X =:= 0 ],
        find_factors(PrimeFactors, N, 0, []).

The call to find_factors/4 takes the factors and finds the exponents, and can be ignored for now. The time sink, then, is in generating the CandidatePrimes list. I think my primes:queue/1 function is pretty fast at generating the sieve, and dividing N by two eliminates a lot of unnecessary computation, but when you're calling prime_factors/1 thousands of times, the calls to queue/1 begin to add up. This is where I needed to save some state (the sieve) between calls.

Erlang, fortunately enough, has a module behavior called gen_server that abstracts away a lot of the server internals and lets you focus on the business bits of the server. I won't discuss it much here as I'm not an authority on it, but Joe Armstrong's book and the Erlang docs have been a great help in understanding what's happening behind the scenes. You can view the prime_server module to see what its current state is code-wise.

To speed up prime_factors/1, I split it into two functions, prime_factors/1 and prime_factors/2. The functions look like this:

    prime_factors(N, CandidatePrimes) ->
        PrimeFactors = [ X || X <- CandidatePrimes, N rem X =:= 0 ],
        find_factors(PrimeFactors, N, 0, []).

    prime_factors(N) ->
        prime_factors(N, primes:queue(N div 2)).
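
Since find_factors/4 isn't shown, here is a minimal, self-contained sketch of the same split under a few assumptions: primes_upto/1 is a simple stand-in for primes:queue/1, and exponent/2 is a hypothetical replacement for what find_factors/4 does. It is not the code from the primes module, just one way the pieces could fit together.

```erlang
-module(pf_sketch).
-export([prime_factors/1, prime_factors/2, primes_upto/1]).

%% Assumed stand-in for primes:queue/1: all primes =< N, ascending.
primes_upto(N) when N < 2 -> [];
primes_upto(N) -> sieve(lists:seq(2, N), N).

%% Strike out multiples of each prime P while P*P =< N; whatever is
%% left after that point is prime.
sieve([P | Rest], N) when P * P =< N ->
    [P | sieve([X || X <- Rest, X rem P =/= 0], N)];
sieve(Primes, _N) ->
    Primes.

%% Convenience version that builds the candidate list on every call.
prime_factors(N) ->
    prime_factors(N, primes_upto(N div 2)).

%% Keep the candidates that divide N and pair each with its exponent.
%% As in the post, candidates stop at N div 2, so in this sketch a
%% prime N yields [].
prime_factors(N, CandidatePrimes) ->
    [{P, exponent(N, P)} || P <- CandidatePrimes, N rem P =:= 0].

%% Hypothetical replacement for find_factors/4: how many times P divides N.
exponent(N, P) -> exponent(N, P, 0).

exponent(N, P, Acc) when N rem P =:= 0 -> exponent(N div P, P, Acc + 1);
exponent(_N, _P, Acc) -> Acc.
```

With this, pf_sketch:prime_factors(644) gives [{2,2},{7,1},{23,1}], and a caller that already holds a list of primes can pass it straight to prime_factors/2.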

Now, if you don't need to save the queue between calls, you can still call prime_factors/1 as usual. The prime_server module uses prime_factors/2 because it initializes its state to contain a primes sieve (either all primes under 1,000,000 if no arg is passed to start_link, or all primes =< N when using start_link/1) and the current upper bound. When the server handles the call for getting the factors, it passes a pared-down list of primes to prime_factors/2, and we get a nice speed boost.

The heavy lifting is front-loaded in the initialization of the server (generating the sieve) and in calls that increase the sieve's size. One improvement there might be to save the Table generated during the initial sieve creation and start the loop back up from where it left off (when N > UpTo), but that is for another time. If you choose your initial value for start_link right, regenerating the sieve should be unnecessary.

The last speed boost came from noticing that calculating the exponents was an unnecessary step, so I wrote count_factors/1 and count_factors/2, which skip the call to find_factors/4 and return the length of the list comprehension. With these changes complete, problem 47 went from taking well over 5 minutes to just under 20 seconds to solve by brute force.
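
The prime_server module itself isn't reproduced here. As a rough sketch of the idea (the state is the sieve plus its upper bound, and the sieve is only regenerated when a request outgrows it), something like the following would work. The module name prime_server_sketch, the state tuple, and the simple primes_upto/1 sieve are assumptions for illustration. It also assumes a reasonably recent OTP, where handle_info/2, terminate/2, and code_change/3 are optional callbacks and gen_server:stop/1 exists.

```erlang
-module(prime_server_sketch).
-behaviour(gen_server).

-export([start_link/0, start_link/1, count_factors/1, stop/0]).
-export([init/1, handle_call/3, handle_cast/2]).

%% Default bound mirrors the post: all primes under 1,000,000.
start_link() -> start_link(1000000).

start_link(UpTo) ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, UpTo, []).

%% Number of distinct prime factors of N, looked up against the saved sieve.
count_factors(N) ->
    gen_server:call(?MODULE, {count_factors, N}, infinity).

stop() -> gen_server:stop(?MODULE).

%% State is {Primes, UpTo}: the sieve and its current upper bound.
init(UpTo) ->
    {ok, {primes_upto(UpTo), UpTo}}.

handle_call({count_factors, N}, _From, {Primes, UpTo}) when N div 2 =< UpTo ->
    %% Pare the sieve down to the candidates that can divide N.
    Candidates = lists:takewhile(fun(P) -> P =< N div 2 end, Primes),
    Count = length([P || P <- Candidates, N rem P =:= 0]),
    {reply, Count, {Primes, UpTo}};
handle_call({count_factors, N}, From, _TooSmall) ->
    %% The request outgrew the sieve: rebuild it once, then retry.
    NewUpTo = N div 2,
    handle_call({count_factors, N}, From, {primes_upto(NewUpTo), NewUpTo}).

handle_cast(_Msg, State) -> {noreply, State}.

%% Assumed stand-in for primes:queue/1: all primes =< N, ascending.
primes_upto(N) when N < 2 -> [];
primes_upto(N) -> sieve(lists:seq(2, N), N).

sieve([P | Rest], N) when P * P =< N ->
    [P | sieve([X || X <- Rest, X rem P =/= 0], N)];
sieve(Primes, _N) ->
    Primes.
```

After {ok, _} = prime_server_sketch:start_link(1000), a call to prime_server_sketch:count_factors(644) returns 3 (for 2, 7, and 23) without rebuilding the sieve. Rebuilding from scratch when the bound is exceeded is the simplest choice, not the resumable Table approach mentioned above.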

Purely Functional Data Structures and Me

Quick post to say that I've put my dabblings in Chris Okasaki's Purely Functional Data Structures book up as a GitHub repo. I am picking and choosing which exercises and examples to code, with the goal of slowly covering the whole book. Some concessions are made to fit the ideas into Erlang (like recursively calling an anonymous function), but overall I think the ideas fit nicely. The repo also includes a streams lib from Richard Carlsson that I found on the Erlang mailing list and used as a reference for a couple of things in chapter 4. I stuck with my streams being represented either as an empty list [] or as [term() | fun()], with term() being the calculated value and fun() being the suspension function, instead of the tuple that Richard chose. After reading further in the thread, I understand why he chose it (so as not to confuse people who might use proper list functions on streams), but for my small examples, it was enough to use lists.
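
To make the [term() | fun()] representation concrete, here is a small sketch of what such a stream can look like. It is not taken from the repo or from Richard's lib; the module name stream_sketch and the functions from/1, take/2, and map/2 are made up for illustration.

```erlang
-module(stream_sketch).
-export([from/1, take/2, map/2]).

%% Infinite stream of integers starting at N: the head is the computed
%% value, the tail is a suspension (a fun of no arguments) that yields
%% the rest of the stream when called.
from(N) -> [N | fun() -> from(N + 1) end].

%% First K elements as an ordinary list. The suspension after the last
%% element taken is never forced.
take(0, _Stream) -> [];
take(_K, []) -> [];
take(1, [H | _Susp]) -> [H];
take(K, [H | Susp]) when K > 1 -> [H | take(K - 1, Susp())].

%% Lazily apply F to every element; only the head is computed up front.
map(_F, []) -> [];
map(F, [H | Susp]) -> [F(H) | fun() -> map(F, Susp()) end].
```

Here stream_sketch:take(3, stream_sketch:from(1)) gives [1,2,3]. Note that a non-empty stream in this form is an improper list, so a function like length/1 fails on it with badarg, which is the kind of confusion the tuple representation avoids.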

Problem Solved

In the comments section of a recent Atwood post, commenter Paul Jungwirth (search for the name, as I can't find comment permalinks) posted about a problem from a Perl mailing list that he would give potential hires. This post is not about the blog post but about the problem from the comments section.

The Problem (from the mailing list):

Consider the following string:

1 2 3 4 5 6 7 8 9 = 2002

The problem is to add any number of addition & multiplication operations wherever you'd like on the left such that in the end you have a valid equation. So for example if it gets you to a solution you can have: 12 * 345 + 6 … if that works as part of your solution [it's much too big: 4146]. Bearing in mind that multiplication takes higher precedence than addition, what is the solution?

My answer generator can be found here in Erlang. I liked the problem because it is in the vein of Project Euler. The eval/1 is a slightly modified version of this one on TrapExit.
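
For readers who want to try it without following the links, here is a brute-force sketch of one way to attack the problem. It is not the linked generator, and its eval/1 is not the TrapExit one: the module name puzzle_sketch and all function names are assumptions for illustration. It tries all three choices (nothing, "+", or "*") in each of the eight gaps between digits and evaluates each candidate as a sum of products.

```erlang
-module(puzzle_sketch).
-export([solve/0, solve/2, eval/1]).

%% The problem as stated: digits 1..9 in order, target 2002.
solve() -> solve("123456789", 2002).

%% All expressions over Digits (a string) that evaluate to Target.
solve(Digits, Target) ->
    [E || E <- expressions(Digits), eval(E) =:= Target].

%% Every way to put "", "+" or "*" between adjacent digits.
expressions([D]) -> [[D]];
expressions([D | Rest]) ->
    [[D | Op ++ Tail] || Op <- ["", "+", "*"], Tail <- expressions(Rest)].

%% Evaluate as a sum of products, so "*" binds tighter than "+".
eval(Expr) ->
    lists:sum([product(Term) || Term <- string:tokens(Expr, "+")]).

%% Multiply the integer factors of a single "+"-free term.
product(Term) ->
    lists:foldl(fun(F, Acc) -> list_to_integer(F) * Acc end, 1,
                string:tokens(Term, "*")).
```

As a sanity check, puzzle_sketch:eval("12*345+6") gives 4146, matching the example in the problem statement, and puzzle_sketch:solve() returns the matching expressions as strings. With only 3^8 = 6561 candidates, brute force is plenty fast.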

PHP, cURL, and POST

While working today on a script that had been working, I couldn't for the life of me figure out why it was failing. It uses the PHP curl_* functions to make various requests and processes the results. It turns out that when you send a POST body with CURLOPT_POSTFIELDS and a field value begins with an at symbol (@), you have to escape it (\@). The reason is that curl uses the at symbol to denote a file upload path ("@/path/to/upload.file"). So escape the at symbol and you should be back to good with the curling.

Adding Files To Subversion

When working with symfony, especially when adding to the schema and generating the model, form, and filter classes, it becomes tedious to add each of the new files to your Subversion repository. Here's a succinct line to add all un-versioned files to your repo:

    #!/bin/sh

    svn add `svn st | grep '^?' | head | awk '{print $2}'`

The key is in the backtick-quoted section. It takes the output of svn st(atus) and pipes it to grep, selecting only the un-versioned files (denoted by the ?), pipes that to head, which outputs the first 10 lines (by default), and pipes that to awk, which prints the second column containing the file path to be added. But what if you have more than 10 files to add? You can pass a -n NUM switch to the head command to increase the number of lines it passes along. I'll leave it as an exercise to the reader to modify the script to allow a user to pass in what NUM should be. So when your "svn st" output is filled with un-versioned files, all of which you need, run this little guy and have it done speedily.

More Erlang+Emacs

I found that Distel's built-in shell launcher wasn't cutting the mustard, as I needed to start shells with various flags and didn't see an easy way to accomplish this using what Distel provided. Digging around the Erlang mailing list, I found an elisp function that allowed me to pass flags to the shell. Place this snippet in your .emacs file after you've required erlang and distel:

    (defun erl-shell-with-flags (flags)
      "Start an erlang shell with flags"
      (interactive (list (read-string "Flags: ")))
      (setq inferior-erlang-machine-options (split-string flags))
      (erlang-shell))

    ;; map Ctrl-c Ctrl-z to the new function
    (global-set-key "\C-c\C-z" 'erl-shell-with-flags)

Now when you start the erlang shell, a "Flags: " prompt will be presented. Simply add flags as you would on the command line and the shell will start up. Great for when you need multiple shells with different snames, names, cookies, etc…

Connect to remote erlang shell while inside emacs

While developing my top secret project, I have been getting into the fun stuff in Erlang and Emacs. Connecting to a running instance of my app from a remote shell wasn't straightforward to me at first, so below is my documented way of connecting, as well as dropping into the Erlang JCL from within an Emacs erlang shell.

  1. Start yaws: yaws --daemon -sname appname --conf /path/to/yaws.conf
  2. Start Emacs, and from within Emacs start an Erlang shell with C-c C-z (assuming you have Distel configured).
  3. From the Emacs Erlang shell, get into Erlang's JCL by typing C-q C-g and pressing enter. A ^G will be printed at the prompt, but won't be evaluated until you press enter. You should see the familiar JCL prompt "User switch command -->".
  4. Type 'j' to see the jobs you currently have running locally, which is probably just the current shell (1 {shell,start,[init]}).
  5. Type 'r appname@compy' to connect to the remote node identified by appname (from the -sname parameter) on the computer compy (usually whatever hostname returns).
  6. Type 'j' to see the current jobs, which should list your current shell as "1 {shell,start,[init]}" and a second shell as "2* {appname@compy,shell,start,[]}".
  7. Type 'c 2' to connect to the remote shell. You can now run commands in the node's shell. You may have to press enter again to bring up a shell prompt.

Put together, a session looks like this:

    james@compy 14:33:34 ~/dev/erlang/app
    > yaws --daemon -sname app --conf config/yaws.conf

    james@compy 14:34:00 ~/dev/erlang/app
    > emacs
    Eshell V5.7.4  (abort with ^G)
    1> ^G

    User switch command
     --> j
       1* {shell,start,[init]}
     --> r app@compy
     --> j
       1  {shell,start,[init]}
       2* {app@compy,shell,start,[]}
     --> c 2

    1>
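
If the r command doesn't produce a second job, it can help to check connectivity from the local shell first. The sketch below is only an illustration (the module and function names are made up). It uses erlang:is_alive/0 to see whether the local node is distributed at all, and net_adm:ping/1 to see whether the remote node answers. I'm assuming the usual Erlang distribution rules here: the local shell needs a node name of its own (for example via an -sname flag), and both nodes need to share the same cookie.

```erlang
-module(remote_check_sketch).
-export([check/1]).

%% Hypothetical helper: report why a remote node might not be
%% connectable from this shell.
check(Node) when is_atom(Node) ->
    case erlang:is_alive() of
        false ->
            %% The local shell was started without -sname/-name.
            {error, local_node_not_distributed};
        true ->
            case net_adm:ping(Node) of
                pong -> ok;
                pang -> {error, unreachable}
            end
    end.
```

Calling remote_check_sketch:check('app@compy') from the Emacs shell returns ok when the yaws node can be reached, and an {error, Reason} tuple otherwise.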

PHP's json_last_error

A quick note that I hope Google picks up concerning PHP's json_last_error function. I was trying to debug a JSON string I was decoding with json_decode, but was getting NULL. When I tried to call json_last_error(), I got a fatal undefined function error. The reason: json_last_error doesn't exist in PHP versions < 5.3. Ah, version numbers! So, check your PHP version if the function is undefined. Simple, yet a detail easily overlooked.