You get paid for this?

Spotted in some high-priced "expert"'s code:

switch ($retcode)
{
    case -1:
    case -3:
        if ($retcode==-1)
            log("SOME_CODE", "SOME MSG");
        else
            log("SOME_OTHER_CODE", "SOME OTHER MSG");
...

Resolving Dialyzer "Function foo/n has no local return" errors

Dialyzer is a great static analysis tool for Erlang and has helped me catch many bugs related to what types I thought I was passing to a function versus what actually gets passed. Some of the errors Dialyzer emits are rather cryptic at first (as seems commonplace in the Erlang language/environment in general), but once you understand the cause of an error, the fix is easy to recognize. My most common error is Dialyzer inferring a different return type than what I put in my -spec, followed by Dialyzer telling me the same function has no local return. An example:

foo.erl:125: The specification for foo:init/1 states that the function might also return {'ok',tuple()} but the inferred return is none()
foo.erl:126: Function init/1 has no local return

The init/1 function (for a gen_server, btw):

-spec(init/1 :: (Args :: list()) -> tuple(ok, tuple())).
init(_) ->
  {ok, #state{}}.

And the state record definition:

-record(state, {
  var_1 = {} :: tuple(string(), tuple())
  ,var_2 = [] :: list(tuple(string(), tuple()))
}).

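Spot the error? In the record definition, var_1 is initialized to an empty tuple and var_2 is initialized to an empty list, yet the types declared for the record fields do not allow those default values. Because #state{} can never satisfy its own declared types, Dialyzer infers that init/1 cannot return. The corrected version: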

-record(state, {
  var_1 = {} :: tuple(string(), tuple()) | {}
  ,var_2 = [] :: list(tuple(string(), tuple())) | []
}).

And now Dialyzer stops emitting the spec error and the no local return error.
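For comparison, here is a minimal sketch of the same idea in the type syntax used by current Erlang/OTP releases. The module name foo_types is made up for illustration, and the field types are only my reading of the record above. Note that in this syntax a list type such as [T] already includes the empty list, so only the tuple field needs the extra alternative.

```erlang
-module(foo_types).
-export([init/1]).

%% Every default value must be a member of its field's declared type;
%% otherwise Dialyzer concludes that #state{} can never be constructed.
-record(state, {
    var_1 = {} :: {} | {string(), tuple()},  % {} listed explicitly
    var_2 = [] :: [{string(), tuple()}]      % [T] already includes []
}).

-spec init(Args :: list()) -> {ok, #state{}}.
init(_Args) ->
    {ok, #state{}}.
```

Calling foo_types:init([]) returns the record with both defaults in place, and the spec and the inferred type agree.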

IT Expo

Just returned from IT Expo West last night. Three days of learning, hob-nobbing, and talking myself hoarse about the awesomeness that is 2600hz. We got a decent writeup posted on TMC's site, met quite a few people, collected beaucoup business cards, and generally had a fun time hanging with the team. Super tired but ready to keep building the best hosted PBX software platform! Bonus: See Darren's awesome (yet mildly awkward) video interview! Also, VoIP service providers looking to offset calling costs for their business clients can look at PromoCalling as a way to compete with Google and Skype's free calling plans.

Still Kicking

I am still alive and well; just busy. I did write a blog entry for my company, 2600hz. More to come…eventually.

Dealing with CouchDB and Couchbeam

I'll put this disclaimer up right away: my experience with CouchDB and Couchbeam is limited, so I am definitely missing some knowledge and understanding that would probably make things easier. I won't detail CouchDB's offerings here, given the limited insight I just admitted to. I will try to comment on using Couchbeam to get documents into and out of CouchDB, as well as on writing a layer that takes Couchbeam's document format and cleans it up into slightly prettier Erlang data structures (proplists of binaries just aren't fun to type).

The first thing to do is connect to your database and get the PIDs for the connection and the DB using couchbeam_server:start_connection_link/{0,1} and couchbeam_db:open_or_create/2. Couchbeam allows you to name the DB PID so you don't have to carry it around with you; just pass {DB_NAME_AS_ATOM, "DB_NAME_AS_LIST"} as the second parameter to open_or_create/2. Now calls can use DB_NAME_AS_ATOM instead of the DB process's PID to communicate.

The biggest stumbling block out of the gate is the datatypes used by Couchbeam. Look at the API docs and get familiar with the types you'll be expected to provide to, and receive from, the calls to the database and views. A document is either a json_object() or a 2-tuple {DocID, Revision}. A json_object() is a proplist of [{json_string(), json_term()}]. A json_string() is simple: an atom or a binary. A json_term() is an atom, binary, integer, float, a list of json_term()s, or another json_object(). Practically, for my needs, all keys are binaries and the majority of values are either binaries or numeric. I do have one place where I need to store a nested json_object(); wrap it in a 1-tuple as the value and you should be good.

%% nesting a json_object() as a value in a json_object()
ToNest = [{<<"...">>, <<"...">>}],
[{<<"...">>, <<"...">>},
 {<<"...">>, { ToNest } }].

Saving a doc is pretty easy; create the proplist and pass it to couchbeam_db:save_doc/2. One big gotcha when saving a document was whether the <<"_rev">> key existed. If the _rev key was present but its value was undefined, a big ugly badmatch error was thrown with all sorts of chaos going on around it.
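One way to sidestep the undefined-revision gotcha is to drop undefined values from the proplist before handing it to the save call. This is only a sketch: the module doc_clean and the function strip_undefined/1 are hypothetical names of mine, not part of Couchbeam, and it assumes the document is a flat proplist.

```erlang
-module(doc_clean).
-export([strip_undefined/1]).

%% Remove every {Key, undefined} pair so that a brand new document
%% simply has no <<"_rev">> key instead of an undefined one.
%% (Hypothetical helper; not a Couchbeam function.)
-spec strip_undefined([{term(), term()}]) -> [{term(), term()}].
strip_undefined(Doc) ->
    [{K, V} || {K, V} <- Doc, V =/= undefined].
```

The cleaned list can then be passed as the document argument of the save call described above.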

Erlang and Webmachine Tres

In my first post, I talked about the basic technologies I'd be using, and how to get an interactive shell started in emacs that started up your webmachine application. In this third post, I'll talk a little about building a session handling module. This will hopefully familiarize you with using redirects and cookies in your webmachine resources and using Couchbeam to store and retrieve documents. The high-level usage for sessions I have currently is:

  1. User agent accesses a supported resource (probably a browser accessing the home page initially).
    • I am defining the is_authorized/2 function in all of my resource files that correspond to web pages, as well as the finish_request/2 function.
    • The second parameter in the functions is a record, currently defined as

      -record(context, {session}).
      
  2. In the is_authorized/2 function for public facing pages, it returns the tuple:

    is_authorized(RD, Context) ->
        {true, RD, Context#context{session=session:start(RD)}}.
    
    
    • Within session:start/1, I extract the session cookie (if it exists) and retrieve the session document from CouchDB. If the session cookie isn't set, I create a new #session{} record with some defaults pre-filled.

    %% session.hrl
    -record(session, {'_id', '_rev', user_id, expires, created, storage=[]}).

    %% session.erl
    start(RD) -> get_session_rec(RD).

    get_session_id(RD) -> wrq:get_cookie_value(?COOKIE_NAME, RD).

    %% get_session_doc(request()) -> not_found | json_object()
    get_session_doc(RD) ->
        case get_session_id(RD) of
            undefined -> not_found;
            SessDocId -> couchbeam_db:open_doc(?DB, SessDocId)
        end.

    %% get_session_rec(request()) -> session_rec()
    get_session_rec(RD) ->
        case get_session_doc(RD) of
            not_found -> new();
            Doc -> from_couchbeam(Doc)
        end.

    new() ->
        Now = calendar:datetime_to_gregorian_seconds(calendar:local_time()),
        #session{created=Now, expires=?MAX_AGE, '_id'=couchbeam_util:new_uuid()}.

    • So first I pull out the session ID from the cookie header in get_session_id/1, which will return either undefined or the ID.
    • If the session ID is undefined, get_session_doc/1 returns the not_found atom, which is what Couchbeam returns when it can't find a document. If a session ID is found, I call couchbeam_db:open_doc/2 to retrieve the document from CouchDB.
    • Finally, if get_session_doc/1 returns not_found, I create a new session record by calling new/0. If a document is returned, I call a function that transforms the proplist into the session record.
    • Note: Couchbeam will auto-assign your '_id' if the document to be saved doesn't have one or it is undefined. While useful, for sessions it causes a problem, as the auto-generated UUIDs are fairly sequential (not exactly, but usually only the last four or five characters differ), which makes session IDs easy to guess. Calling couchbeam_util:new_uuid() creates random UUIDs instead.
    • I store the created time in seconds, as well as the expires time (30 minutes worth of seconds right now). This allows me to easily clear ended sessions from CouchDB with a view indexing the session documents by (created+expires).
  3. Now, as the resource request is processed, I can always access the session through the Context parameter.
  4. When the request is done processing, it is time to save the session and set the cookie. In wm_resource:finish_request/2 I have:

    finish_request(RD, Context) ->
        {true, session:finish(Context#context.session, RD), Context}.
    

    And in session.erl, I define the finish/2 process:

    %% finish(#session(), request()) -> request().
    %%  save session and set cookie headers
    finish(S, RD) -> save_session_rec(S, RD).
    
    save_session_rec(S, RD) ->
        DateTime = calendar:local_time(),
        Now = calendar:datetime_to_gregorian_seconds(DateTime),
        case has_expired(S#session.created + S#session.expires) of
            true -> close(S, RD);
            false ->
                D = to_couchbeam(S#session{created=Now,expires=?MAX_AGE}),
                S1 = from_couchbeam(couchbeam_db:save_doc(?DB, D)),
                set_cookie_header(RD, S1, DateTime, ?MAX_AGE)
        end.
    
    close(S, RD) ->
        DateTime = calendar:local_time(),
        SessionDoc = to_couchbeam(S),
        couchbeam_db:delete_doc(?DB, SessionDoc),
        set_cookie_header(RD, S, DateTime, -1).
    
    set_cookie_header(RD, Session, DateTime, MaxAge) ->
        {CookieHeader, CookieValue} = mochiweb_cookies:cookie(?COOKIE_NAME, Session#session.'_id', [{max_age, MaxAge},
                                                                                                    {local_time, DateTime}]),
        wrq:set_resp_header(CookieHeader, CookieValue, RD).
    
    • So to finish a session, I check if the current session has expired, and if so close it down. Otherwise, I update the created and expires properties of the session, save it to CouchDB, and then set the cookie headers to reflect the new times.
  5. For a resource needing valid authentication to be accessible, the is_authorized/2 function is slightly different:

    %% wm_resource.erl
    is_authorized(RD, Context) ->
        S = session:start(RD),
        case session:is_authorized(S) of
            true -> {true, RD, Context#context{session=S}};
            false ->
                RD0 = wrq:do_redirect(true, RD),
                RD1 = wrq:set_resp_header("Location", "/login", RD0),
                {{halt, 307}, RD1, Context#context{session=S}}
        end.
    
    %% session.erl
    is_authorized(S) ->
        S#session.user_id =/= undefined andalso
            not has_expired(S#session.created + S#session.expires).
    
    • Obviously you can set your own redirect header. The {halt, 307} is a temporary redirect (since hopefully you've only forgotten to log in, or been away too long and your session expired).
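The helpers has_expired/1, to_couchbeam/1 and from_couchbeam/1 are used above but not shown. Here is a rough sketch of what they could look like, written as a standalone module so it compiles on its own. The module name session_sketch, the two-argument has_expired/2, and the assumption that a session document is a flat proplist with binary keys (with storage nested in a 1-tuple) are all mine, not necessarily what the real session.erl does.

```erlang
-module(session_sketch).
-export([has_expired/1, has_expired/2, to_couchbeam/1, from_couchbeam/1]).

-record(session, {'_id', '_rev', user_id, expires, created, storage = []}).

%% ExpiresAt is created + expires, in gregorian seconds.
has_expired(ExpiresAt) ->
    Now = calendar:datetime_to_gregorian_seconds(calendar:local_time()),
    has_expired(ExpiresAt, Now).

%% Same check with the clock passed in, which is easier to test.
has_expired(ExpiresAt, Now) ->
    ExpiresAt < Now.

%% Record -> proplist of binary keys. Undefined fields (such as '_rev'
%% on a brand new session) are left out entirely, and the storage
%% proplist is wrapped in a 1-tuple so it is stored as a nested object.
to_couchbeam(#session{} = S) ->
    Fields = [{<<"_id">>, S#session.'_id'},
              {<<"_rev">>, S#session.'_rev'},
              {<<"user_id">>, S#session.user_id},
              {<<"expires">>, S#session.expires},
              {<<"created">>, S#session.created},
              {<<"storage">>, {S#session.storage}}],
    [{K, V} || {K, V} <- Fields, V =/= undefined].

%% Proplist -> record. Missing keys come back as undefined.
from_couchbeam(Doc) when is_list(Doc) ->
    Storage = case proplists:get_value(<<"storage">>, Doc, {[]}) of
                  {Nested} -> Nested;
                  Other -> Other
              end,
    #session{'_id' = proplists:get_value(<<"_id">>, Doc),
             '_rev' = proplists:get_value(<<"_rev">>, Doc),
             user_id = proplists:get_value(<<"user_id">>, Doc),
             expires = proplists:get_value(<<"expires">>, Doc),
             created = proplists:get_value(<<"created">>, Doc),
             storage = Storage}.
```

Keeping the conversion in one place means the rest of the session code only ever sees #session{} records, and the undefined '_rev' of a new session never reaches the database layer.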

Erlang and Webmachine

I'm currently working on a small startup project, for one to meet a need of some acquaintances, but more importantly to learn me some Erlang with regards to the web. While I'm further along than I actually expected to be, I thought I'd begin documenting the steps I've taken towards building this app. The current nerdities I'm using:

Installation of all of these on a GNU/Linux system is pretty straightforward, so I won't cover that here. Defaults were used for Erlang. I installed the other libraries/applications in ~/dev/erlang/lib and pointed $ERL_LIBS there in my .bashrc. I did follow this guide for setting up Tsung. The BeeBole site has several other pages worth reading for developing web applications in Erlang. Once installed, build the webmachine project:

$WEBMACHINE_HOME/scripts/new_webmachine.erl wm_app /path/to/root
cd /path/to/root/wm_app
make
./start.sh

You now have a working project! Of course, I like to have my Erlang shell inside of emacs while I'm developing, so I added a comment to the start.sh script that contained the shell parameters. My start.sh looks like this:

#!/bin/sh

# for emacs C-c C-z flags:
# -pa ./ebin -pa ./priv/templates/ebin -boot start_sasl -s wm_app

cd `dirname $0`
exec erl -pa $PWD/ebin $PWD/deps/*/ebin $PWD/deps/*/deps/*/ebin $PWD/priv/templates/ebin -boot start_sasl -s wm_app

I currently have all of my dependencies in $ERL_LIBS; when I deploy this to production, I'll add the libs to wm_app/deps, either as symlinks or copied into the directory. Having the custom shell means you need the .emacs code to start an Erlang shell with custom flags. Important note: if you need to specify multiple code paths, you have to use a separate -pa for each path in emacs, unlike in the shell command version, where any path after the -pa (or -pz) is added. Another caveat: when starting the Erlang shell within emacs, if you're currently in an Erlang-related buffer (.erl, .hrl, etc.), the default shell is started without the option to set flags. I typically have start.sh open anyway to copy the flags, so I don't run into this much anymore; I'm documenting it here just in case anyone stumbles on it. Now you have a shell within which to execute commands against your webmachine app, load updated modules, etc. Coming up, I'll talk about how I'm using ErlyDTL to create templates and using CouchDB/Couchbeam for the document store.

Erlang, Euler, Primes, and gen_server

I have been working on Project Euler problems for a while now and many of them have centered around prime numbers. I've referenced my work with the sieve in other posts, but found with a particular problem that some of my functions could benefit from some state being saved (namely the sieve being saved and not re-computed each time). The problem called for counting the prime factors of numbers and finding consecutive numbers that had the same count of prime factors. My primes module had a prime_factors/1 function that would compute the prime factors and the exponents of those factors (so 644 = 2^2 * 7 * 23, and primes:prime_factors(644) would return [{2,2},{7,1},{23,1}]). The prime_factors/1 looked something like this:

prime_factors(N) ->
    CandidatePrimes = primes:queue(N div 2),
    PrimeFactors = [ X || X <- CandidatePrimes, N rem X =:= 0 ],
    find_factors(PrimeFactors, N, 0, []).

The call to find_factors/4 takes the factors and finds the exponents and can be ignored for now. The time sink, then, is in generating the CandidatePrimes list. I think my primes:queue/1 function is pretty fast at generating the sieve, and dividing N by two eliminates a lot of unnecessary computation, but when you're calling prime_factors/1 thousands of times, the call to queue/1 begins to add up. This is where I needed to save some state (the sieve) in between calls. Erlang, fortunately enough, has a module behavior called gen_server that abstracts away a lot of the server internals and lets you focus on the business bits of the server. I won't discuss it much here as I'm not an authority on it, but Joe Armstrong's book and the Erlang docs have been a great help in understanding what's happening behind the scenes. You can view the prime_server module to see what its current state is code-wise. To speed up prime_factors/1, I split it into two functions, prime_factors/1 and prime_factors/2. The functions look like this:

prime_factors(N, CandidatePrimes) ->
    PrimeFactors = [ X || X <- CandidatePrimes, N rem X =:= 0 ],
    find_factors(PrimeFactors, N, 0, []).

prime_factors(N) ->
    prime_factors(N, primes:queue(N div 2)).

Now, if you don't need to save the queue between calls, you can still call prime_factors/1 as usual. The prime_server module utilizes the prime_factors/2 function because it initializes its state to contain a primes sieve (either all primes under 1,000,000 if no arg is passed to start_link, or all primes =< N when using start_link/1) and the current upper bound. Now when the server handles the call for getting the factors, we pass a pared-down list of primes to prime_factors/2 and get a nice speed boost. The heavy lifting is front-loaded in the initialization of the server (generating the sieve) and in calls that increase the sieve's size. One improvement there might be to save the Table generated during the initial sieve creation and start the loop back up from where it left off (when N > UpTo), but that is for another time. If you choose your initial value for start_link right, regenerating the sieve should be unnecessary. The last speed boost was noticing that calculating the exponents was an unnecessary step, so I wrote a count_factors/1 and count_factors/2 that skip the call to find_factors/4 and return the length of the list comprehension. With these changes complete, problem 47 went from taking well over 5 minutes to just under 20 seconds to solve by brute force.
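To make the idea concrete, here is a stripped-down sketch of a gen_server that keeps a sieve in its state. It is not the actual prime_server module: the module name prime_server_sketch is hypothetical, sieve/1 is a naive trial-division filter standing in for primes:queue/1, and candidates run up to N itself rather than N div 2 so that a prime counts as its own single factor.

```erlang
-module(prime_server_sketch).
-behaviour(gen_server).

-export([start_link/1, count_factors/1, stop/0, sieve/1]).
-export([init/1, handle_call/3, handle_cast/2]).

%% The state holds the cached primes and the bound they were built for.
-record(state, {primes = [] :: [pos_integer()],
                up_to = 0 :: non_neg_integer()}).

start_link(UpTo) ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, UpTo, []).

%% Number of distinct prime factors of N.
count_factors(N) ->
    gen_server:call(?MODULE, {count_factors, N}).

stop() ->
    gen_server:stop(?MODULE).

%% The expensive part happens once, when the server starts.
init(UpTo) ->
    {ok, #state{primes = sieve(UpTo), up_to = UpTo}}.

handle_call({count_factors, N}, _From, State0) ->
    State = ensure_sieve(N, State0),
    Candidates = lists:takewhile(fun(P) -> P =< N end, State#state.primes),
    Count = length([P || P <- Candidates, N rem P =:= 0]),
    {reply, Count, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.

%% Only rebuild the sieve when a request exceeds the cached bound.
ensure_sieve(N, #state{up_to = UpTo} = State) when N =< UpTo ->
    State;
ensure_sieve(N, _State) ->
    #state{primes = sieve(N), up_to = N}.

%% Naive stand-in for the author's sieve: repeatedly filter out multiples.
sieve(N) when N < 2 -> [];
sieve(N) -> sieve(lists:seq(2, N), []).

sieve([], Acc) -> lists:reverse(Acc);
sieve([P | Rest], Acc) ->
    sieve([X || X <- Rest, X rem P =/= 0], [P | Acc]).
```

After prime_server_sketch:start_link(1000), repeated calls to prime_server_sketch:count_factors/1 for numbers up to 1000 reuse the cached list, and only a larger argument triggers a rebuild.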

Purely Functional Data Structures and Me

Quick post to say that I've put my dabblings into Chris Okasaki's Purely Functional Data Structures book up as a github repo. I am picking and choosing what exercises and examples to code, with the goal being to slowly cover the whole book. Some concessions are made to fit the ideas into Erlang (like recursively calling an anonymous function), but overall I think the ideas fit nicely. There is a streams lib from Richard Carlsson that I found on the Erlang mailing list in the repo as well that I used for reference for a couple things in chapter 4. I stuck with my streams being represented either as an empty list [] or [term() | fun()] with term() being the calculated value and fun() being the suspension function, instead of the tuple that Richard chose. After reading further in the thread, I understand why (don't confuse people that might use proper list functions on streams) but for my small examples, it was enough to use lists.
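As a small illustration of that representation, here is a sketch of a stream as either [] or an improper list [Value | Suspension]. The module and function names are made up for this example and are not taken from the repo or from Richard Carlsson's library.

```erlang
-module(stream_sketch).
-export([from/1, take/2, map/2]).

%% A stream is [] or [Value | Suspension], where Suspension is a
%% zero-arity fun that returns the rest of the stream when called.

%% The infinite stream N, N+1, N+2, ...
from(N) ->
    [N | fun() -> from(N + 1) end].

%% Force at most N elements of the stream into an ordinary list.
take(0, _Stream) -> [];
take(_N, []) -> [];
take(1, [H | _Suspension]) -> [H];   % do not force more than needed
take(N, [H | Suspension]) when N > 1 ->
    [H | take(N - 1, Suspension())].

%% Lazily apply F to every element; nothing past the head is computed
%% until the suspension is forced.
map(_F, []) -> [];
map(F, [H | Suspension]) ->
    [F(H) | fun() -> map(F, Suspension()) end].
```

Because the tail is a fun rather than a list, proper list functions such as length/1 will fail on these streams, which is the confusion the tuple representation avoids.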

Problem Solved

In the comments section of a recent Atwood post, commenter Paul Jungwirth (search for the name as I can't find comment permalinks) posted about a problem from a Perl mailing list that he would give potential hires. This post is not about the blog post but about the problem from the comments section.

The Problem (from the mailing list): Consider the following string: 1 2 3 4 5 6 7 8 9 = 2002. The problem is to add any number of addition & multiplication operations wherever you'd like on the left such that in the end you have a valid equation. So for example if it gets you to a solution you can have: 12 * 345 + 6 … if that works as part of your solution [it's much too big: 4146]. Bearing in mind that multiplication takes higher precedence than addition, what is the solution?

My answer generator can be found here in Erlang. I liked the problem because it is in the vein of Project Euler. The eval/1 is a slightly modified version of this one on TrapExit.
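For readers who want to try it, here is one possible brute-force sketch. It is not the linked answer generator: the module name puzzle_sketch and its functions are my own. Between each pair of adjacent digits there are three choices (join the digits, insert +, or insert *), giving 3^8 = 6561 candidate expressions. The search tracks the finished sum, the running product, and the number currently being built, so precedence is handled without a separate parser.

```erlang
-module(puzzle_sketch).
-export([solve/1, eval/1]).

%% All expressions over the digits 1..9 (in order) that equal Target.
solve(Target) ->
    search(lists:seq(2, 9), 0, 1, 1, "1", Target).

%% Sum:  total of the additive terms already closed off
%% Prod: product of the finished factors in the current term
%% Cur:  the number currently being built from joined digits
%% Acc:  the expression text so far, reversed
search([], Sum, Prod, Cur, Acc, Target) ->
    case Sum + Prod * Cur of
        Target -> [lists:reverse(Acc)];
        _ -> []
    end;
search([D | Rest], Sum, Prod, Cur, Acc, Target) ->
    C = $0 + D,
    %% join the digit onto the current number
    search(Rest, Sum, Prod, Cur * 10 + D, [C | Acc], Target)
        %% close the current term and start a new one
        ++ search(Rest, Sum + Prod * Cur, 1, D, [C, $+ | Acc], Target)
        %% close the current factor and keep multiplying
        ++ search(Rest, Sum, Prod * Cur, D, [C, $* | Acc], Target).

%% Independent check: evaluate a string such as "12*345+6" by
%% splitting on + and then on *.
eval(Expr) ->
    lists:sum([product(string:lexemes(Term, "*"))
               || Term <- string:lexemes(Expr, "+")]).

product(Factors) ->
    lists:foldl(fun(F, P) -> P * list_to_integer(F) end, 1, Factors).
```

Calling puzzle_sketch:solve(2002) returns the matching expressions as strings, and puzzle_sketch:eval/1 can be used to double-check each one.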