Stateless Property-based testing with Erlang and PropEr

In preparation for my talk at CodeBEAM, I thought I'd walk through some stateless property-based tests as an introduction to property-based testing.

Fine reading and viewing options

First things first, get yourself a copy of the Property-Based Testing book.

Second, there are a number of blog posts and videos online about property-based testing that are far better resources than I could write. In no particular order:

  1. An introduction to property-based testing (using F#) - I like the approach of an adversarial relationship between the code writer and the test writer. How can the test writer ensure the code writer is properly implementing a function without cheating on the implementation just to get tests to pass?
  2. Code checking automation - John Hughes breaks down testing some SMS processing code using QuickCheck and finding an inconsistency in the spec.
  3. Don't write tests! - John Hughes talks about why testing is hard and how property-based testing can make it easier (and maybe more fun!).
  4. Testing web services with QuickCheck - Using QuickCheck (and stateful property testing) to test web services.
  5. The Hitchhiker's Guide to the Unexpected - Fred Hebert talks about using property testing to test his supervisor hierarchy and how it reacts to failure.

Basically any video where John Hughes or Thomas Arts talk QuickCheck or property-based testing is worth a watch!

PropEr

We use PropEr, an open-source, QuickCheck-inspired testing tool, to develop our tests. The homepage for the project has links to tutorials and other helpful information specifically related to using PropEr.

The Code

Briefly, we want to test the higher-level properties of a function instead of explicitly enumerating the inputs and outputs we expect.
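To make that contrast concrete before we get to the real code, here is a minimal sketch of a property that has nothing to do with camel case. The module and property names are placeholders of mine, and it assumes PropEr is on the code path.

```erlang
-module(prop_reverse_sketch).
-include_lib("proper/include/proper.hrl").
-export([prop_reverse_twice/0]).

%% Instead of listing input/output pairs by hand, state a rule that
%% must hold for every generated list: reversing twice is a no-op.
prop_reverse_twice() ->
    ?FORALL(List
           ,list(integer())
           ,lists:reverse(lists:reverse(List)) =:= List
           ).
```

Calling proper:quickcheck(prop_reverse_sketch:prop_reverse_twice()) generates the lists for us and checks the rule against each one.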

Let us consider the function camel_to_snake/1, which takes a camel-case string (such as ThisIsCamelCase) and converts it to snake case (this_is_camel_case).

camel_to_snake(<<First, Bin/binary>>) ->
    iolist_to_binary([to_lower_char(First)
		     ,[maybe_camel_to_snake(Char) || <<Char>> <= Bin]
		     ]).

maybe_camel_to_snake(Char) ->
    case is_upper_char(Char) of
	'false' -> Char;
	'true' ->
	    [$_, to_lower_char(Char)]
    end.

is_upper_char(Char) ->
    Char >= $A andalso Char =< $Z.

to_lower_char(Char) when is_integer(Char), $A =< Char, Char =< $Z -> Char + 32;
to_lower_char(Char) -> Char.

Pretty simple implementation - the first character is just lowercased, and for every character after it, if the character is uppercase, replace it with an underscore followed by the lowercase version of the character.

Now, if we were doing unit tests alone, we might write:

camel_to_snake_test_() ->
    Tests = [{<<"Test">>, <<"test">>}
	    ,{<<"TestKey">>, <<"test_key">>}
	    ,{<<"testKey">>, <<"test_key">>}
	    ,{<<"TestKeySetting">>, <<"test_key_setting">>}
	    ,{<<"testKeySetting">>, <<"test_key_setting">>}
	    ,{<<"TEST">>, <<"t_e_s_t">>}
	    ],
    [?_assertEqual(To, camel_to_snake(From))
     || {From, To} <- Tests
    ].

And for a relatively trivial function like this, we're reasonably confident those tests are adequate:

make eunit
...
  module 'camel_tests'
    camel_tests:40: camel_to_snake_test_...ok
    camel_tests:40: camel_to_snake_test_...ok
    camel_tests:40: camel_to_snake_test_...ok
    camel_tests:40: camel_to_snake_test_...ok
    camel_tests:40: camel_to_snake_test_...ok
    camel_tests:40: camel_to_snake_test_...ok
    [done in 0.018 s]
...

Let's see if we can get some more coverage of the input space with property tests!

Properties to test

We have a pretty basic property here - take a string and convert uppercase characters to _{lowercase}.

So how can we generate inputs and outputs that will help test the conversion?

Well, we can generate the CamelCase string character by character and build the snake_case version as we go.

First, the high level property test:

prop_morph() ->
    ?FORALL({Camel, Snake}
	   ,camel_and_snake()
	   ,?WHENFAIL(io:format("~nfailed to morph '~s' to '~s'~n", [Camel, Snake])
		     ,Snake =:= camel_to_snake(Camel)
		     )
	   ).

So `camel_and_snake/0` is a generator that returns a 2-tuple with the built strings. The property compares the result of `camel_to_snake/1` against the generated Snake, and if the property fails (evaluates to false), we output the failing pair for analysis.

camel_and_snake() ->
    camel_and_snake(30). % we don't want overly huge strings so cap them to 30 characters for now

camel_and_snake(Length) ->
    ?LET(CamelAndSnake
	,camel_and_snake(Length, [])
	,begin
	     %% we create a list of [{CamelChar, SnakeChar}]; unzipping results in {CamelChars, SnakeChars} as iolist() data
	     {Camel, Snake} = lists:unzip(lists:reverse(CamelAndSnake)),
	     {list_to_binary(Camel), list_to_binary(Snake)}
	 end
	).

%% Create a list of [{CamelChar, SnakeChar}]
camel_and_snake(0, CamelAndSnake) ->
    CamelAndSnake;
camel_and_snake(Length, CamelAndSnake) ->
    CandS = oneof([upper_char()
		  ,lower_char()
		  ]),

    camel_and_snake(Length-1
		   ,[CandS | CamelAndSnake]
		   ).

%% Lower chars are easy - whatever the camel gets the snake gets too
lower_char() ->
    ?LET(Lower
	 ,choose($a,$z)
	 ,{Lower, Lower}
	).

%% Uppercase just requires a little math
upper_char() ->
    ?LET(Upper
	,choose($A,$Z)
	,{Upper, [$_, Upper+32]}
	).
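Before running the property, it can help to look at what a generator actually produces. A small sketch, assuming PropEr is available: the module name and the helper names upper_pair/0 and peek/0 are mine, and upper_pair/0 simply mirrors upper_char/0 above.

```erlang
-module(gen_peek_sketch).
-include_lib("proper/include/proper.hrl").
-export([upper_pair/0, peek/0]).

%% Same shape as upper_char/0 above: the camel character paired with
%% the snake characters it should turn into.
upper_pair() ->
    ?LET(Upper
        ,choose($A, $Z)
        ,{Upper, [$_, Upper + 32]}
        ).

%% proper_gen:pick/1 produces a single random instance of a generator.
peek() ->
    {ok, {Camel, Snake}} = proper_gen:pick(upper_pair()),
    io:format("camel ~c -> snake ~s~n", [Camel, Snake]),
    {Camel, Snake}.
```

That's only a spot check on one generator; the real test is still prop_morph/0.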

Running it the first time we see:

(prop_morph)......!
Failed: After 4 test(s).
{<<70,74,101,101,81,90,81,72,112,117,84,74,106,100,100,67,90,98,98,122,80,107,81,77,99,79,113,82,84,99>>,<<95,102,95,106,101,101,95,113,95,122,95,113,95,104,112,117,95,116,95,106,106,100,100,95,99,95,122,98,98,122,95,112,107,95,113,95,109,99,95,111,113,95,114,95,116,99>>}

failed to morph 'FJeeQZQHpuTJjddCZbbzPkQMcOqRTc' to '_f_jee_q_z_q_hpu_t_jjdd_c_zbbz_pk_q_mc_oq_r_tc'

Shrinking ............................................(44 time(s))
{<<65,65,65,65,65,65,65,65,65,65,65,65,65,65,65,65,65,65,65,65,65,65,65,65,65,65,65,65,65,65>>,<<95,97,95,97,95,97,95,97,95,97,95,97,95,97,95,97,95,97,95,97,95,97,95,97,95,97,95,97,95,97,95,97,95,97,95,97,95,97,95,97,95,97,95,97,95,97,95,97,95,97,95,97,95,97,95,97,95,97,95,97>>}

failed to morph 'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAA' to '_a_a_a_a_a_a_a_a_a_a_a_a_a_a_a_a_a_a_a_a_a_a_a_a_a_a_a_a_a_a'

Ah, right, if the first character is uppercase, we don't want to introduce the underscore. Let's fix the generator to account for the first character situation:

%% First, we need to choose a different start
camel_and_snake(0, CamelAndSnake) ->
    CamelAndSnake;
camel_and_snake(Length, []) ->
    CandS = oneof([first_upper_char()
		  ,lower_char()
		  ]),
    camel_and_snake(Length-1, [CandS]);
camel_and_snake(Length, CamelAndSnake) ->
    CandS = oneof([upper_char()
		  ,lower_char()
		  ]),

    camel_and_snake(Length-1
		   ,[CandS | CamelAndSnake]
		   ).

%% We define first_upper_char to not include the underscore for the snake version
first_upper_char() ->
    ?LET(Upper
	,choose($A,$Z)
	,{Upper, [Upper+32]}
	).

Running this through PropEr gives us:

proper_test_ (prop_morph) .......................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
OK: Passed 500 test(s).

Ship it!
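The proper_test_ name in the output above comes from driving the property through EUnit. That wrapper isn't shown here, so this is only a sketch of one way to write it: the module name and the stand-in property are assumptions of mine, and in the real test module the call would be to prop_morph/0.

```erlang
-module(camel_proper_sketch).
%% proper.hrl goes first: both headers define a ?LET macro and
%% eunit.hrl only defines its own if none exists yet.
-include_lib("proper/include/proper.hrl").
-include_lib("eunit/include/eunit.hrl").

%% Stand-in property so the sketch compiles on its own.
prop_placeholder() ->
    ?FORALL(Char, choose($A, $Z), Char + 32 >= $a).

%% EUnit test generator: run the property 500 times and send PropEr's
%% progress output to the 'user' device so it shows up in the console.
proper_test_() ->
    {"prop_placeholder"
    ,{timeout, 60
     ,?_assert(proper:quickcheck(prop_placeholder()
                                ,[{numtests, 500}, {to_file, user}]
                                ))
     }
    }.
```

The timeout tuple is there because EUnit's default per-test timeout is short, and a few hundred runs of a slower property can exceed it.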

Not quite ready

One of the joys of KAZOO is getting to learn about how telecom works in the wider world. This forces Unicode front and center when dealing with user input (on the plus side, we can use emojis for phone extensions!).

Looking at our code, we see we've naively handled just the ASCII letters of the Latin alphabet. For inspiration, we open up string.erl to see how it handles upper/lowercase conversions:

%% ISO/IEC 8859-1 (latin1) letters are converted, others are ignored
%%

to_lower_char(C) when is_integer(C), $A =< C, C =< $Z ->
    C + 32;
to_lower_char(C) when is_integer(C), 16#C0 =< C, C =< 16#D6 ->
    C + 32;
to_lower_char(C) when is_integer(C), 16#D8 =< C, C =< 16#DE ->
    C + 32;
to_lower_char(C) ->
    C.

to_upper_char(C) when is_integer(C), $a =< C, C =< $z ->
    C - 32;
to_upper_char(C) when is_integer(C), 16#E0 =< C, C =< 16#F6 ->
    C - 32;
to_upper_char(C) when is_integer(C), 16#F8 =< C, C =< 16#FE ->
    C - 32;
to_upper_char(C) ->
    C.

Let's adjust our generators first to see failing cases:

lower_char() ->
    ?LET(Lower
	,union([choose($a,$z)
	       ,choose(16#E0,16#F6)
	       ,choose(16#F8,16#FE)
	       ])
	,{Lower, [Lower]}
	).

first_upper_char() ->
    ?LET(Upper
	,union([choose($A,$Z)
	       ,choose(16#C0,16#D6)
	       ,choose(16#D8,16#DE)
	       ])
	,{Upper, [Upper+32]}
	).

upper_char() ->
    ?LET(Upper
	,union([choose($A,$Z)
	       ,choose(16#C0,16#D6)
	       ,choose(16#D8,16#DE)
	       ])
	,{Upper, [$_, Upper+32]}
	).

Running this, we get some nice failures:

 proper_test_ (prop_morph)...!
Failed: After 1 test(s).
{<<222,240,240,198,253,220,75,212,233,76,248,110,77,83,229,99,195,88,216,250,246,67,227,237,103,240,217,253,220,221>>,<<254,240,240,95,230,253,95,252,95,107,95,244,233,95,108,248,110,95,109,95,115,229,99,95,227,95,120,95,248,250,246,95,99,227,237,103,240,95,249,253,95,252,95,253>>}
failed to morph 'ÞððÆýÜKÔéLønMSåcÃXØúöCãígðÙýÜÝ' to 'þðð_æý_ü_k_ôé_løn_m_såc_ã_x_øúö_cãígð_ùý_ü_ý'

Shrinking ..............................................................(62 time(s))
{<<192,65,65,65,65,65,65,65,65,65,65,65,65,65,65,65,65,65,65,65,65,65,65,65,65,65,65,65,65,65>>,<<224,95,97,95,97,95,97,95,97,95,97,95,97,95,97,95,97,95,97,95,97,95,97,95,97,95,97,95,97,95,97,95,97,95,97,95,97,95,97,95,97,95,97,95,97,95,97,95,97,95,97,95,97,95,97,95,97,95,97>>}
failed to morph 'ÀAAAAAAAAAAAAAAAAAAAAAAAAAAAAA' to 'à_a_a_a_a_a_a_a_a_a_a_a_a_a_a_a_a_a_a_a_a_a_a_a_a_a_a_a_a_a'

You can also see that, because we wanted 30-character strings, PropEr generates a shrunk version that is still 30 characters long. Let's inform PropEr that we want to constrain the length of the strings but be a little more flexible so PropEr can generate better failing test cases:

prop_morph() ->
    ?FORALL({Camel, Snake}
	   ,resize(20, camel_and_snake())
	   ,?WHENFAIL(io:format('user', "~nfailed to morph '~s' to '~s'~n", [Camel, Snake])
		     ,Snake =:= camel_to_snake(Camel)
		     )
	   ).

camel_and_snake() ->
    ?SIZED(Length, camel_and_snake(Length)).

You can read more about resize/2 and the ?SIZED macro in the PropEr documentation, but basically they let PropEr know to constrain the size a bit, with more flexibility than a static length.
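To see the two pieces in isolation, here is a small sketch separate from the camel-case generators. The module and function names are mine; vector/2, resize/2 and ?SIZED are PropEr's.

```erlang
-module(sized_sketch).
-include_lib("proper/include/proper.hrl").
-export([lower_string/0, fixed_size_string/0, prop_fixed_length/0]).

%% ?SIZED hands the generator PropEr's current size parameter;
%% here the size is used directly as the string length.
lower_string() ->
    ?SIZED(Size, vector(Size, choose($a, $z))).

%% resize/2 overrides the size parameter seen by the wrapped generator.
fixed_size_string() ->
    resize(5, lower_string()).

%% With the size pinned to 5, every generated string has 5 characters.
prop_fixed_length() ->
    ?FORALL(String
           ,fixed_size_string()
           ,length(String) =:= 5
           ).
```

Without the resize/2 wrapper, lower_string/0 follows whatever size PropEr is currently using for the test run.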

Running the tests now:

proper_test_ (prop_morph)...!
Failed: After 1 test(s).
{<<220>>,<<252>>}

failed to morph 'Ü' to 'ü'

Shrinking ..(2 time(s))
{<<192>>,<<224>>}

failed to morph 'À' to 'à'

Much easier!

Let's adjust our implementation to account for these upper/lower bounds:

is_upper_char(Char) ->
    (Char >= $A andalso Char =< $Z)
	orelse (16#C0 =< Char andalso Char =< 16#D6)
	orelse (16#D8 =< Char andalso Char =< 16#DE).

to_lower_char(Char) when is_integer(Char), $A =< Char, Char =< $Z -> Char + 32;
to_lower_char(Char) when is_integer(Char), 16#C0 =< Char, Char =< 16#D6 ->
    Char + 32;
to_lower_char(Char) when is_integer(Char), 16#D8 =< Char, Char =< 16#DE ->
    Char + 32;
to_lower_char(Char) -> Char.

And the tests now pass nicely:

 proper_test_ (prop_morph) .......................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
OK: Passed 500 test(s).

Honing the skill

Thinking in properties is not necessarily intuitive at first. There are many candidate properties to test, so pick the ones that apply to the particulars of the code you're testing. It also helps to keep the properties as simple as possible at first. As you build confidence in the generators and property tests, you can layer on more complex properties to ensure the tests capture the behavior of your code.
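As an illustration of starting simple, here is a sketch of a property that doesn't need a paired camel/snake generator at all. It is not from the test suite above: the module name and property are mine, and the module carries a condensed ASCII-only copy of camel_to_snake/1 so that it runs on its own.

```erlang
-module(prop_simple_sketch).
-include_lib("proper/include/proper.hrl").
-export([camel_to_snake/1, prop_no_ascii_upper/0]).

%% Condensed ASCII-only copy of the function under test.
camel_to_snake(<<First, Rest/binary>>) ->
    iolist_to_binary([lower(First), [snake(C) || <<C>> <= Rest]]).

snake(C) when C >= $A, C =< $Z -> [$_, C + 32];
snake(C) -> C.

lower(C) when C >= $A, C =< $Z -> C + 32;
lower(C) -> C.

%% Simple property: whatever mix of ASCII letters goes in,
%% no ASCII uppercase letter comes out.
prop_no_ascii_upper() ->
    ?FORALL(Chars
           ,non_empty(list(oneof([choose($a, $z), choose($A, $Z)])))
           ,begin
                Snake = camel_to_snake(list_to_binary(Chars)),
                [] =:= [C || <<C>> <= Snake, C >= $A, C =< $Z]
            end
           ).
```

A property like this says nothing about where the underscores go, but it is cheap to write and makes a reasonable first layer before a model-based property like prop_morph/0.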

As you progress down the road of property testing, you will hopefully find that you force yourself to think more deeply about your code and head off issues before the tests locate them. A great exercise is to pair with a teammate, have one write the implementation and one write the property tests, and compete to see if the tester can find bugs in the implementation.