Spellcheck projects using aspell and Makefiles

Spellcheck using Aspell and Makefiles

With an international community surrounding Kazoo it is important to provide tools that help everyone stay on the same page with the development cycle.

One of the challenges, as a native speaker of English, is how easily our brains 'fix' misspelt words without us realizing it. Often enough, a PR is accepted in which a JSON key is misspelled and nobody notices. Time passes until someone spots the error, and by then JSON documents using the misspelled key must still be supported so that previously working data doesn't break (ask about Kazoo's history with 'wednesday' in the temporal routes callflow action!).

Fortunately there's a nice command line tool, GNU Aspell, that provides spell checking capabilities and custom dictionaries. Aspell does a great job suggesting alternatives when finding a potentially misspelt word, making it easy to find the correct word (or add a word to a personal dictionary).

Personal dictionary

A personal dictionary is helpful to maintain a word list separate and apart from the "default" English dictionary Aspell provides, especially concerning acronyms and jargon specific to the project. When you spellcheck files and find words that are correct but unknown to Aspell, you can add them so Aspell knows not to highlight them again. These words are added to the personal dictionary. Being a plaintext file, the personal dictionary is also easily included in source control so you can distribute it with your project.

Personal replacements

The replacement file can contain automatic replacements; when Aspell encounters a word in the replacement file, it will replace it automatically with the replacement word. Really handy for oft-misspelt words.


Aspell should be available through most GNU/Linux distros.

Makefile setup

You can easily add a make target to spellcheck a project. Create a splchk.mk file and add:

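# Declare splchk as a phony target (.PHONY is a special target, so it takes a colon)
.PHONY: splchk

DICT = .aspell.en.pws
REPL = .aspell.en.prepl

# Create the personal dictionary with its header line if it doesn't exist yet
$(DICT):
	@$(file >$(DICT),personal_ws-1.1 en 0 utf-8)

# Create the personal replacements file with its header line if it doesn't exist yet
$(REPL):
	@$(file >$(REPL),personal_repl-1.1 en 0 utf-8)

splchk: $(DICT) $(REPL) $(addsuffix .chk,$(basename $(shell find . -name "*.md" )))

# Interactively spellcheck each Markdown file
%.chk: %.md
	aspell --home-dir=. --personal=$(DICT) --repl=$(REPL) --lang=en -x check $<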

Replace the $(shell find . -name "*.md") with whatever you want to use to find files to spellcheck.

When you run make splchk you will get the opportunity to check the spelling for each file found (if the file doesn't have any issues you won't deal with it). If you find a word that is correct but Aspell highlights it for repair, you can 'a'dd it which will include it in your personal dictionary.
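For a sense of what ends up in the personal dictionary, here is a minimal Python sketch of the file format: the same header line the Makefile writes, followed by one word per line. The function name and the sample words are made up for illustration; in practice Aspell maintains the file itself when you 'a'dd words.

```python
def personal_dictionary(words, lang="en", encoding="utf-8"):
    """Return the text of an Aspell personal word list (illustrative helper)."""
    # Same header the Makefile writes; the count field is left at 0 there too.
    header = f"personal_ws-1.1 {lang} 0 {encoding}"
    return "\n".join([header, *words]) + "\n"


if __name__ == "__main__":
    # Hypothetical project jargon
    print(personal_dictionary(["Kazoo", "callflow", "CPaaS"]), end="")
```

Because the result is plain text, it diffs cleanly in source control, which is what makes sharing the word list with the project practical.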

Comparing Kazoo and other CPaaS solutions


First, what is a CPaaS? Communication Platform as a Service is the expanded acronym but it boils down to APIs and programmatic control of the phone system.

The power of CPaaS is giving you control of each call in a way that makes sense for your (or your business's) needs. It lets you access information outside of the phone system (such as CRMs and other databases) to make decisions about how to route a particular call.
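As a rough illustration of that kind of routing decision, here is a Python sketch. Everything in it is hypothetical: the `CRM` dictionary stands in for an external lookup, and the action dictionaries are invented for the example rather than taken from any provider's API.

```python
# Stand-in for a CRM lookup keyed by caller ID (made-up data).
CRM = {
    "+14155550100": {"name": "Alice", "account_manager": "bob"},
}


def route_call(caller_id):
    """Pick a (hypothetical) call action based on data outside the phone system."""
    customer = CRM.get(caller_id)
    if customer is None:
        # Unknown callers go to a general menu.
        return {"action": "menu", "id": "main"}
    # Known customers ring their account manager directly.
    return {"action": "connect", "user": customer["account_manager"]}


if __name__ == "__main__":
    print(route_call("+14155550100"))
    print(route_call("+14155550199"))
```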

Another feature of CPaaS is the ability to get events out of the phone system - via webhooks or over a websocket typically. This gives you the ability to track a call's progress through the callflow, updating information in CRMs and other databases and lets you build robust applications on top of the phone network.

CPaaS providers often offer API access to manipulate running calls (such as programmatically hanging up a call, initiating a transfer, etc), fetching CDRs (call detail records - typically used when billing for calls), initiating new calls, and more.

A newer feature, a requirement of CPaaS providers these days, is having WebRTC-based phones (or a phone in your browser), minimizing the software needed to be installed and allowing companies to provide a branded phone in their web apps.

Most CPaaS providers allow you to process phone calls without being a telecom junkie by providing a simplified set of actions that may be performed on a given call. No longer do you need to know what the PSTN is or how SIP or SDP work - you get to build on top of an abstracted phone system and focus more on your business needs.

The competitive landscape

When you view the competitive landscape in telecom, it is easy to see that the big players (traditional telcos and UCaaS providers like Avaya, Broadsoft, and others) are recognizing the value of adding a CPaaS-type offering by purchasing CPaaS providers like Zang and Nexmo (via Cisco acquisition). Twilio stands apart as an independent CPaaS provider that IPO'd in 2016.

The market has shifted and CPaaS + UCaaS is the norm going forward; UCaaS companies are racing to catch up while CPaaS companies are trying to expand into more traditional telephony environments.

Technical Evaluation

Let's dive into a bit of the nitty-gritty and see how a handful of providers compare.

Real-time Call Control

Most CPaaS providers offer a small list of actions/verbs that can be returned from your program. The basics are covered by most providers.

Connecting callers to endpoints

When a caller calls in, typically you want them to talk to someone (or several someones). Below is a quick chart on the various endpoints supported by CPaaS providers for connecting the caller:

Action Kazoo Twilio Plivo Nexmo Tropo
Registered Device X X X X
SIP Endpoint
Conference X
Call Queue - X X X
User (find-me/follow-me) X X X X
Group X X X X
Page group X X X X

Some notes:

  • Kazoo is the only platform that allows devices to register directly to the platform to receive calls. Registration is the phone's way of saying "Send calls for me to this IP/port please". Devices can then move around (say a softphone on a laptop) but still be able to receive calls easily. The SIP endpoint, in contrast, is typically in a fixed location and is harder to move once set up. An existing PBX is the typical destination for SIP endpoints.
  • Twilio is the only provider that offers a "Queue" option in the public docs. Kazoo has a community-supported queuing application called ACDc as well as a commercial offering called Qubicle, which can be leveraged for call queues. You can also simulate queues with conferences.
  • The 'User' endpoint is the ability to associate multiple devices with a user and ring all relevant devices when the user is dialed. Most providers allow you to specify multiple endpoints to dial but that association must be tracked on your side. With Kazoo, you can create a User and use that in your callflow instead.
  • The 'Group' endpoint is similar to the User in that you can add devices and users to a group and all associated devices will be rung.

    The biggest differentiators among the providers in this case come down to the need for registration, insight into why a call failed to connect (some providers are quite opaque about the reason for failure), and additional features offered on each endpoint type (like setting custom ringtones).

Caller interactions

Most providers allow you to interact with the caller by playing audio files, generating dynamic text to read to the caller (TTS - text to speech), and collecting touch tones (DTMF) from the caller.

Action Kazoo Twilio Plivo Nexmo Tropo
Collect DTMF
Play Audio (MP3 or WAV)

Other basics offered

Action Kazoo Twilio Plivo Nexmo Tropo
Hangup a call X
Pause execution X
Record the caller
Redirect to different URL X
Reject a call X

Kazoo Differentiators

While CPaaS allows you to interact with phone calls at a higher level than processing SIP packets, the features above are fairly low-level as far as business concerns go. They make for great demos and getting a prototype out the door quickly.

Invariably, once you gain traction, feature requests come in and more functionality is needed. The question to ask is, do you have to do the work to implement the functionality or is it provided already?

For instance, say your prototype connects the caller, based on some fancy criteria, to an agent/sales person/fortune teller. You gain traction and now your callees (the agents/sales people/fortune tellers) would love to have a voicemail service in case they can't answer the phone (since most are using a personal mobile phone and don't want the caller to hear their personal voicemail).

Do you want to build a voicemail box?

With Kazoo, you can configure a voicemail box via API and add it to the callflow with minimal effort. With the other providers, you're looking at having to get the IVR right, get the prompts recorded, test all the paths through a voicemail box, figure out storage of the voicemail recordings, build an API for providing the voicemail messages, build voicemail->email and attach the recording, archive/delete old/deleted messages, and more. This is not quick to build and, frankly, most people don't consider voicemail a feature worth paying much, if anything, for; is it a good use of your time?

Kazoo offers a number of callflow actions, like voicemail, that allow you to add higher-level, business-focused actions to your call processing capabilities. A short list of highlights:

  • Call Parking
  • Directed / Group Call Pickup
  • Eavesdrop
  • Faxing
  • Hotdesking
  • IVRs
  • Manual presence updates
  • Time-of-day routing
  • Voicemail

Each of these features represents days to weeks of work to implement yourself (if you even can given the primitives offered by the other providers) and each has its own set of edge cases and intricacies that come with it. Instead, why not use the pre-built, heavily-tested callflow actions provided by Kazoo?

API control

The single largest difference between Kazoo's REST API and other providers is the sheer breadth of APIs exposed by Kazoo. Remember, Kazoo is both a CPaaS and UCaaS platform; on top of that, everything is API driven, so while you might start with the CPaaS-specific features, you can easily grow into UCaaS-type services (like office PBX features).

Call Control APIs

Each provider allows you to list and control active calls. The controls you have vary; some providers allow you to redirect the executing callflow to your server for new callflow instructions; others expose the actions you can overlay on the call via the API and that's it.

Similarly, conferences and conference participants can be managed through most providers' APIs.

All in all, there isn't much that differentiates the basics among providers.

Kazoo API Differentiators

  • Multi-tenancy

    Kazoo is designed to be multi-tenant; you can have as deep or as broad an account tree as you like. Most providers, like Twilio, Nexmo, and Tropo, don't provide multi-tenancy at all (though Plivo does appear to). With those providers, as a service integrator, you will have to manage that hierarchy on your side. This means you have to build infrastructure to manage billing, resource usage, configuration, etc.

  • Carrier management

    Kazoo allows you to define your own upstream carriers, so even if you're using a hosted instance of Kazoo (versus running it yourself), you should be able to add your own carriers to your account and use them instead of the system's carriers. You can even set up hybrid situations where you route certain destinations to your carriers and use the system's as the default carrier group.

    None of the other CPaaS providers offer BYOC (Bring Your Own Carrier). This means that as you grow your voice traffic, you will not have levers to pull to manage costs. With BYOC, when it makes sense to, you can find your own deals on connectivity and seamlessly switch to your own carriers to save costs.

  • Finer-grained features

    Each API entity (such as devices or conferences) in Kazoo offers significantly more functionality compared to the other providers' equivalent entities. Of course there are some similarities but, for instance looking at a "device" in Kazoo, you can optionally define:

    • Call forwarding settings, including for failover (or just straight call-forwarding)
    • Call recording (both inbound and outbound) for the device
    • Call restrictions based on number classifiers (can call US DIDs, can't call international, etc)
    • Caller ID settings
    • Dialplan settings (allow users to dial shorter numbers and "fix" them - for instance the US used to allow 7-digit dialing and the area code was assumed based on the phone's location).
    • Do Not Disturb settings
    • Codecs available (both audio and video)
    • Custom music on hold
    • Custom presence ID (instead of the SIP username)

    There are other Kazoo-specific features on devices as well. Since users can own devices, you can set similar properties on the user and have them propagate to all devices (and define properties on the devices to override user settings).

  • Class-4 and Class-5 Switching

    CPaaS providers are typically concerned with a handful of phone system features - dialing endpoints, conferences, call recording, call queue, and phone number management most commonly.

    Kazoo, being a hybrid class-4 and class-5 switch, offers so much more:

    PBX functionality includes configuring users, devices, and callflows, blacklists, directories, faxes and faxboxes, IVRs, voicemail, and more.

    Configure connectivity features like carriers (resources in Kazoo parlance), ratedecks, and PBX trunking.

    Configure reseller features like branding email notification templates and white-labeling Kazoo services.

Questions to ask a CPaaS provider

These are some good questions to ask your CPaaS provider(s) and to ask yourself how important the answers are to your needs:

  • Where are the voice servers located, geographically?
  • Where are the API servers located? Hint, some providers may have voice spread out around the world but API servers only in one geographic location.
  • Where are the servers located (the data centers) relative to your users? To your carriers?
  • How is the networking built? Some hosted infrastructure providers, like AWS/GCE, aren't well suited to VoIP traffic - are CPaaS providers piggybacking off that infrastructure or are they building out their own racks?
  • Is the CPaaS provider reselling another provider's service with better lipstick?
  • Can you extend the platform?


The market has spoken and they want UCaaS+CPaaS offered under one roof. The big UCaaS players have bought their CPaaS plays and are busy integrating them into the fold with varying success. Meanwhile Twilio remains an outlier and looks to be doubling down on being the CPaaS for the masses.

Kazoo offers you CPaaS + UCaaS from day one in a cohesive system. You can use as many or as few APIs and callflow actions as you need to accomplish your goals, while knowing that as you grow and need new functionality, Kazoo is there ready to provide it. On the off chance Kazoo isn't able to provide a feature, you can rest easy knowing the open-source nature of the project gives you the option to build it yourself and keep it private, build it and release it upstream for all to benefit, or support the project by contracting with 2600Hz to build the functionality for you.

Ready to kick the tires?

Site improvements - Headers and Certs edition

Analyzing jamesaimonetti.com

After reading Julia Evans' post on the HSTS header, I felt like I should add it to this site. This led me to Mozilla's Observatory, which analyzes your site and grades it. Like Julia, I received a fun F - the biggest dings were missing headers and an invalid certificate chain. Let's look at how I improved my site's scoring.

The initial site

My blog is a Nikola-powered site. Since the pages are fronted by Apache (for the time being) most of the changes need to be applied there (as Apache controls the response headers). Let's add some headers!

Setting the HTTP Strict Transport Security (HSTS)

This post talks more about preload and other considerations.

On the https (port 443) virtual host, it's a simple one-line addition:

Header always set Strict-Transport-Security "max-age=63072000; includeSubdomains; preload"

The max-age value is in seconds (63072000 is 2 years).
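The arithmetic behind 63072000, as a quick Python check (the helper name is just for illustration, and it assumes 365-day years):

```python
SECONDS_PER_DAY = 24 * 60 * 60


def hsts_max_age(years, days_per_year=365):
    """Seconds in the given number of (non-leap) years."""
    return years * days_per_year * SECONDS_PER_DAY


if __name__ == "__main__":
    print(hsts_max_age(2))  # 63072000
```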

Setting the Content Security Policy (CSP) header

Mozilla has great help text about the header here.

Header always set Content-Security-Policy "default-src 'none'; font-src 'self'; img-src 'self'; object-src 'none'; script-src 'self'; style-src 'self'"

I've opted to keep everything in-house-only. If I start hosting images or fonts on other sites I can add them in as exceptions.

Certificate issues when redirecting

The redirection error seemed inappropriate as I use a permanent redirect from http to https:

<VirtualHost *:80>
	ServerName jamesaimonetti.com
	# Trailing slash so the rest of the request path is appended cleanly to the target
	Redirect permanent / https://jamesaimonetti.com/
</VirtualHost>

So I went digging a bit more.

One of the ideas was to support HTTP public key pinning (HPKP). A cursory read of the Let's Encrypt discussion suggested that this might not be worth the trouble for a relatively low-value site like mine.

As it turns out, the addition of the HSTS header will likely improve this metric! Once we reload the site and rescan it with Observatory, we'll find out.

Setting the X-Content-Type-Options header

Pretty straightforward explanation and easily added:

Header always set X-Content-Type-Options "nosniff"

Setting the X-Frame-Options header

I don't want my site in an iframe (see here) so in it goes:

Header always set X-Frame-Options "DENY"

This also adds a setting to the CSP header:

Header always set Content-Security-Policy "frame-ancestors 'none'"

Setting the X-XSS-Protection header

Seems mostly for legacy browsers that don't support CSP (see here).

Header always set X-XSS-Protection "1; mode=block"

Testing the setup

Initial Errors

Once the headers were set, I reloaded the site in Apache and reloaded it in my browser. I verified the response headers were included but now I had a bunch of console errors to handle:

Content Security Policy: The page’s settings blocked the loading of a resource at self (“script-src https://jamesaimonetti.com”). Source: onfocusin attribute on DIV element.  jamesaimonetti.com
Content Security Policy: The page’s settings blocked the loading of a resource at self (“script-src https://jamesaimonetti.com”). Source: $('a.image-reference:not(.islink) img:no....  jamesaimonetti.com:637
Content Security Policy: The page’s settings blocked the loading of a resource at self (“script-src https://jamesaimonetti.com”). Source:
    fancydates....  jamesaimonetti.com:637
Content Security Policy: The page’s settings blocked the loading of a resource at self (“script-src https://jamesaimonetti.com”). Source:
  (function(i,s,o,g,r,a,m){i['GoogleAna....  jamesaimonetti.com:640

Google Analytics

Looking at the CSP FAQ about adding Google Analytics, it looks like I need to whitelist Google in the img-src portion of my CSP.

Header always set Content-Security-Policy "img-src 'self' www.google-analytics.com"

This in and of itself did not address the issue, as the inline JavaScript used to do the analytics still isn't executed. The recommended way is to load the analytics JavaScript from its own file (vs inline in the HTML page). My conf.py had a BODY_END snippet with the Google Analytics setup. I've moved that snippet to /files/assets/js/ga.js and included it in my EXTRA_HEAD_DATA config:

<script src="/assets/js/ga.js" type="text/javascript"></script>

Now I need to add google-analytics to my CSP script-src as well:

Header always set Content-Security-Policy "script-src 'self' www.google-analytics.com"
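One thing to watch: Apache's `Header set` replaces any earlier value for the same header name, so the directives shown in separate snippets in this post have to end up in a single Content-Security-Policy value. Here is a minimal Python sketch of assembling that one line, assuming the snippets are meant as fragments of a single policy (the dictionary layout and function name are my own):

```python
# Directives collected from the snippets in this post.
POLICY = {
    "default-src": ["'none'"],
    "font-src": ["'self'"],
    "img-src": ["'self'", "www.google-analytics.com"],
    "object-src": ["'none'"],
    "script-src": ["'self'", "www.google-analytics.com"],
    "style-src": ["'self'"],
    "frame-ancestors": ["'none'"],
}


def csp_header(policy):
    """Render one Apache Header line from a directive -> sources mapping."""
    value = "; ".join(
        f"{name} {' '.join(sources)}" for name, sources in policy.items()
    )
    return f'Header always set Content-Security-Policy "{value}"'


if __name__ == "__main__":
    print(csp_header(POLICY))
```

Keeping the directives in one mapping makes it harder to accidentally drop one when a new source needs to be allowed later.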


With the changes above, the scorecard improved to a solid B!

Still getting dinged with the HSTS header and certificate chain issues.

Turns out I had only included two of the three SSL-related directives in Apache. The third line below was added:

SSLCertificateKeyFile "/etc/letsencrypt/live/jamesaimonetti.com/privkey.pem"
SSLCertificateFile "/etc/letsencrypt/live/jamesaimonetti.com/cert.pem"
SSLCertificateChainFile "/etc/letsencrypt/live/jamesaimonetti.com/chain.pem"

chain.pem includes the intermediate certificates needed to complete the chain. See SSL Labs' SSLTest to find out more about your site.

Forward secrecy

For funzies, adding forward secrecy to Apache to bump those scores up (yay gamification!).

Disable RC4

RC4 is broken; don't offer it!

I added a ssl.conf in conf.d with this:

SSLProtocol all -SSLv2 -SSLv3
SSLHonorCipherOrder on

Org-Babel and Erlang Escripts

I recently wrote an escript for work that used the wonderful getopt application for processing command-line arguments. Part of that script includes printing the script usage if passed the `-h` or `--help` flag.

As part of my initiative to use orgmode's literate programming capabilities, I wrote a `README.org` for the script that described how to install pre-requisites, what the script does, etc. I also wanted to include the usage text as part of the doc, since part of what I've been doing was adding more flags to tune the operations of the script.

The tricky thing is that the real usage text and what is in the doc frequently drift apart as development occurs. Wouldn't it be great to have the README.org stay in sync with what the current usage flags are?

Org-babel SRC blocks to the rescue!

With the appropriate SRC block, you can capture the results as part of the doc when exporting (or just capture the results in place) to a new format.

For my use case, I used a simple shell SRC block (with appropriate headers) to run the script with the appropriate flag:

./my_escript -h

Evaluating the SRC block with `C-c C-c` creates the attendant `#+RESULTS:` section below. Except…it was empty. I couldn't figure out why the script's output wasn't showing up. Simple shell commands like `pwd` showed up just fine. What do?


Ah yes; it turns out that, by default, getopt prints usage to stderr instead of stdout. A quick fix was to call getopt:usage/3 with `standard_io` as the third argument, and the results showed up as expected.

I'm sure there are some other tricks to getting org-babel to get stderr output into the results but from my cursory searching it looked a bit more convoluted and not worth the time when I could fix it with such a small change.
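The empty result is easy to reproduce outside of org-babel. Here is a small Python sketch, using a made-up child process in place of the escript, showing that a capture of stdout sees nothing when the program writes its usage text to stderr:

```python
import subprocess
import sys

# A stand-in for the escript: a child process that prints its usage to stderr.
CHILD = "import sys; sys.stderr.write('usage: my_escript [-h]\\n')"


def run_child():
    """Run the stand-in script and capture stdout and stderr separately."""
    return subprocess.run(
        [sys.executable, "-c", CHILD],
        capture_output=True,
        text=True,
        check=True,
    )


if __name__ == "__main__":
    result = run_child()
    print(repr(result.stdout))  # nothing arrived on stdout
    print(repr(result.stderr))  # the usage text went to stderr
```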

Formatting tangled output in org-mode

I have been creating training materials to help people get comfortable with Kazoo. Part of those training materials include Erlang source code, JSON schemas, CouchDB design documents, and more. I have been using a literate programming style to create these materials, which include source blocks for the Erlang or JSON pieces. However, when "tangling" (extracting) the source blocks into their own files, the formatting (especially for the JSON) was terrible.

I immediately saw that ob-tangle.el has a hook, org-babel-post-tangle-hook, that executes after the tangling has occurred. What I didn't really understand, and took many false starts to figure out, was how this hook executed and in what context.

What I learned about org-babel-post-tangle-hook

After many false starts and brute-force attempts to make it work (so many lambdas), I took a day away from the issue and came back with both fresh eyes and a beginner's heart.

Here's the process of tangling, as far as I have gleaned thus far:

  1. Run `org-babel-tangle` (`C-c C-v t` for most)
  2. The source blocks are tangled into the configured file(s) and saved
  3. When there are functions added to `org-babel-post-tangle-hook`, a buffer is opened with the contents of the file and the functions are mapped over.
  4. Once the functions have finished, the buffer is killed.

Read the source, Luke

Looking at the source:

(when org-babel-post-tangle-hook
  (mapc
   (lambda (file)
     (org-babel-with-temp-filebuffer file
       (run-hooks 'org-babel-post-tangle-hook)))
   (mapcar #'car path-collector)))

First, we see that nothing will be done if org-babel-post-tangle-hook is nil. `(mapcar #'car path-collector)` takes the `path-collector` list and applies `#'car` to each element (I'm not sure what `#'` before `car` is for). The resulting list will be used by `mapc` which we can read is like `lists:foreach/2` in Erlang - applying a function to each element in the list for its side-effects. That anonymous function (lambda) takes the element from the list (the filename I believe) and calls `org-babel-with-temp-filebuffer` with that and an expression for running the hooks.

Summarizing, if there are hooks to be run, call a function for each file that was tangled.

So what does `org-babel-with-temp-filebuffer` do? From the lovely Help for the function: "Open FILE into a temporary buffer execute BODY there like ‘progn’, then kill the FILE buffer returning the result of evaluating BODY."

(defmacro org-babel-with-temp-filebuffer (file &rest body)
  "Open FILE into a temporary buffer execute BODY there like
`progn', then kill the FILE buffer returning the result of
evaluating BODY."
  (declare (indent 1))
  (let ((temp-path (make-symbol "temp-path"))
	(temp-result (make-symbol "temp-result"))
	(temp-file (make-symbol "temp-file"))
	(visited-p (make-symbol "visited-p")))
    `(let* ((,temp-path ,file)
	    (,visited-p (get-file-buffer ,temp-path))
	    ,temp-result ,temp-file)
       (org-babel-find-file-noselect-refresh ,temp-path)
       (setf ,temp-file (get-file-buffer ,temp-path))
       (with-current-buffer ,temp-file
	 (setf ,temp-result (progn ,@body)))
       (unless ,visited-p (kill-buffer ,temp-file))
       ,temp-result)))

Here we get deeper into Emacs Lisp than I really know, so I'll shoot a bit in the dark (dusk perhaps) about the functionality. `file` is our tangled file from the lambda and `body` is the `run-hooks` expression (not yet evaluated!). Basically, this code loads the contents of `file` into a buffer and folds `body` over that buffer. When finished, it kills the buffer.

Killed, you say?

Yes! So any formatting you may have applied to that temporary buffer is lost.
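To make the consequence concrete, here is a toy model in Python. It is not how Emacs implements the macro; the "buffer" is just a dictionary and both hook functions are invented for illustration. It shows that a hook which only changes the in-memory text leaves the file untouched, while a hook that also saves before returning does persist the change.

```python
import tempfile
from pathlib import Path


def with_temp_filebuffer(path, body):
    """Toy model: load PATH into a 'buffer', run BODY on it, discard the buffer."""
    buffer = {"path": Path(path), "text": Path(path).read_text()}
    result = body(buffer)
    # The buffer goes out of scope here, like the killed Emacs buffer.
    return result


def format_only(buffer):
    # Changes the in-memory text but never writes it back.
    buffer["text"] = buffer["text"].strip() + "\n"


def format_and_save(buffer):
    # Same formatting, then persist it before the buffer is discarded.
    format_only(buffer)
    buffer["path"].write_text(buffer["text"])


def demo():
    """Return the file contents after each hook has run."""
    with tempfile.TemporaryDirectory() as tmp:
        tangled = Path(tmp) / "tangled.json"
        tangled.write_text("  {}  \n\n")
        with_temp_filebuffer(tangled, format_only)
        after_format_only = tangled.read_text()
        with_temp_filebuffer(tangled, format_and_save)
        after_format_and_save = tangled.read_text()
    return after_format_only, after_format_and_save


if __name__ == "__main__":
    print(demo())
```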

What to do?

We need a way to format the buffer, in a mode-aware way, and have that persist to the tangled file before the buffer is killed at the end of processing the hook.

The blessing and curse of Emacs is that all is available if you have the time and inclination! :)

The current implementation

A couple pieces of the puzzle, arrived at independently, put me on the right path.

First, with all of my Kazoo work, I wanted to ensure that the source files were all properly indented according to our formatting tool (which, of course, uses erlang-mode's formatting!). Using a couple hooks to accomplish this gave me:

;; Indent the whole buffer before every save, but only in erlang-mode buffers
(add-hook 'erlang-mode-hook
          (lambda ()
            (add-hook 'before-save-hook 'erlang-indent-current-buffer nil 'make-it-local)))

So now I have an erlang-mode specific `before-save-hook` action to indent the buffer prior to saving the buffer to the file.

I can, in turn, apply a similar hook into the js-mode (or js2-mode or json-mode) and use `json-pretty-print-buffer-ordered` to format the buffer. As long as I have a function that formats the current buffer properly, I can create the `before-save-hook` to ensure the buffer has formatting applied prior to saving.

The final piece was to figure out how to tie all this into `org-babel-post-tangle-hook`:

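(defun my/run-before-save-hooks ()
  ;; Run the mode-specific formatting hooks, then write the buffer to the tangled file
  (run-hooks 'before-save-hook)
  (save-buffer))

(add-hook 'org-babel-post-tangle-hook 'my/run-before-save-hooks)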

What I finally came to was that, now that I had hooks available before saving for each major-mode I was interested in, and the buffer opened after tangling had the associated major-mode applied, all I needed was a way to run the `before-save-hook` hook and save the buffer before ceding control back to the `org-babel-post-tangle-hook` and `org-babel-with-temp-filebuffer` progn.

I am pretty happy with the result, as so far I haven't encountered any glaring issues or performance problems. I hope others will find this useful and will offer feedback on better, more idiomatic ways to accomplish the task.

Unexpected wins

For whatever reason, when creating my .emacs and attendant customizations, I was mostly copy/pasting snippets I found (chalk it up to those dark PHP days). I didn't really take the time to grok what was happening, and most were `setq` or `define-key` anyway, so pretty simplistic. I haven't worked with Lisp in any real capacity since college (over a decade ago now - yikes) but with the last 8 or so years immersed in Erlang, I was familiar and comfortable with functional programming.

Having to dig into these functions because no one had written an easy guide to copy/paste was just the kick I needed to realize that, holy cow, I really did kind of understand what's going on here! There was a specific moment where I was ruminating on the code running the post-tangle hook, and reading the `mapc` description "Apply FUNCTION to each element of SEQUENCE for side effects only." clicked in my head as "Hey, that's lists:foreach/2"!

Navigating Emacs' help system, specifically 'describe function' (`C-h f`), has also been a boon, and I am more than grateful to the developers who not only wrote the docs but had the foresight to make them so easily accessible in Emacs. It has given me the genesis of an idea for a non-trivial Elisp project I want to attempt.


I haven't had a challenge like this in quite some time. Granted, on the surface it is a pretty silly, simple thing to have accomplished. On its own merit, probably not worth the time I spent. But the confidence gained in at least reading and comprehending Elisp code a little more fully, and the spark of an idea to try out, should more than compensate for that time. Now I just need to follow through and hopefully contribute a little back to the Emacs community.

The Great Migration of 2016


It has been a long time since I've blogged for fun. A lot has changed and a lot has remained.

It is my goal to start writing more about what I'm up to, as much for an archive for my kids, family, and friends to read as it is to just flex the writing muscles. Most posts will continue on the nerdy theme as relates to computers and programming. However, I do plan to write up summaries about activities that involve the family and friends, for when nostalgia or curiosity about a time in life comes up.

Initially, though, I will start to write up a series of posts about how I'm consolidating as much as possible of my digital life into servers and services that I run, and using Emacs to interact with those services as much as possible.

Blog migration

I was a happy Wordpress user and developer when blogging first became a "thing". Cranking out plugins helped pay the bills out of college and blogging about technical things is ostensibly why I got a job at 2600Hz in 2010.

There are certainly ways to interact with Wordpress installations via Emacs but in the end, I wasn't happy with them and wanted something more streamlined. Through some series of events, I came across Nikola and appreciated the minimalist nature of the default installation, the ease with which I migrated existing posts from Wordpress, and the ability to manage posts using Emacs' org-mode, which I've made a conscious effort to learn this year as well.

Git migration

Part of the appeal is that I can now put my posts, as they're static files, into version control. I've set up Gogs on my server and am transitioning my personal repos to it (and off of GitHub). I also have the full power of the command line (grep, awk, sed, etc) to work with my blog's corpus.

Going forward

I'm excited by the prospects of these (and other) changes. My goal has been to reduce the applications I use with regularity to two: Emacs and a browser. The more I can accomplish in Emacs, the less friction there is to me getting things done, which is part of why I'm so excited. Emacs is a tool that has gotten out of my way to the point that I don't even think about most keybindings I use. Emacs has become a natural extension of my thought process, and as long as my fingers can keep up with my mind, there's no impedance from my editor.

Hopefully this is the restart of my blog; no excuses aside from laziness now!

Using ibrowse to POST form data

It is not immediately obvious how to use ibrowse to send an HTTP POST request with form data (perhaps to simulate a web form post). Turns out it's pretty simple:

    ibrowse:send_req(URI, [{"Content-Type", "application/x-www-form-urlencoded"}], post, FormData)

Where URI is where you want to send the request ("http://some.server.com/path/to/somewhere.php") and FormData is an iolist() of URL-encoded values ("foo=bar&fizz=buzz"). There's obviously a lot more that can be done, but for a quick snippet, this is pretty sweet.
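The FormData value has to be URL-encoded before it is handed to ibrowse. As an illustration of what that encoding produces (in Python rather than Erlang, purely as a sketch; `form_body` is a made-up helper):

```python
from urllib.parse import urlencode


def form_body(fields):
    """URL-encode a mapping of form fields into a request body string."""
    return urlencode(fields)


if __name__ == "__main__":
    print(form_body({"foo": "bar", "fizz": "buzz"}))  # foo=bar&fizz=buzz
    print(form_body({"q": "a b&c"}))                  # q=a+b%26c
```

Note how the space and the ampersand inside a value are escaped, so they can't be confused with the separators between fields.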

Emulating Webmachine's {halt, StatusCode} in Cowboy

At 2600Hz, we recently converted our REST webserver from Mochiweb/Webmachine to Cowboy, with cowboy_http_rest giving us a comparable API to process our REST requests with. One feature that was missing, however, was an equivalent to Webmachine's {halt, StatusCode} return. While there has been chatter about adding this to cowboy_http_rest, we've got a function that emulates the behaviour pretty well (this is cleaned up a bit from our actual function, removing project-specific details).

    -spec halt/4 :: (#http_req{}, integer(), iolist(), #state{}) -> {'halt', #http_req{}, #state{}}.
    halt(Req0, StatusCode, RespContent, State) ->
        %% Set the body, send the reply, then tell cowboy_http_rest to stop processing
        {ok, Req1} = cowboy_http_req:set_resp_body(RespContent, Req0),
        {ok, Req2} = cowboy_http_req:reply(StatusCode, Req1),
        {halt, Req2, State}.

Obviously you can omit setting the response body if you don't plan to return one.

CouchDB/BigCouch Bulk Insert/Update

While writing a bulk importer for Crossbar, I took a look at squeezing some performance out of BigCouch for the actual inserting of documents into the database. My first time running all the documents into BigCouch at the same time resulted in some poor performance, so I went digging around for some ideas on how to improve the insertions. Reading up on the High Performance Guide for CouchDB (which BigCouch is API-compliant with), I started to play with chunking my inserts up to get better overall execution time. Note: the following are very unscientific results, but I think are fairly instructive for what one might expect.

Docs Per Insertion | Elapsed Time (ms)
26618              | 107176
1000               | 8325
1500               | 5679
2000               | 3087
2500               | 1644

Based on the CouchDB guide, I decided to not pursue this further, as dropping insertion time 2 orders of magnitude was fine enough for me! I may have to bake this into the platform natively.

For those interested in the Erlang code, it is pretty simple. Taking a list of documents to save, use lists:split/2 to try to split the list. If the split fails, we know the list is shorter than our threshold, and we can save the remaining list to BigCouch. Otherwise, lists:split/2 chunks our list into one part for saving and one for recursing back into the function. Since we don't really care about the results of couch_mgr:save_docs/2, we could put the calls in the second clause of the case in a spawn to speed this up (relative to the calling process).

    -spec save_bulk_rates/1 :: (wh_json:json_objects()) -> no_return().
    save_bulk_rates(Rates) ->
        case catch(lists:split(?MAX_BULK_INSERT, Rates)) of
            {'EXIT', _} ->
                couch_mgr:save_docs(?WH_RATES_DB, Rates);
            {Save, Cont} ->
                couch_mgr:save_docs(?WH_RATES_DB, Save),
                save_bulk_rates(Cont)
        end.
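The same chunking idea, sketched in Python for readers who don't speak Erlang. This is an illustration of the technique rather than the Crossbar code: `save_docs` is a stand-in callable for the database call, and the default batch size simply mirrors the best row in the table above.

```python
def chunks(docs, size):
    """Yield successive slices of at most `size` documents."""
    for start in range(0, len(docs), size):
        yield docs[start:start + size]


def save_bulk(docs, save_docs, max_bulk_insert=2500):
    """Send docs to `save_docs` in batches; returns the number of batches."""
    count = 0
    for batch in chunks(docs, max_bulk_insert):
        save_docs(batch)
        count += 1
    return count


if __name__ == "__main__":
    saved = []
    # `saved.append` stands in for the real database call.
    print(save_bulk(list(range(26618)), saved.append))  # 11 batches
```

With 26618 documents and batches of 2500, that works out to ten full batches plus a final one of 1618.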

Life Update

Updated the blog to run 3.3.1 - lot of cobwebs around these parts. Hopefully I can be more proactive in blogging about things going on at work, and perhaps starting to write about what I'm up to personally (not that I have much of that right now). Maybe my Google stats will jump over the 0.3 hits I average! Dare to dream!