Archive for the ‘Research’ Category

Writing Excel Spreadsheets Using PHP

Thursday, July 24th, 2008

When using the Spreadsheet_Excel_Writer library from the PEAR repository, I came across an issue I didn’t see handled in the docs (as of this writing, I am using Spreadsheet_Excel_Writer 0.9.1 beta)

My application creates spreadsheets that contain order information. Part of each row is a list of up to 20 ISBNs and the quantities desired of each. The issue came in how to handle ISBNs that had a leading zero. When I first looked through the PEAR docs for the library, a Worksheet method, writeString, looked to be the solution. However, the end result was that while the leading zero was maintained, the cell’s format was still numeric. This resulted in the application receiving the generated xls to then drop the zero, resulting in an invalid ISBN.

Looking over the internals of the Worksheet::writeString method didn’t reveal an undocumented feature that would ensure a cell was read as text, regardless of its contents. I next looked at the Format::setNumFormat method as I knew it contained ways to format the number as currency, timestamp, fractions, etc. You could then pass this Format object as the optional fourth parameter to the Worksheet::write method.

Contained in the Format::setNumFormat docs was a link to the OpenOffice.org documentation of the Excel File Format (found here, pdf). Interested in how exactly the file was structured, I read on. What I learned that was directly applicable is that each cell contains a pointer to a format definition, or XF record, and it was this XF record where formatting was stored. From the doc, section 4.6:

All cell formatting attributes are stored in XF records…The cell records themselves contain an index into the XF record list. This way of string cell formatting saves memory and decreases the file size.

So if two cells use the same formatting, like the ISBN columns would, each cell would contain a pointer to the XF record that would tell Excel the cell was text. Seciton 4.6.1 lists the 6 groups of formatting attributes, the first of which is number format, which is then an index to a FORMAT record. Okay, we’re on to something here. Further in the pdf, in section 5.49, we see the definition of the FORMAT record. Lo and behold, the table of formats from the setNumFormat page is listed in the pdf, but we see that the PEAR listing is incomplete. Scanning the complete table in the pdf, we see index 49, type Text, format string ‘@’. Bingo.

Our code for formatting numeric data as text in a string goes a little something like this (modified from the PEAR example code):

$workbook = new Spreadsheet_Excel_Writer();
$worksheet =& $workbook->addWorksheet();
 
// We'll show dates with a three letter month and four digit year format
$text_format =& $workbook->addFormat();
$text_format->setNumFormat('@');
 
$worksheet->write(0, 0, "Without formatting");
$worksheet->write(0, 1, '0123'); // cell contains 123
 
$worksheet->write(1, 0, "With formatting");
$worksheet->write(1, 1, '0123', $text_format); // cell contains 0123

To verify, generate the xls and open it. Right click the cells to modify the format of the cell, and see that the first cell is formatted as a general number, and the second cell is formatted as text.

The meta-moral is to read the docs and follow references to get at the source material. Had I not opened the pdf, it may have been a few more time units finding the information on Google. Plus, I learned a lot more about an important file format. I can sleep easy knowing I’m that much more knowledgeable.

Announcements

Monday, April 28th, 2008

Thought I’d go ahead and announce, mainly to myself, that I will be working through SICP. The rub…doing it in Javascript. Seems as though most other languages are covered (I know Erlang is taken) and since I am doing an increasingly large amount of Javascript, coupled with the eventual prevalence of server-side Javascript, I figured it best to start getting intimate. What I like about this task is that since SICP has been so widely covered on the web, I have many resources to aid in better understanding the material (and it is some thick material). Anyway, I’ve begun chapter one and will post the chapters, as well as excerpts I find interesting, in no pre-defined timeframe.

Oh yeah, and I’m engaged.

Michael Scheuer and Ron Paul

Friday, February 22nd, 2008

There may not be a better authority on Osama Bin Laden than Michael Scheuer. Intimately involved with the tracking down of bin Laden from at least 1996 to 2004, Scheuer has also written several books dealing with the handling of the Middle East by our government over the last 40 years or so. So when this man speaks on the Middle East and bin Laden, he speaks with authority that is unrivaled by most anyone else.

With that in mind, have a listen to an interview he gave recently (video was posted February 19th, 2008). Two things, among the many, that I found enlightening: first, Scheuer makes it clear that bin Laden and al Queda do not hate us for our freedoms, liberties, etc, as Bush, Guiliani, et al, have tried to ram into the conversation. It is about our involvement in the Middle East that bin Laden fights against. Second, of all the presidential candidates, Scheuer believes Ron Paul “gets it”.

Having a girlfriend invested in the green movement, I’d be remiss to comment on another bit of Scheuer’s interview and I’ll paraphrase my understanding of his remarks. When you look at the Middle East, what is there that is of interest to the US? Oil, obviously. Hmm… umm… gee… well… Not much else! So if our government were to get serious about investing in green energy and not using it to garner support while not producing results, we’d lose our interest in the Middle East, at least from a policy perspective. So using less oil = stopping the support of terrorism!

Here’s my beef for the current crop of Bush clones: they preach Christianity, family values, and all that. Fine, all well and good. However, it seems not a week goes by that a Bush supporter in the Congress doesn’t go down in flames with a scandal. But most important of Jesus’ teachings was the Golden Rule: Do unto others as you would have them do unto you. Take a moment and think back to how mad, upset, fearful, irate, how incensed with anger you were on September 11th, 2001. Planes were high-jacked and rammed into the World Trade Center. Now imagine that level of violence occurring everyday. Now double it, and double that. You’re starting to get an approximation of what the average Iraqi goes through each day. Imagine having to cope with all of those emotions everyday. Imagine, upon waking, not knowing whether you’d live or die in some suicide blast or gun fight. A terrible world for anyone. And yet our government is subjecting a country to that everyday with their occupation.

I implore you, regardless of political affiliation, to investigate Michael Scheuer’s writings and interviews and see an expert talk about the situation. Then compare that to the view given by your favorite candidate.

More important than any other issue today is our foreign policy, including the economy. Our next president must understand what’s at stake with respect to the rest of the world, not with respect to a small part of it (namely the European Union countries).

The slim silver lining here is two-fold. Wean our economy and way of life off oil and we lose interest in the Middle East, and, support Ron Paul and bring a man into office that is knowledgeable in both foreign policy and the economy.

Join the REVOLUTION

Fibonacci stream

Monday, February 4th, 2008

The Fibonacci sequence: teacher of recursion for so many Computer Science students. But can it also teach us about streams? Yes! The function to build a stream (in Javascript):

var fib_stream_maker = function() {
  return (function() {
    var a = 0;
    var b = 0;
    var c = 1;
 
    return function() {
      a = b;
      b = c;
      c = a + b;
      return c;
    }
  })();
}

Let’s break this down so Ben can follow along. First, the inner most function:

return function() {
  a = b;
  b = c;
  c = a + b;
  return c;
}

This anonymous function returns the next number in the sequence. Looking one level of scope higher, we see the variables a, b, and c declared:

function() {
  var a = 0;
  var b = 0;
  var c = 1;
 
  return function() {
    a = b;
    b = c;
    c = a + b;
    return c;
  }
}

So we initialize the variables and then return a function that increments those variables, local to the inner function’s scope, so we can have multiple instances of the function running without collision of variables. To build the stream factory, we wrap the initialization function in parentheses, creating a continuation, and then execute the wrapped function, creating an initialized fib stream function. This is then set to return when the fib_stream_maker is called as a function.

var fib_stream = fib_stream_maker();
 
fib_stream(); // returns 1
fib_stream(); // returns 2
fib_stream(); // returns 3
fib_stream(); // returns 5
fib_stream(); // returns 8, etc...

This is only touching the surface of streams but I thought it was pretty cool that the Fibonacci sequence can be utilized to teach such important concepts as continuations and streams. Code on!

*Update*

While explaining the inner workings of the above code to CD, she asked a very astute question, one that Ben probably would not have, “What about going backwards in the stream?” A just question. Let’s look at a table of how the values of a, b, and c act as the stream goes forward first (this will help us design our algorithm later).

  a b c Returns
Creating fib_stream 0 0 1  
fib_stream(); 0 1 1 1
fib_stream(); 1 1 2 2
fib_stream(); 1 2 3 3
fib_stream(); 2 3 5 5

So to go backwards, what do we need to do to values of a, b, and c? We need to assign b’s value to c, a’s value to b, and the difference of c and b to a, and still return c. In code, this would look like:

c = b;
b = a;
a = c - b;
return c;

To integrate this into the above example, we need to pass a boolean parameter to the inner-most lamda function and test to see whether to forward or reverse the stream. Let’s see the inner-lamda now:

return function(go_forward) {
  if ( go_forward ) {
    a = b;
    b = c;
    c = a + b;
  } else {
    c = b;
    b = a;
    a = c - b;
  }
 
  return c;
}

We may want to put an additional test to see if we are at the beginning of the stream again:

return function(go_forward) {
  if ( go_forward ) {
    a = b;
    b = c;
    c = a + b;
  } else if ( a == 0 && b == 0 ) {
    ; // do nothing
  } else {
    c = b;
    b = a;
    a = c - b;
  }
 
  return c;
}

Let’s see it all as one big fun function:

var fib_stream_maker = function() {
  return (function() {
    var a = 0;
    var b = 0;
    var c = 1;
 
    return function(go_forward) {
      if ( go_forward ) {
        a = b;
        b = c;
        c = a + b;
      } else if ( a == 0 && b == 0 ) {
        ; // do nothing
      } else {
        c = b;
        b = a;
        a = c - b;
      }
      return c;
    }
  })();
}

And there you have it. Thanks CD for bringing up this interesting idea! What a nerd!

Save $150 Billion Much?

Friday, February 1st, 2008

It’s a sad state of affairs that we as voters must wade through so much posturing and pandering by the presidential candidates. Particularly hard is when we find out after they are elected that their words were vaporous and meaningless. What’s a voter to do?

A study released January 29th of this year by the National Taxpayers Union Foundation helps tax payers cut through the facade and see the real truth, expressed in hard numbers. Keep in mind that the United States has trillions of dollars ($9,200,874,834,693.13) in debt, is spending close to $1 trillion a year on Iraq and Afghanistan, and is facing the housing market implosion. Now, with all this economic unrest, which candidates actually want to do something about it and save some dough?

The four respective frontrunners in the two parties (John McCain, Mitt Romney, Hillary Clinton, and Barack Obama), proposed overall fiscal policy agendas whose net effect would raise annual federal outlays between $6.9 billion and $287.0 billion.

The top-tier GOP candidates often portrayed as “conservative” (Mitt Romney and Mike Huckabee) actually called for significantly larger spending hikes ($19.5 billion and $54.2 billion, respectively), than the so-called “moderate conservative” (John McCain, $6.9 billion).

Among Democrats, Barack Obama, often described as ideologically more “moderate” than Hillary Clinton, actually has the larger agenda of the two ($287.0 billion vs. $218.2 billion).

Defense-related spending items received the highest proposed spending increases among Republican candidates. Huckabee and Romney, for example, offered $67.2 billion and $40.6 billion, respectively. Among Democrats, Clinton’s biggest boost goes toward health care ($113.6 billion) and Obama’s for economy, transportation, and infrastructure ($105.0 billion).

Two of the eight candidates proposed sufficient spending cuts that more than offset their new spending plans: Rudy Giuliani (-$1.4 billion) and Ron Paul (-$150.1 billion).

Taking into account that Rudy Giuliani has dropped out, this makes Ron Paul the only candidate that actually wants to save the country some money, and he wants to do it on a large scale. Why are we pouring in so much money in taxes, to support a government that doesn’t represent us? Why would you support a candidate that wants that trend to continue? Where is this money going to come from?

Would you rather not pay income taxes? I know I would. What if you got to keep an extra third of your income? Don’t you think that’s a bigger chunk than a proposed $300 tax rebate Congress is now considering to “stimulate” the economy? I know it is significantly more for me. I would rather spend the money that I earned than some small rebate the government gives me. We don’t know the source of that rebate either. The administration is currently engaged in deficit spending, racking up more debt to pass on to future generations. How long until China calls in those debts?

It’s time to make the government stop spending, and the only way to do that is by electing Ron Paul, the only candidate left that is 100% for real about cutting spending and saving US citizens their hard-earned income. Read his views, compare them with an open mind to the other candidates, and you’ll see Dr. Paul is truly about getting our country back on track.

Join the REVOLUTION

MySQL -> CSV

Wednesday, January 30th, 2008

Always on the lookout for increases in efficiency, I love when I find a slick snippet of command line goodness that makes a hard sounding task simple and quick. I was tasked with creating an email list from a database and putting it into a csv format, and had only the command line to interface with the DB. My first attempt revolved around using the SELECT … INTO OUTFILE syntax. Unfortunately, I was unable to write out to a file with the DB user I had access to. What’s a fella to do?

Unix pipes to the rescue.

First, the whole command:

echo “select * from example;” | mysql -u user -p dbname | tr ‘^V^I’ ‘,’ > filename.csv

Let’s break this down, in case Ben is reading and can’t follow along. The echo statement contains your query. It is sent to the mysql command, which connects you to the database and executes the query, returning the data in tab-delimited format to the console. The tr command reads from STDIN, and replaces tabs (Ctrl-V Ctrl-I) with whatever delimiter you want (in this case the comma). The final touch is sending it to a file of your choosing.

*Note* - You actually have to type the Ctrl-V Ctrl-I when entering this command. Copy/paste won’t cut it in the example above.

*Note* - You typically do not want to actually enter your mysql password on the command line, as commands run are typically logged. Omit the password to force mysql to ask for it (it won’t interfere with the query). And if you don’t have your mysql access password protected, WTF? You’re asking for trouble.

So there you have it. Simple, easy to follow, effective. As always, this example can be extended into a variety of different ways. It’s up to you to figure it out (you can, of course, pay me to figure it out).

August 27th Emails

Tuesday, August 14th, 2007

I just received an email about August 27th and the proximity of Mars to the Earth. The email claims that Mars will be close to 34 million miles from Earth, the closest it will get for another 2000 years, a distance which will cause it to appear as bright as the moon in the night sky. While the claim is valid, it is what the email leaves out that makes it annoying: the year. This event occurred in 2003, and while Mars was the 4th brightest object in the sky (behind the sun, moon, and Venus), it was still just a small point of light.

You can read more at NASA’s page concerning the hoax. It is dated from 2005, but explains the email hoax (which has been circulating since 2004) and what is true and misleading about it. So if you get the email, please don’t forward it on. It’s so 2003…

Portland Gets Greener

Sunday, July 29th, 2007

I admit, I was hesitant to believe I was actually going to relieve myself of my car when I moved to Portland. St. Louis definitely requires the car to get to almost anywhere I wanted to go. Plus when I need to haul stuff…you can only carry so much on your back and in your arms.

So it was with great relief that I found Flexcar via EcoGeek. From Flexcar’s site:

Welcome to a new era in personal transportation. It’s called car-sharing and it’s incredibly easy! You share access to hundreds of Flexcar vehicles, often within a five-minute walk of your home or work. You reserve a car online or by phone; you drive - to a meeting, to run errands, or to hit the lumber yard; and you return to the car’s designated parking space, all for one hourly rate that covers gas, premium insurance and 150 free miles. All you pay for is the drive. How simple is that? Plus, Flexcar is convenient, affordable, reliable, and great for the planet. Join the transportation revolution that washingtonpost.com called the “wave of the future.”

I looked at their monthly savings calculator and found the numbers quite pleasant:
My average cost of ownership is about $550 per month, mostly depending on the number of miles (which affects gallons of gas purchased and oil changes needed) I drive per month.

The cost of Flexcar, based on the number of hours per month, broke down like this:
5 hours per/month 45.00
10 hours per/month 85.00
25 hours per/month 200.00
50 hours per/month 375.00

So even at a high volume clip, I’m still saving $200-$300 a month. Plus, if I reduce my driving to almost none (which is my intention), that’s $500 or so a month I get to keep in my account. Granted, there will be other costs, like bus passes and the like, but by and large, I’ll be saving $6,000 a year, and that’s nothing to scoff at.

Eat local

Friday, July 6th, 2007

It’s no secret that to the best fresh fruit and veggies are the ones that travel the least to get to your kitchen. That’s why it is so important to spend your time and money at your area farmer’s markets. The benefits, short and long term, far outweigh the negatives. Sure, I can get corn 4/$1 at my local supermarket, but its pretty blah on its own and needs help. Conversely, locally grown corn purchased at the farmer’s market near me is delicious without any help from seasonings.

With that in mind, it is sometimes hard to know where these purveyors of fresh, local produce can be found. Fortunately there a plenty of websites that can get you started on your search. Two that I have found recently and like the cut of their gib include:

These only open the door to the possibilities near you. One thing to remember…it’s perfectly okay to eat seasonal food in season and leave it when the item is out of season. Abstaining from out-of-season foods will liven up your menu by exposing you to a wider array of choices and who doesn’t like variety? And sure, this all takes effort, but I think this is effort well spent. Enjoy your local markets!

cribbage.erl

Monday, May 28th, 2007

Coupling Cribbage and Erlang into a program sounds like a fun little program to write to aid in learning Erlang while writing a program that brings a game I like in life to the virtual world. Is it the most efficient? Probably not, but you gotta start somewhere. To the code!

The first thing I wanted to do was create a method to calculate points. An ace is a 1, 2-10 are face value, and Jack, Queen, King are 11, 12, 13, respectively. Easy adjustments could be made to allow characters (A, J, Q, K) but for now, I like keeping it simple.

-module(cribbage).
 
-export([points/1]).
 
points([]) -> 0
points(L) ->
    Hand = lists:sort(L),
    fifteens(Hand, 0) + runs(Hand, 0, 0) + pairs(Hand, 1, 0).

The above creates a module called cribbage and exports a function called points/1 which takes one parameter, a list of cards. There are three kinds of scoring in Cribbage: combinations of cards that equal 15, runs of three or more, and pairs (or sets or four of a kind). There is one other kind, but it’s not part of this portion of the game.

cardval(C) when C > 9 -> 10
cardval(C) -> C.
 
fifteens(_L, Total) when Total > 15 -> 0
fifteens(_Hand, Total) when Total =:= 15 -> 2
fifteens([], _Total) -> 0
fifteens([H  T], Total) when Total < 15 ->
    fifteens(T, Total) + fifteens(T, Total + cardval(H)).

cardval is a function that converts the value of face cards (11-J,12-Q,13-K) to 10 and leaves other cards unchanged in value. This is useful in finding all combinations of 15 in the hand. When a combo equals 15, two points are added to the score.

Runs were the trickiest of the three to get right. First, I defined a simple function to determine the points for a run of given length.

run(3) -> 3
run(4) -> 6
run(5) -> 12
run(_Length) -> 0.

Some people may play with different values for runs of different lengths, so this allows for easy editing.

Runs come in two flavors: 1) A normal run, and 2) A run where one or two of the cards are doubled. To account for this, I have runs/3 and runs/4. runs/3 handles the first case, and passes control to runs/4 when a run of the second case is encountered. Another special case is when a run has two different cards doubled (e.g. 3,4,4,5,5) where the run of three is doubled and doubled again.

%% two cases for runs
%%   1. A straight run - 4,5,6,7
%%   2. A run with a double in the sequence - 4,4,5,6 or 4,5,5,6
runs([], _Curr, Len) -> run(Len)
runs([H  T], Curr, Len) when H =:= (Curr+1) -> runs(T, H, Len + 1)
runs([H  T], Curr, Len) when H =:= Curr -> runs(T, Curr, Len, {H, 2})
runs([H  T], _Curr, Len) -> run(Len) + runs(T, H, 1).
 
runs([], _Curr, Len, {_Card, Mult}) -> Mult * run(Len)
runs([H  T], Curr, Len, {Card, Mult}) when H =:= (Curr+1) ->
    runs(T, H, (Len+1), {Card, Mult})
%% needed for special cases where multiple cards are doubled up
%% like 3,4,4,5,5
runs([H  T], Curr, Len, {Card, Mult}) when H =:= Curr, H > Card -> 
    runs(T, Curr, Len, {H, (Mult*2)})
%% handles a triple carding, like 2,2,2,3,4
runs([H  T], Curr, Len, {Card, Mult}) when H =:= Curr ->
    runs(T, Curr, Len, {Card, (Mult+1)})
runs([H  T], _Curr, Len, {_Card, Mult}) ->
    (Mult * run(Len)) + runs(T, H, 1).

For pairs, I do a similar thing: define a pair(Length) function that returns the point value given a number of similar cards. But it’s all pretty straightforward.

pairs([], Pairs, _Curr) -> pair(Pairs)
pairs([H  T], Pairs, Curr) when H =:= Curr -> pairs(T, Pairs+1, Curr)
pairs([H  T], Pairs, _Curr) -> pair(Pairs) + pairs(T, 1, H).
 
pair(2) -> 2
pair(3) -> 6
pair(4) -> 12
pair(_Length) -> 0.

That’s it for now. Actual game play to come. You can get the code here.