PSD: Duncan White's Practical Software Development Pages

Welcome to Duncan White's Practical Software Development (PSD) Pages.

I'm Duncan White, an experienced and professional programmer, and have been programming for well over 20 years, mainly in C and Perl, although I know many other languages. In that time, despite my best intentions:-), I just can't help learning a thing or two about the practical matters of designing, programming, testing, debugging, running projects etc. Back in 2007, I thought I'd start writing an occasional series of articles, book reviews, more general thoughts etc, all focussing on software development without all the guff.

See all my Practical Software Development (PSD) Pages

Testing and Development part 4

In the First Part of this article I discussed testing your programs, an old field newly reinvigorated by the recent strong emphasis on testing in Extreme Programming (XP) and it's spin-off Test Driven Development (TDD)

In the Second Part of this article, we started working through an example of development and testing a stack of elements, in C, and developed a stack that could successively push a single integer onto a stack, leaving a non-empty depth 1 stack whose top value was the value we pushed.

In the Third Part of this article, we extended our stack to one that appeared to work, at least passing all push/pop tests up to a depth of 3, finding and fixing bugs along the way. However, we discovered that manually inserting the correct check calls into these tests became an increasing chore.

Now, in the final part, let's try a more sophisticated testing approach instead - starting with a seemingly-unrelated observation:

Generating a Printable form of a Stack
In my very first PSD article I argued that every ADT needs a display_as_str operation. Although you may not immediately see what this has to do with testing, let's add this to the stack interface stack.h and see what happens:
   extern string stack_as_str(stack);/* return displayable form of stack */
                                     /* NOTE: caller must free returned string */
But what type is a string? In many languages, this isn't an issue, but manual storage allocation languages (like C) make us think about the storage allocation issues of storing a string. For now let's add:
  typedef char *string;
to the top of stack.h
Now, what format would we like stack_as_str(stack) to generate? Something nice, clear and predictable, like:
    [30,20,10]
representing a 3-element stack with 30 on top, 20 under it, and 10 at the bottom.
Ok, now we must implement stack_as_str(stack). A tricky problem (in C) is how to determine the size of string buffer that can safely contain the printable form of our given stack. It is perfectly possible to calculate the exact length:

First iterate over the stack once, counting how many characters we would store into a string (including how many characters each decimal value will occupy).

Next, allocate the actual buffer with malloc(storedchars+1).

Finally, iterate over the stack a second time writing into the buffer.

However, this is quite a pain.

A more general solution would be to implement a minimal dynamically growable string type which only needs to support repeated strcat() style appending to a string, reallocating a bigger string buffer whenever we need more space. However, that would be as complex - or more complex - than the stack we're trying to test, so would take us too far afield now.
Instead, we'll use an age old (but unimpressive) technique: We'll malloc() a ridiculously large string buffer, much bigger than the printable form of most reasonably sized stacks, write the formatted stack string in that buffer (checking that it fits, even so) and return it to the caller - who is then responsible for free()ing the returned string:
   #include <string.h>
   #include <assert.h>
   ...
   #define BIGSTRLEN 1024
   string stack_as_str(stack s)       /* return displayable form of s    */
   {
      char *d = malloc(BIGSTRLEN);    /* ridiculously big! */
      int len=1;
      strcpy( d, "[" );
      int i;
      for( i=s->nel-1; i>=0; i-- )
      {
         char *p = d+strlen(d);
         len += sprintf( p, "%d,", s->data[i] );
	 assert( len < BIGSTRLEN );
      }
      if( s->nel > 0 )
      {
         d[strlen(d)-1]='\0';         /* remove trailing comma */
	 len--;
      }
      strcat( d, "]" );		      /* append ']' */
      len+=2;
      assert( len < BIGSTRLEN );
      return d;			      /* CALLER MUST FREE THIS STRING */
   }
One application of stack_as_str(stack) is in testing: One of the simplest ways to check that an ADT object is correct after a series of operations is to generate the as_str form of the final object and then compare the actual string result against the known correct (hand-built) printable form. If you have chosen a simple and intuitive display format, this can be almost no work.
Back with our stack ADT, given the partial test case:
   stack s = empty_stack();
   push( 20, s );
   push( 30, s );
Obviously, the correct answer is the 2-element stack with 30 on top and 20 underneath: [30,20].
So let's add the following check_stack(stack,answer) function to the stack module, which checks a stack by turning it into a printable string and comparing that string to the correct answer (note that CHECK_STR has been in check.h all along):
   #include "check.h"
   ...
   void check_stack( stack s, string answer )
   {
      string as_str = stack_as_str(s);
      CHECK_STR( as_str, answer );
      free( as_str );
   }
In stack.h we add:
   extern void check_stack( stack s, char *correct );  /* check stack contents */
By the way, Makefile dependencies are important - don't forget at this point to add check.h to stack.o's dependencies in the Makefile (to reflect the changed #include structure) so that stack.c will be recompiled whenever we change check.h.
Using check_stack() our test5 becomes:
   stack s = empty_stack();     check_stack( s, "[]" );
   push( 20, s );               check_stack( s, "[20]" );
   push( 30, s );               check_stack( s, "[30,20]" );
   push( 45, s );               check_stack( s, "[45,30,20]" );
   CHECK_ELEMENT( pop(s), 45 ); check_stack( s, "[30,20]" );
   CHECK_ELEMENT( pop(s), 30 ); check_stack( s, "[20]" );
   CHECK_ELEMENT( pop(s), 20 ); check_stack( s, "[]" );
Download this version here, Pleasingly, this works fine (first time, too). Of course, this does not test top(), depth() and isempty() as thoroughly as before: but it's much easier to generate the tests. We could test top() easily enough, by adding:
   CHECK_ELEMENT( top(s), N );
after each push() and pop() [except the last].
Meta-tests: designing a Little Language

Let me leave you with a final thought: applying my core principle of ruthless automation even further, we could autogenerate such stack tests - and their correct answers - by designing a little language which defined a sequence of operations to perform, and a trivial "interpreter" in a scripting language like Perl, which does three things:

Simulate a stack, obeying the instructions one by one and hence calculating the correct checks and answers at each stage.

writes the corresponding C test program for us, embedding the correct checks as it goes! This is an example of the Pragmatic Programmers' advice to write Programs that write Programs. (see my Review of the Pragmatic Programmer here).

compiles and executes the automatically generated C test program for us, telling us if any fail!

Suppose we choose to define the string:
	20 30 -
to represent the sequence of stack operations and checks:

start with an empty stack, check that isempty() is true, and the printable form of the stack is [].

'20': push 20 on, then check that the stack has printable form [20].

'30': push 30 on, check the stack is now [30,20].

'-': pop off a value, check that it's correct (30) and the remaining stack is [20].

In this notation, our whole test5 test case is equivalent to:
	20 30 45 - - -
This would be trivial to write (in Perl) - half an hour's work or less, by my estimation! Maybe I'll cover this in a future "little language" article.. watch this space!
Summary: Give TDD a try!
So, what has this article shown? I think TDD is a very interesting approach, I like the emphasis on strong and automatic testing and I really love the idea of building a test, expecting your code to fail, running the test, confirming that your code fails the test, and only then trying to write the missing functionality. This gives a really strong focus to your development - you're not designing and implementing thousands of lines of code that you think you may need later, instead you're writing exactly what you need to make your latest test pass.
By the way - if you were wrong and your code already passes the new test, this is valuable (and surprising) information that you'd want to investigate carefully. How would you ever know this without TDD?
There is the question: when do you stop developing in TDD? - after all, you can always write more tests. Presumably you have some higher level goal (specification) in mind as well, and you can therefore tell once you've added all the features required by the spec and tested that they work by TDD. Of course, the flexibility of TDD allows you to change your mind months or years later and add some new features, driven by new TDD test cases.
You wouldn't want to stop developing before you had good test coverage - until you are convinced that your tests are, in principle, deep and thorough enough to cover all special cases in your code (bearing in mind that no amount of testing can prove correctness).
Another question emerged in Part 3 when we noticed that we only tested stacks up to depth 1, and then deliberately built a "stack of one element" that minimally satisfies (satisfices) our test suite. We did this primarily to point out the incompleteness of our test suite. But at that point, I observed:
This is one thing that concerns me about TDD - if the tests are actually driving development, it's arguable that the joke stack of one element is a correct solution at this point.

Then I concluded:
However, I hope that noone would ever write such a stupid implementation, even as a stepping stone - except to make a point.
Is this conclusion safe to make? I think so, because it seems obvious that a "stack of elements" must store a collection of many elements. Alternatively, one could argue that it doesn't matter if a programmer initially implements a "stack of one", because later they will write a "2 deep stack" test, see it fail and then generalise.
Finally, this last part of this article points out a surprising way in which the argument from my very first PSD article (every ADT needs a display_as_str operation) is extremely useful in TDD. As I said above:
One of the simplest ways to check that an ADT object is correct after a series of operations is to generate the as_str form of the final object and then compare the actual string result against the known correct (hand-built) printable form.

Thanks for reading this far! If you have any comments, I'm very interested in hearing from you..

d.white@imperial.ac.uk
Back to part 3 of this article Back to PSD Top
Written: March-June 2013