Welcome to Duncan White's Practical Software Development (PSD) Pages.
I'm Duncan White, an experienced professional programmer. I have been programming for well over 30 years, mainly in C and Perl, although I know many other languages. In that time, despite my best intentions:-), I just couldn't help learning a thing or two about the practical matters of designing, programming, testing, debugging, running projects etc. Back in 2007, I thought I'd start writing an occasional series of articles, book reviews, more general thoughts etc, all focussing on software development without all the guff.
Every Abstract Data Type (ADT) needs a stringify operator.
Here's a practical tip that I've come up with, something I've been doing consistently for over 15 years.
- When you design an Abstract Data Type (whether it's a module, a class, a package, or whatever terminology your language of choice calls it) - as well as writing and testing all the ADT operators required by the ADT, there is one more operator that you should always write and test:
The "stringify" operator - turn your ADT into a printable string. It doesn't matter whether you like to call your stringify operator display_as_str, as_str, as_string, to_string, to_s etc. I'll call it "as_str" for brevity.
This converts the ADT object into a neatly formatted printable string form, suitable for printing, displaying in a GUI dialog box, logging to a log file - whatever you need.
Even if you're sure you'll never need it - create it anyway, spend enough time on it (usually only a few minutes) to make it produce the cleanest, simplest and most beautiful output you can. And then test it. Thoroughly.
For some reason I've never seen this principle clearly articulated in any of the programming books I've read. Surely this idea can't be original to me? More recently I note that a few languages - most obviously Ruby - give every class a "to_s" method to convert the object to a string, which is excellent. But I say: do this in any language, whether or not the language supports it!
- But isn't this unnecessary work? No! You will need this operator one day, trust me on this! When your program is failing to work properly, everyone is screaming at you, you're panicking really badly, everything you check via the debugger or debugging statements looks fine, and you just don't understand where the problem can be, you'll think:
Dammit! If only I could print that ADT object out right in the middle of my code, it'd be obvious whether the ADT contents are correct - pointing me either at the half of the program where I build the ADT, or the other half where I use it. Then, if it's the first half, I could print out the ADT as I build it and find out which stage went wrong.
- Psychologically, however, almost no-one will ever take time out from frantic debugging while under pressure in order to write - and debug - a new bit of code. So: make sure you have the as_str routine there already for use, tested thoroughly - like all your other code, of course - and one day you'll really thank me for giving you a paddle when you're on a certain river... I regard it as a type of scaffolding, or an essential overhead - a tiny overhead on the cost of design and programming.
- Of course, some languages make string handling - especially with arbitrary length strings - so difficult (eg. C - although you can write a malloc-based arbitrary length string library easily enough) that you may find it easier to replace the as_str operator with a print_to_file operator. This is less flexible than as_str - in particular, you lose the ability to display your ADT in a GUI dialog box - but it's still a lot better than not having either routine.
- Although you may think you'll never need this operator, as_str is in fact incredibly useful in testing. More on that in article 5.
- As a cute extra feature, in some languages (eg. Perl, C++), you can even use operator overloading to make your as_str operator get invoked automatically in certain contexts. This isn't essential, but can let you focus on the problem rather than the mechanics of getting the print statement correct:-)
Now, I think that all needs an example - don't you?
Practical Perl Example of as_str:
- Suppose we create an incredibly simple (X,Y) coordinate class, file Coord.pm:

    package Coord;   # a Coord is a coordinate (X,Y) that knows how to print itself

    use strict;

    # Constructor: c = new Coord(X,Y)
    sub new ($$$)
    {
        my( $class, $x, $y ) = @_;
        return bless { X => $x, Y => $y }, $class;
    }

    # str = coordinate_object->as_str: produce printable string form (x,y)
    sub as_str ($)
    {
        my( $self ) = @_;
        my $x = $self->{X};
        my $y = $self->{Y};
        return "($x,$y)";
    }

    # ...other operators go here

    1;   # perl modules must "succeed". stupid but necessary.

- Given this code, a test program can write:

    use strict;
    use Coord;

    my $c = new Coord( 3, 4 );
    my $cstr = $c->as_str;
    print "printable form of x=3,y=4 Coord is $cstr\n";

When run, unsurprisingly this produces:

    printable form of x=3,y=4 Coord is (3,4)

- One could do exactly the same in C, C++ or Java. Maybe I'll present some other language versions of this example in the near future. Doing this in OO form in C would require the well-known "object simulation" technique - where an object = pointer to structure containing instance variables and method-pointers-to-functions. Doing this in C - whether in OO form or not - would also require thinking about memory allocation for the as_str result.
- Back to Perl: by adding the following single line to Coord.pm:

    use overload '""' => 'as_str';

the body of our test program becomes:

    my $c = new Coord( 3, 4 );
    print "printable form of x=3,y=4 Coord is $c\n";

- For people who don't know much Perl, perhaps I should explain Perl's "variable interpolation in quoted strings" feature - when we write a string as in

    print "x=$x, y=$y\n"

  the current values of $x and $y are interpolated into the string in the appropriate positions. Thus, the above is roughly the equivalent of C's printf("x=%f, y=%f\n", x, y) - assuming that x and y are floating point numbers.

  Once we have the overloading feature enabled, the use of $c in the quoted string causes Perl's interpolation mechanism to call the $c->as_str() method in order to generate the string form of $c, i.e. the mere act of using $c in a string-ey context causes the as_str method to be called.

  Let's call this string overloading technique stringification.
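
Stringification also makes test scripts pleasantly short. Here is a minimal sketch, assuming the Coord class is placed in the same file as the tests (rather than in Coord.pm) so that the script runs on its own. It uses the core Test::More module, which is my choice for illustration rather than anything this article depends on, and it writes Coord->new(...) instead of new Coord(...) - both forms call the same constructor.

```perl
use strict;
use warnings;

package Coord;   # same class as in the article, inlined so the script is self-contained

use overload '""' => 'as_str';

sub new
{
    my( $class, $x, $y ) = @_;
    return bless { X => $x, Y => $y }, $class;
}

# produce printable string form (x,y)
sub as_str
{
    my( $self ) = @_;
    return "($self->{X},$self->{Y})";
}

package main;

use Test::More tests => 2;

my $c = Coord->new( 3, 4 );

# explicit call to the stringify operator
is( $c->as_str, "(3,4)", "as_str gives (x,y)" );

# interpolation into a quoted string invokes as_str via the overload
is( "coord is $c", "coord is (3,4)", "interpolation calls as_str" );
```

Each test compares one printable string against another, so a failure report shows the whole object at a glance.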
- Even better, this technique works well across multiple classes - suppose you want to work with lists of Coords, and write a generic List class (file List.pm) that simply wraps a Perl array in an OO wrapper, adding a stringification method:

    package List;   # a List (array) of things that know how to print themselves.

    use strict;
    use overload '""' => 'as_str';

    # Constructor: list = new List( array ):
    sub new ($@)
    {
        my( $class, @array ) = @_;
        my $self = bless \@array, $class;
        return $self;
    }

    # str = list_obj->as_str: produce printable string [element,element,...]
    sub as_str ($)
    {
        my( $self ) = @_;
        my $commasep = join( ',', map { "$_" } @$self );
        return "[$commasep]";
    }

    # etc

- Here, the line performing the bulk of the List stringification is:

    my $commasep = join( ',', map { "$_" } @$self );

This single line means:

  - Take the array to which $self refers: @$self.
  - Stringify each element of the array - the map { OP } ARRAY operator applies an operation (OP) to each element of an array, producing an array of the operation results. A map operation OP is simply an expression involving $_ - which represents the array element. So, for example, my @x = map { $_ * 3 } @y would create @x - an array the same size as @y, in which each element is triple the corresponding element in @y.
  - In our case, "$_" simply evaluates each array element in a string context, forcing (in an object with overloaded stringification set) the as_str stringification method to be called. We could have written that as $_->as_str.
  - Finally, the join(SEP, ARRAY) operator joins each array element together separated by the given separator.

If you really can't grok this idiomatic Perl code, it's the exact equivalent of the following dull but worthy (Wirth-y?) code, using the .= string append operator:

    my $commasep = "";
    foreach my $element (@$self)
    {
        $commasep .= $element->as_str;
        $commasep .= ',';
    }
    chop $commasep;   # remove trailing comma! ick!

- Now, using our List class, we can write a test program:

    my $list = new List( new Coord(3,4), new Coord(0,7), new Coord(2,1) );
    print "printable form of list is $list\n";

When you run this example, you get:

    printable form of list is [(3,4),(0,7),(2,1)]

- Isn't this cute?.. and damn useful! It works simply and robustly on any combination of arbitrarily complicated components - unless your data structures contain loops (graphs etc) in which case a bit of thought is needed to ensure that the as_str functions don't start recursing indefinitely. However, in Perl, circular data structures are extremely rare.
- An obvious addition to the as_str method - which can be used to solve the circular display problem - is to provide a per-object (or per-class) "display mode" method which affects how as_str displays the object, i.e. to provide several alternative display formats. For example, a "terse display" mode for List could simply report how many elements are contained, omitting all information about what elements are contained. Or the List's terse display mode could force all its members into terse mode so that they display only a unique identifier like their name, key value etc.
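
The terse display mode for List might be sketched as follows. The terse_mode class method and the $terse flag are hypothetical names of my own choosing, and a per-class flag is only one of the options mentioned above (a per-object flag would work just as well). Both classes are inlined so that the sketch runs as a single file.

```perl
use strict;
use warnings;

package Coord;

use overload '""' => 'as_str';

sub new
{
    my( $class, $x, $y ) = @_;
    return bless { X => $x, Y => $y }, $class;
}

sub as_str
{
    my( $self ) = @_;
    return "($self->{X},$self->{Y})";
}

package List;

use overload '""' => 'as_str';

# hypothetical per-class display mode flag: 0 = full, 1 = terse
my $terse = 0;

sub new
{
    my( $class, @array ) = @_;
    return bless \@array, $class;
}

# List->terse_mode( FLAG ): hypothetical class method to switch display mode
sub terse_mode
{
    my( $class, $flag ) = @_;
    $terse = $flag ? 1 : 0;
    return;
}

sub as_str
{
    my( $self ) = @_;
    # terse mode: report only the element count, never touching the elements
    return "[" . scalar(@$self) . " elements]" if $terse;
    # full mode: stringify every element, as in the article's List class
    return "[" . join( ',', map { "$_" } @$self ) . "]";
}

package main;

my $list = List->new( Coord->new(3,4), Coord->new(0,7), Coord->new(2,1) );
print "full:  $list\n";     # full:  [(3,4),(0,7),(2,1)]
List->terse_mode( 1 );
print "terse: $list\n";     # terse: [3 elements]
```

Because terse mode never stringifies the elements, it cannot recurse into a circular structure, which is why a display mode is one way round that problem.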
d.white@imperial.ac.uk Updated: 8th February 2011