Jeremiah Avins
Address Munged
Somewhere, NJ
July 18, 1985
The Editor
Dr. Dobb's Journal
Dear DDJ,
Alan Wilcox's well-written July '85 article on his clock/calendar set
me off on an antistructure tantrum, and I want to bounce my thoughts
off you and your readers. I will use some of Wilcox's code as
examples, but I wouldn't like to see that taken as an attack on him.
He just happened to be standing there when I ran amok, and I hope he
forgives me.
Mr. Wilcox tells us that he took pains to write his assembly code in
a modular and structured style, and that it is therefore somewhat
longer than it might otherwise be. I have no quarrel with the
modularity; any program is easier to write and to understand if it is
built up of self-contained logical units. "Structure", on the other
hand, adds useless bulk to assembly routines and often blurs the
framework on which the code is hung.
The canons of structured programming in assembly language seem to be:
(1) Don't branch out of a subroutine.
(2) Each subroutine may be entered at only one place.
(3) Each subroutine may have only one return statement.
(4) The return statement must be the last one.
These rules are supposed by some to make the program easier to read
and to understand, even if the processor has to work a little harder
to execute the code. In fact, to understand the code, the programmer
and the reader have to emulate the processor: when it works harder,
they do too. The intent is good, but the rules are counterproductive.
Often, those rules force superfluous branches, thereby cluttering the
symbol table and forcing the reader to scan for the unnecessary
label. For example (LISTIT, p. 78):
          ...
          CMP   '$'
          JZ    LISTEND
          ...
          JMP   LISTIT
LISTEND:  RET           ; Last line.
That is merely gingerbread ornamentation. Compare it to
          ...
          CMP   '$'
          RZ            ; Return from here!
          ...
which has one fewer opportunity to create a label and is three bytes
shorter.
The construction
          CALL  FIDO
          RET
is better replaced by
          JMP   FIDO
If FIDO doesn't return for us, our goose is cooked anyway, and every
assembly programmer knows that.
Moreover, if we don't insist on a salient return, some routines
become easier to read. Compare (PRNIB, p. 72)
PRNIB:    ANI   0FH
          CMP   10
          JNC   LETR
          ADI   '0'
          JMP   PRNT
LETR:     ADI   'A'-10
PRNT:     CALL  PCHAR
          RET
to
PRNIB:    ANI   0FH
          CMP   10
          JNC   LETR
          ADI   '0'
          JMP   PCHAR
LETR:     ADI   'A'-10
          JMP   PCHAR
Isn't the second version easier to follow? Freed from the stricture
of structure, one can pare it further to
PRNIB:    ANI   0FH           ; Get low nibble.
          ADI   '0'           ; First make ASCII.
          CMP   '9'+1         ; IF it's a digit
          JC    PCHAR         ; THEN send it
          ADI   'A'-'9'-1     ; ELSE make letter
          JMP   PCHAR         ; and send that.
This is straight in-line code; no labels, no spaghetti jumps. The
internal JC works much like a high-level BREAK statement, and should
be equally respected. (Those who freely write CALL-RET in place of
JMP will refrain from carping at two ADIs.) The last version of PRNIB
comes closer to the ideals of structured programming than either of
the others, but the rules are freely trampled. All in all, it seems
clear that structured exits clutter the listing, the symbol table,
and the mind.
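For readers who would rather check the arithmetic than trace flags, the logic of the pared-down PRNIB (add '0', return early if the result is a digit, otherwise step up to a letter) can be sketched in C. This is only a model under stated assumptions: the names nibble_to_hex and check_all_nibbles are hypothetical, the function returns the character instead of jumping to PCHAR, and ASCII is assumed.

```c
#include <assert.h>

/* Hypothetical C model of the pared-down PRNIB. It returns the ASCII
   hex digit for the low nibble rather than sending it to PCHAR. */
char nibble_to_hex(unsigned char value)
{
    char c = (char)((value & 0x0F) + '0');   /* get low nibble, make ASCII */
    if (c < '9' + 1)                         /* IF it's a digit            */
        return c;                            /* THEN leave from here       */
    return (char)(c + ('A' - '9' - 1));      /* ELSE step over : ; < = > ? @ */
}

/* Compare all sixteen nibbles against a plain lookup table. */
void check_all_nibbles(void)
{
    static const char table[] = "0123456789ABCDEF";
    for (unsigned i = 0; i < 16; i++)
        assert(nibble_to_hex((unsigned char)i) == table[i]);
}
```

The early return plays the part of the conditional jump to PCHAR: one test, one adjustment, and no labels to look up.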
What about structured entries? I once had to send three nulls after
carriage return instead of the single null provided by the already
tight ROM. (Yes, there still are teletypes!) There was already a call
to "null" used by others:
NULL:     LDI   0
          JMP   PCHAR
Only six more bytes did the trick:
3NULLS:   CALL  NULL
          CALL  NULL
NULL:                   ; etc.
There! One subroutine, two entry points, two functions. If it is in
any way obscure, that is surely because of the odd calling sequence,
not the multiple entries. Many monitors provide a CRLF call just
ahead of PCHAR, which the code then falls into:
CRLF:     LDI   0AH
          CALL  PCHAR
          LDI   0DH
PCHAR:                  ; etc.
The AIM-65 monitor goes further with something like
DBLSP:    CALL  CRLF    ; doublespace
CRLF:                   ; etc.
This isn't new stuff, but it's not outmoded, either. The extra
opcodes demanded by structuring don't make the analyst's job any
easier; they just give him more to think about. Unfortunately, with
assemblers we sometimes are forced to write spaghetti code. Covering
it with sauce doesn't make it better.
We aren't talking here about the use of DO-WHILE or BEGIN-UNTIL
structures as opposed to a mess of IF ... GOTOs. With assemblers we
haven't got that luxury. What we're looking at is a set of rules that
was conceived of in another environment, which is at best of doubtful
worth for assembly code, but which has been advocated (and widely
accepted) as the One True Way For Everybody. History is full of
people and groups who made a lot of misery by elevating an answer to
The Answer. We seem to be doing it again in our own small way.
I don't want to create the impression that I dislike symbols. Mr.
Wilcox wisely assigns a symbol (CLOCK) to represent the base of his
I/O map. The change of a single EQU makes for easy relocation. He
could have gone further. Additional assignments like
DAYPORT   EQU   CLOCK+5
MINPORT   EQU   CLOCK+3
would keep that ease, highlight any bugs in the code, build an I/O
map into the symbol table, and generally make life easier. Of course,
if the symbol table is full of dummy targets, there may be no room
for the more valuable stuff.
One final quibble: Originally, a real-time clock was one that enabled
a (usually mini) computer to control events in real time. The subject
of Wilcox's article is properly called a calendar clock,
clock-calendar, or some other permutation. The distinction was worth
keeping, but I'm afraid that by now it's irretrievable.
Sincerely yours,
Interestingly, I recently needed a routine for a 68HC11 that prints
Motorola S-records. It needs to print hex octets as two ASCII
characters, output control characters, and keep track of a checksum.
The code to format the records is both proprietary and uninteresting
here, but the subroutine to send characters to the UART is another
(excellent, I boast) example of non-spaghetti multiple entries. (Note
the BSR OUT_NYB without a matching return.) This is it:
* This is the transmit subroutine. The four entry points
* OUT_XXX allow calls with decreased processing.
* OUT_CHK adds checksum updating to OUT_BYT.
* OUT_BYT sends an octet in B as two ASCII hex digits,
* so that '11110111' binary is rendered as 'F7'.
* OUT_NYB sends a nibble in A as a single hex digit.
* OUT_RAW sends an octet in A unaltered.
* Upon return, B is unaltered, and A contains the last
* character sent.
OUT_CHK TBA                     ; This updates the checksum,
        ADDA    CHK_SUM         ; but doesn't manage it.
        STAA    CHK_SUM
OUT_BYT TBA
        LSRA                    ; Shift high nibble low.
        LSRA
        LSRA
        LSRA
        BSR     OUT_NYB         ; Send it.
        TBA
        ANDA    #$0F            ; Keep the low nibble.
OUT_NYB CMPA    #10             ; Bigger than 9?
        BLO     DIGIT           ; If not, it's a letter, so
        ADDA    #7              ; skip over ;, :, <, =, >, ?, and @
DIGIT   ADDA    #$30            ; Adjust the stick.
OUT_RAW BRCLR   SCSR,X,$80,*    ; Wait here till the UART is ready,
        STAA    SCDAT,X         ; then get on with it.
        RTS
I don't believe that it could be much more compact or
straightforward. In my view, that means it couldn't be rewritten to
make it much easier to understand. To me, this is good structure.
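As a cross-check on what each entry point does, here is a rough behavioural model in C. It is a sketch under stated assumptions, not a translation: C cannot fall from one routine into the next, so nested calls stand in for the fall-through entries, a buffer named sent stands in for the UART, and chk_sum stands in for CHK_SUM. All of the C names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins: a buffer for the UART, a byte for CHK_SUM. */
char sent[64];
size_t sent_len = 0;
unsigned char chk_sum = 0;

/* OUT_RAW: send one character unaltered. */
void out_raw(unsigned char a)
{
    if (sent_len + 1 < sizeof sent) {
        sent[sent_len++] = (char)a;
        sent[sent_len] = '\0';
    }
}

/* OUT_NYB: send a nibble (0..15) as a single hex digit. */
void out_nyb(unsigned char a)
{
    if (a >= 10)
        a += 7;            /* skip from ':' up to 'A' */
    a += 0x30;             /* make it ASCII */
    out_raw(a);
}

/* OUT_BYT: send an octet as two hex digits, high nibble first. */
void out_byt(unsigned char b)
{
    out_nyb((unsigned char)(b >> 4));
    out_nyb((unsigned char)(b & 0x0F));
}

/* OUT_CHK: add the octet to the running checksum, then send it. */
void out_chk(unsigned char b)
{
    chk_sum = (unsigned char)(chk_sum + b);
    out_byt(b);
}

/* Small self-check: 0xF7 goes out as "F7", 0x0A as "0A". */
void self_check(void)
{
    sent_len = 0;
    chk_sum = 0;
    out_byt(0xF7);
    out_chk(0x0A);
    assert(sent_len == 4);
    assert(sent[0] == 'F' && sent[1] == '7');
    assert(sent[2] == '0' && sent[3] == 'A');
    assert(chk_sum == 0x0A);
}
```

The model needs four separate functions and three nested calls to say what the assembly says with four labels and a single BSR, which is rather the point.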
Copyright © 2001, 2007 by Jerry Avins