The stricture of structure

        Here is a letter that I wrote to Dr. Dobb's Journal. (It was not published.)
        Even without the code I'm commenting on, I think it's pretty clear. Alan
        Wilcox, the author of that code, used a clock-calendar to illustrate his
        principles of structured assembly language programming. I demurred:

                             Jeremiah Avins
                             Address Munged
                             Somewhere, NJ
                                                     July 18, 1985
    The Editor
    Dr. Dobb's Journal

Dear DDJ,

         Alan Wilcox's well-written July '85 article on his clock/
    calendar set me off on an antistructure tantrum, and I want to
    bounce my thoughts off you and your readers. I will use some of
    Wilcox's code as examples, but I wouldn't like to see that taken
    as an attack on him. He just happened to be standing there when I
    ran amok, and I hope he forgives me.

         Mr. Wilcox tells us that he took pains to write his assembly
    code in a modular and structured style, and that it is therefore
    somewhat longer than it might otherwise be. I have no quarrel
    with the modularity; any program is easier to write and to
    understand if it is built up of self-contained logical units.
    "Structure", on the other hand, adds useless bulk to assembly
    routines and often blurs the framework on which the code is hung.

         The canons of structured programming in assembly language
    seem to be:
         (1) Don't branch out of a subroutine.
         (2) Each subroutine may be entered at only one place.
         (3) Each subroutine may have only one return statement.
         (4) The return statement must be the last one.
    These rules are supposed by some to make the program easier to
    read and to understand, even if the processor has to work a
    little harder to execute the code. In fact, to understand the
    code, the programmer and the reader have to emulate the
    processor: when it works harder, they do too. The intent is
    good, but the rules are counterproductive.

        Often, those rules force superfluous branches, thereby
    cluttering the symbol table and forcing the reader to scan for
    the unnecessary label. For example (LISTIT, p. 78):
              ...
              CMP    '$'
              JZ     LISTEND
              ...
              JMP    LISTIT
     LISTEND: RET    ; Last line.
    That is merely gingerbread ornamentation. Compare it to
              ...
              CMP    '$'
              RZ     ; Return from here!
              ...
    which has one fewer opportunity to create a label and is three
    bytes shorter.

         The construction
              CALL   FIDO
              RET
    is better replaced by
              JMP    FIDO
    If FIDO doesn't return for us, our goose is cooked anyway, and
    every assembly programmer knows that. Moreover, if we don't
    insist on a salient return, some routines become easier to read.
    Compare (PRNIB, p. 72)
      PRNIB: ANI    0FH
              CMP    10
              JNC    LETR
              ADI    '0'
              JMP    PRNT
      LETR:   ADI    'A'-10
      PRNT:   CALL   PCHAR
              RET
    to
      PRNIB: ANI    0FH
              CMP    10
              JNC    LETR
              ADI    '0'
              JMP    PCHAR
      LETR:   ADI    'A'-10
              JMP    PCHAR
    Isn't the second version easier to follow? Freed from the
    stricture of structure, one can pare it further to
      PRNIB: ANI    0FH ; Get low nibble.
              ADI    '0' ; First make ASCII.
              CMP    '9'+1 ; IF it's a digit
              JC     PCHAR ; THEN send it
              ADI    'A'-'9'-1 ; ELSE make letter
              JMP    PCHAR ; and send that.
    This is straight in-line code; no labels, no spaghetti jumps. The
    internal JC works much like a high-level BREAK statement, and
    should be equally respected. (Those who freely write CALL-RET in
    place of JMP will refrain from carping at two ADIs.) The last
    version of PRNIB comes closer to the ideals of structured pro-
    gramming than either of the others, but the rules are freely
    trampled. All in all, it seems clear that structured exits
    clutter the listing, the symbol table, and the mind.

         What about structured entries? I once had to send three
    nulls after carriage return instead of the single null provided
    by the already tight ROM. (Yes, there still are teletypes!)
    There was already a call to "null" used by others:
      NULL:   LDI    0
              JMP    PCHAR
    Only six more bytes did the trick:
      3NULLS: CALL   NULL
              CALL   NULL
      NULL: ; etc.

    There! One subroutine, two entry points, two functions. If it
    is in any way obscure, that is surely because of the odd calling
    sequence, not the multiple entries. Many monitors provide a CRLF
    call just ahead of PCHAR, which the code then falls into:
      CRLF:   LDI    0AH
              CALL   PCHAR
              LDI    0DH
      PCHAR: ; etc.
    The AIM-65 monitor goes further with something like
      DBLSP: CALL   CRLF ; doublespace
      CRLF: ; etc.
    This isn't new stuff, but it's not outmoded, either. The extra
    opcodes demanded by structuring don't make the analyst's job any
    easier; they just give him more to think about. Unfortunately,
    with assemblers we sometimes are forced to write spaghetti code.
    Covering it with sauce doesn't make it better.

         We aren't talking here about the use of DO-WHILE or BEGIN-
    UNTIL structures as opposed to a mess of IF ... GOTOs. With
    assemblers we haven't got that luxury. What we're looking at is a
    set of rules that was conceived of in another environment, which
    is at best of doubtful worth for assembly code, but which has been
    advocated (and widely accepted) as the One True Way For Every-
    body. History is full of people and groups who made a lot of
    misery by elevating an answer to The Answer. We seem to be doing
    it again in our own small way.

         I don't want to create the impression that I dislike
    symbols. Mr. Wilcox wisely assigns a symbol (CLOCK) to represent
    the base of his I/O map. The change of a single EQU makes for easy
    relocation. He could have gone further. Additional assignments like
      DAYPORT EQU CLOCK+5
      MINPORT EQU CLOCK+3
    would keep that ease, highlight any bugs in the code, build an
    I/O map into the symbol table, and generally make life easier. Of
    course, if the symbol table is full of dummy targets, there may
    be no room for the more valuable stuff.

         One final quibble: Originally, a real-time clock was one that
    enabled a (usually mini) computer to control events in real time.
    The subject of Wilcox's article is properly called a calendar
    clock, clock-calendar, or some other permutation. The distinction
    was worth keeping, but I'm afraid that by now it's irretrievable.

Sincerely yours,

       Interestingly, I recently needed a routine for a 68HC11 that prints Motorola S-records.
       It needs to print hex octets as two ASCII characters, output control characters, and keep
       track of a checksum. The code to format the records is both proprietary and uninteresting
       here, but the subroutine to send characters to the UART is another (excellent, I boast)
       example of non-spaghetti multiple entries. (Note the BSR OUT_NYB without a matching
       return.) This is it:

* This is the transmit subroutine. The four entry points
* OUT_XXX allow calls with progressively less processing.
*   OUT_CHK adds checksum updating to OUT_BYT.
*   OUT_BYT sends an octet in B as two ASCII hex digits,
*   so that '11110111' binary is rendered as 'F7'.
*   OUT_NYB sends a nibble in A as a single hex digit.
*   OUT_RAW sends an octet in A unaltered.
* Upon return, B is unaltered, and A contains the last
* character sent.

OUT_CHK TBA           ; This updates the checksum,
        ADDA CHK_SUM ; but doesn't manage it.
        STAA CHK_SUM
OUT_BYT TBA
        LSRA          ; Shift high nibble low.
        LSRA
        LSRA
        LSRA
        BSR OUT_NYB ; Send it.
        TBA
        ANDA #$0F     ; Keep the low nibble.
OUT_NYB CMPA #10      ; Bigger than 9?
        BLO DIGIT    ; No: it's a digit. Yes: a letter, so
        ADDA #7       ; skip over :, ;, <, =, >, ?, and @
DIGIT   ADDA #$30     ; Adjust the stick.
OUT_RAW BRCLR SCSR,X,$80,* ; Wait here till the UART is ready,
        STAA SCDAT,X ; then get on with it.
        RTS

I don't believe that it could be much more compact or straightforward. In my view, that means
it couldn't be rewritten to make it much easier to understand. To me, this is good structure.
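
For readers who want to check the arithmetic without a 68HC11 at hand, here is a minimal C sketch of the same four entry points. It is a model under stated assumptions, not a translation: the names out_raw, out_nyb, out_byt, out_chk, sent, n_sent and chk_sum are hypothetical, the UART is replaced by a memory buffer, and each "fall through" into the next entry point is written as a final function call. Register effects (A holding the last character sent, B unaltered) are not modeled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the UART data register and CHK_SUM. */
static char sent[64];              /* characters "transmitted" so far */
static size_t n_sent = 0;
static unsigned char chk_sum = 0;  /* 8-bit running sum, wraps like ADDA */

/* OUT_RAW: send one character unaltered (here: append to the buffer). */
static void out_raw(unsigned char a)
{
    if (n_sent < sizeof sent - 1) {
        sent[n_sent++] = (char)a;
        sent[n_sent] = '\0';
    }
}

/* OUT_NYB: turn a value 0..15 into one hex digit, then go on to OUT_RAW.
   The range check is an addition of this sketch; the assembly has none. */
static void out_nyb(unsigned char a)
{
    assert(a < 16);
    if (a >= 10)
        a += 7;        /* skip the seven characters between '9' and 'A' */
    a += 0x30;         /* move into the ASCII digit column */
    out_raw(a);
}

/* OUT_BYT: high nibble by a call (the BSR), low nibble by falling
   through into OUT_NYB (modeled here as a second call). */
static void out_byt(unsigned char b)
{
    out_nyb((unsigned char)(b >> 4));
    out_nyb((unsigned char)(b & 0x0F));
}

/* OUT_CHK: add the octet to the checksum, then go on to OUT_BYT. */
static void out_chk(unsigned char b)
{
    chk_sum = (unsigned char)(chk_sum + b);
    out_byt(b);
}
```

Calling out_chk(0xF7), out_byt(0x0A) and out_nyb(9) in that order leaves the five characters F70A9 in sent and 0xF7 in chk_sum. The C version needs four separate functions where the assembly needs one routine with four labels, which is the point of the listing above.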

Copyright © 2001, 2007 by Jerry Avins
