Code Critique: Line 255 of Sea and Spar Between

nickm · January 2020

Software: Sea and Spar Between (digital poetry)
Authors: Nick Montfort and Stephanie Strickland
Language: HTML/CSS/JavaScript, prototyped in Python
Year: 2010; cut to fit the toolspun course edition, 2013
Source file: https://nickm.com/montfort_strickland/sea_and_spar_between/sea_spar.js
“How to Read SaSB” page: https://nickm.com/montfort_strickland/sea_and_spar_between/reading.html
Blog post about bug fix: https://nickm.com/post/2020/01/sea-and-spar-between-1-0-1/

Stephanie Strickland and I wrote of a specific part of the Sea and Spar Between code: “The following syllables, which were commonly used as words by either Melville or Dickinson, are combined by the generator into compound words.” This particular way of conflating Melville’s language with Dickinson’s was important to us. However, due to a programming error, it wasn’t being done. What was line 255 in version 1:

syllable.concat(melvilleSyllable);

does not accomplish the purpose of adding the Melville one-syllable words to the variable syllable, which holds an array of Dickinson’s words at this point. The concatenation happens, but the concat() method does not change the syllable array in place. The resulting longer list is simply thrown away. This line has been changed in version 1.0.1, published yesterday. The relevant line, because of the addition of explanatory comments, is now line 286:

syllable = syllable.concat(melvilleSyllable);
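A minimal sketch of the difference between the two lines, using short placeholder lists rather than the actual word arrays in sea_spar.js (the contents of both arrays here are assumptions for illustration only):

```javascript
// Placeholder stand-ins for the real word lists in sea_spar.js.
var syllable = ['aa', 'bb', 'cc'];
var melvilleSyllable = ['xx', 'yy'];

// concat() returns a new array and leaves its receiver untouched,
// so the combined list built by this statement is discarded.
syllable.concat(melvilleSyllable);
var lengthAfterOriginalLine = syllable.length; // still 3

// Assigning the result back keeps the combined list.
syllable = syllable.concat(melvilleSyllable);
var lengthAfterFixedLine = syllable.length; // now 5
```

An in-place alternative such as Array.prototype.push.apply(syllable, melvilleSyllable) would also work, although the assignment form is the one used in version 1.0.1.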

I noticed this omission myself only years after the 2013 publication of cut to fit the toolspun course, a richly commented edition of Sea and Spar Between. As a result of my mistake, the compound or kenning “toolspun,” used in the title of that work, never was actually produced in any existing version of Sea and Spar Between. This was a frustrating situation, but after Stephanie and I discussed it briefly, we decided that we would wait to consider releasing an updated version until this defect was discovered by someone else, such as a critic or translator.

The system has been translated to Polish and has been “remixed” once that we know of. The ELMCIP database lists 20 references to Sea and Spar Between in critical writing.

The defect was only discovered recently by a critic, Aaron Pinnix, a Fordham PhD student doing a dissertation on oceanic literary works.

The cut to fit the toolspun course edition of the project (incorrectly given the title “cut to fit the tool-spun course” in the journal Digital Humanities Quarterly when it was published) is fewer than 1,000 lines long, not lengthy for an academic paper on electronic literature. This may be the most detailed discussion of a digital literary system’s code by the authors of that system.

Yet Pinnix discovered the mistake simply by carefully reading the system’s output and carefully reading statements that Strickland and I made about how the system was supposed to work; there was no tracing through the code and no CCS analysis done. It seems that a thorough and attentive traditional reading, in this case, led to the most complete understanding of how this system works so far.

barry.rountree · February 2020

Perhaps another bug....

When I navigate to 0,0 using the text box I reach "circle on/but artless is the earth." When I hit the up arrow, I'm transported to coordinate 245:297, or 237:295, or 242:295, or.... After that, arrow navigation behaves as expected, incrementing or decrementing by one. Navigating back to 0,0 via the textbox allows the unexpected behavior to recur.

Navigating to 1,1 or 1,0 or 1000000,0 shows similar behavior.

Was that intentional? Skimming the source I don't see any obvious reason why that should happen.

jang · February 2020

There's a similar issue with

var dickinsonFlatLessLess = dickinsonLessLess[0];
dickinsonFlatLessLess.concat(dickinsonLessLess[1], dickinsonLessLess[2]);

which explains the misbehaviour of riseAndGoLine. However, the following comments:

// The function riseAndGoLine can generate, e.g., "graspless dance and go":

and

// While the previous function does produce such lines, it does not work as
// first intended or as a quick reading of the code might suggest. An
// examination of the code above suggests that it will produce the line
// "graspless dance and go --" (with a dash at the end), but it does not,
// because the condition on the if statement is never true. A similar
// condition works in Python, but not in this programming language,
// JavaScript.
//
// This mistake came about because the generator was originally written in
// Python and converted to JavaScript. The program is still suitable; we
// were pleased with the output that lacked the final dash. Our mistake in
// leaving these lines in place, however, makes detailed understanding more
// difficult for those who might seek to modify and build on this code. At
// the same time, it shows that even fairly short programs can definitely
// retain traces of their making.

seems backwards, because this looks to produce nothing but dashed entries:

    ...
    a = n % dickinsonFlatLessLess.length;
    if (dickinsonLessLess[0].indexOf(dickinsonFlatLessLess[a]) > -1) {
        dash = ' --';
    }
    return dickinsonFlatLessLess[a] + 'less ' + upVerb[b] + ' and ' +
        upVerb[c] + dash;
}

I think the intention, reading the code as written, is that single-syllable "fooless leave and exit" be followed by the dash, but multi-syllabic "foobarless leave and exit" not be.

(Because dickinsonFlatLessLess only contains single-syllable words, the dash always gets appended.)
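A small sketch of that reading, with placeholder words instead of the real lists and a hypothetical helper, dashFor, that isolates the test from the quoted if statement:

```javascript
// Placeholder data, not the real lists: index 0 holds the one-syllable
// words, later indexes hold longer ones.
var dickinsonLessLess = [['foo', 'bar'], ['foobar'], ['foobarbaz']];

// As quoted: the concat() result is discarded, so the "flat" list is
// just the first sub-array.
var flatAsWritten = dickinsonLessLess[0];
flatAsWritten.concat(dickinsonLessLess[1], dickinsonLessLess[2]);

// With the result assigned, the flat list holds every word.
var flatAssigned = dickinsonLessLess[0].concat(
    dickinsonLessLess[1], dickinsonLessLess[2]);

// Hypothetical helper: same membership test as the quoted if statement.
function dashFor(flat, n) {
    var a = n % flat.length;
    return dickinsonLessLess[0].indexOf(flat[a]) > -1 ? ' --' : '';
}

// Collect the dash decision for the first four values of n.
function dashesFor(flat) {
    var out = [];
    for (var n = 0; n < 4; n++) {
        out.push(dashFor(flat, n));
    }
    return out;
}

var dashesAsWritten = dashesFor(flatAsWritten); // a dash every time
var dashesAssigned = dashesFor(flatAssigned);   // a dash only for 'foo' and 'bar'
```

Under these assumptions the list as written only ever yields words from the first sub-array, so the membership test cannot fail.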

It looks like a bug was discovered, and perhaps the code revised - or, the author of the comment simply got it backwards after spotting the error and looking to document it?

This illustrates the source of the caution many people feel about comments. The comment above the function doesn't say the same thing as the one at the bottom - it's impossible here to determine what the intention was - whether the first comment should have been

// The function riseAndGoLine can generate, e.g., "graspless dance and go --":

and the code is right, or the first comment is right and the code is backwards, or the first comment is merely illustrative and the code is canonical.

There's some merit to having source-code control systems. By reading back through a set of changes, there's more of a chance that an archaeologist might work out the original intention. As it is, we have a v1 and a v1.0.1, and both of them carry the same comments.

Incidentally, this sort of confusion (both in the mind of the reader and the author) has a common analogue in code. It's unfortunately all too common to see code that reads like this:

var dontOmitExtraWidget = getEnv("OMIT_EXTRA_WIDGET", true);
...
if (!dontOmitExtraWidget) {
   // should we emit the widget here? or omit it? Gah.
}

This produces confusion enough in the minds of native English speakers; but I've seen bilingual colleagues trounced further by this kind of cruelty. "I don't know half of you half as well as I should like; and I like less than half of you half as well as you deserve," indeed.
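One possible way to untangle a flag like this is to read the setting once and expose a single positively named question. This is only a sketch: the function shouldEmitExtraWidget, the env object, and the rule that only the string 'true' requests omission are all assumptions made for illustration.

```javascript
// Hypothetical helper: answers one positively phrased question.
function shouldEmitExtraWidget(env) {
    // Assumption: only the exact string 'true' asks for the widget
    // to be omitted; anything else (including unset) keeps it.
    var omitRequested = env.OMIT_EXTRA_WIDGET === 'true';
    return !omitRequested;
}

var emitByDefault = shouldEmitExtraWidget({});
var emitWhenOmitSet = shouldEmitExtraWidget({ OMIT_EXTRA_WIDGET: 'true' });
```

The negation then appears exactly once, next to the setting it negates, and callers can write if (shouldEmitExtraWidget(env)) without a double negative.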

nickm · February 2020

Thanks to both of you, @barry.rountree and @jang. The best reply I can make is probably the one I offer here: https://nickm.com/post/2020/02/sea-and-spar-between-1-0-2/

barry.rountree · February 2020

@nickm Excellent! Glad that was helpful, and yes, I'm no longer seeing that interface issue.

jeremydouglass · February 2020

@nickm said:
Yet Pinnix discovered the mistake simply by carefully reading the system’s output and carefully reading statements that Strickland and I made about how the system was supposed to work; there was no tracing through the code and no CCS analysis done. It seems that a thorough and attentive traditional reading, in this case, led to the most complete understanding of how this system works so far.

I wonder what the types / genres of software are where this would be possible -- where there exists such a detailed description of the intended output that someone can carefully audit the output just to see if it matches that description.

In addition to unit-based language generators... perhaps code generators? Software that generates information streams conforming to a protocol?

In some circumstances unit testing (and test-driven development) could serve as that description and analysis process, combined. Then the comment "The function riseAndGoLine can generate, e.g., 'graspless dance and go --'" would be written as an assertion, and changing the code in a way that broke the assertion would be a regression.
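A rough sketch of what such an assertion might look like. The generator below, riseAndGoLineSketch, is a hypothetical and much simplified stand-in with placeholder word lists; it is not the riseAndGoLine function in sea_spar.js, and canGenerate is likewise an invented helper.

```javascript
// Placeholder word lists, not the real ones.
var lessWords = ['grasp', 'art'];
var upVerbs = ['dance', 'go', 'rise'];

// Hypothetical stand-in generator that always appends the dash.
function riseAndGoLineSketch(n) {
    var a = n % lessWords.length;
    var b = Math.floor(n / lessWords.length) % upVerbs.length;
    var c = (b + 1) % upVerbs.length;
    return lessWords[a] + 'less ' + upVerbs[b] + ' and ' + upVerbs[c] + ' --';
}

// Hypothetical helper: does any n below limit produce the expected line?
function canGenerate(generator, expected, limit) {
    for (var n = 0; n < limit; n++) {
        if (generator(n) === expected) {
            return true;
        }
    }
    return false;
}

// The comment's claim, restated as checks a test runner could enforce.
var dashedLineReachable =
    canGenerate(riseAndGoLineSketch, 'graspless dance and go --', 100);
var undashedLineReachable =
    canGenerate(riseAndGoLineSketch, 'graspless dance and go', 100);
```

If the comment and the code disagreed about the dash, one of these two checks would fail, which makes the intended behavior explicit in a way a prose comment does not.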

jang · February 2020

@jeremydouglass said:
I wonder what the types / genres of software are where this would be possible -- where there exists such a detailed description of the intended output that someone can carefully audit the output just to see if it matches that description.

A surprising amount of fault analysis begins like this. As systems become more complex, homing in on the specifics of a broken behaviour can be an incredibly useful first step, before diving into code reading.

(Much cryptanalysis begins like this, too: examining supposedly "random" outputs for bias.)

You might be particularly interested in an assignment given by Shriram Krishnamurthi a few years ago as part of a course on programming language implementation.

A number of interpreter implementations are supplied (as black boxes), each of which contains a subtle bug that arises from a different choice taken during the implementation.

The assignment challenges students to demonstrate small input programs in the interpreted language that differ in their outputs from the canonical version that implements the spec precisely.

The mental model here requires asking (a) what was intended; (b) how might one plausibly implement the required behaviour; (c) what are obvious (or less obvious) mistakes that might be made; and finally, (d) how would those mistakes surface during execution?
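As a toy illustration of that process (not the actual assignment, whose interpreters and language are not shown here), the sketch below compares a hypothetical reference implementation with a hypothetical black box that contains one subtle, plausible mistake, and searches a few probe inputs for one that tells them apart. Every name in it is invented for the example.

```javascript
// Hypothetical reference: subtraction associates left to right,
// so [9, 3, 1] means (9 - 3) - 1.
function referenceSubtract(nums) {
    return nums.reduce(function (acc, x) {
        return acc - x;
    });
}

// Hypothetical black box with a plausible bug: it associates right
// to left, so [9, 3, 1] means 9 - (3 - 1).
function blackBoxSubtract(nums) {
    var last = nums[nums.length - 1];
    return nums.slice(0, -1).reduceRight(function (acc, x) {
        return x - acc;
    }, last);
}

// Return the first probe on which the two implementations disagree,
// or null if none of the probes distinguishes them.
function findDistinguishingInput(reference, candidate, probes) {
    for (var i = 0; i < probes.length; i++) {
        if (reference(probes[i]) !== candidate(probes[i])) {
            return probes[i];
        }
    }
    return null;
}

var probes = [[5], [4, 2], [9, 3, 1]];
var witness = findDistinguishingInput(
    referenceSubtract, blackBoxSubtract, probes);
```

The one- and two-element probes agree in both versions; only an input chosen with the likely mistake in mind (three operands) exposes the difference, which is the point of step (d).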

As an intellectual challenge, it's quite a fun and engaging approach. My own opinion is that the pedagogical value of following that process is pretty high.
