
LLM Writes The Chinese Room (Code Critique)

Title: "Chinese Room Conundrum"
Author/s: ChatGPT
Language/s: Inform 7
Year/s of development: 2023
Play the game here: https://markcmarino.com/inform/
Code by: ChatGPT-4
Adapted from this article.

Here was my prompt:

Compose a short 1-turn interactive story in the programming language Inform 7 that uses John Searle’s Chinese Room as a setting. The piece should have five different endings. The theme should be machine translation and black boxes, explainability, and AI. The tone should be satirical. Make sure the code will parse.

In the blink of an AI, ChatGPT (4) spat out the code for a complete little interactive fiction; however, the code was invalid in ways that would elude a novice Inform programmer like me. For example, the original code included instructions like, “Instead of talking to John Searle, say,” which should print whatever follows when the player types the command “Talk to Searle.” However, since the code the LLM produced had not yet defined the action “talking to,” it would not compile. Inform has the commands “ask” and “tell,” but not “talk to.” Since I did not know that this action needed to be defined, I too was at a loss. The code was, for all intents and purposes, Chinese.
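The missing piece, which appears in the cleaned-up listing below, is a declaration along these lines (quoted here for reference):

Talking to is an action applying to one visible thing. Understand "talk to [someone]" as talking to.

Only once that action is declared can a rule such as “Instead of talking to John Searle” compile.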

But with the help of a near-native speaker (thanks, Ryan Veeder), I got the code up and running. Here’s the cleaned-up version below.

"Chinese Room Conundrum" by ChatGPT

The Chinese Room is a room.

John Searle is a man in the Chinese Room. The description of John Searle is "John looks perplexed. He's surrounded by Chinese symbols and an instruction book titled 'Understanding Chinese'. He doesn't understand a word of it."

A man can be consulted or not consulted.
John Searle is not consulted.

A mysterious black box is in the Chinese Room. The description of the black box is "It's a black box. No one knows what's inside, but it's labeled 'Machine Translation 5000'."

Instead of examining the black box:
 say "The box hums lightly, almost as if it's mocking John's futile attempts to understand Chinese.";

Talking to is an action applying to one visible thing. Understand "talk to [someone]" or "converse with [someone]" as talking to.

Instead of talking to John Searle:
 say "John sighs, 'I'm just following instructions from this book. I don't understand Chinese, but the instructions help me respond correctly.'";
 now John Searle is consulted.

A button is a device in the Chinese Room. The printed name of the button is "a button labeled Translate". It is switched off. 

Instead of pushing the button:
    say "The room is filled with a mechanical humming. Moments later, a paper rolls out of the black box.";
    move the paper to the Chinese Room.

The paper is a thing. The paper can be read. The description of the paper is "It's a paper with some Chinese characters, below which are their English translations."

Reading is an action applying to one visible thing. Understand "read [something]" as examining;

Instead of examining the paper:
 if the paper is read:
  say "The translations are hilariously wrong. One sentence reads, 'Fish flies over the moon tonight.' Another says, 'John's teapot is the father of dragons.'";
 otherwise:
  say "You chuckle again at the absurd translations.";
  now the paper is read;

Every turn:
 if the turn count is 3:
  say "Suddenly, the black box starts to shake.";
  end the story saying "The box exploded. So much for 'advanced' technology!";
 if the turn count is 4:
  say "A voice emanates from the box: 'Error 404: Chinese not found.'";
  end the story saying "Looks like the box wasn't as smart as it thought.";
 if the paper is read:
  if the turn count is 5:
   say "John breaks into laughter, 'Even with this book, I'm sure I'd make more sense than that box!'";
   end the story saying "A human victory, for now.";
 if John Searle is consulted:
  if the turn count is 5:
   say "John ponders, 'If the box is just following its instructions and I'm just following mine, who's really understanding anything?'";
   end the story saying "Deep thoughts in the Chinese Room.";
 if the turn count is 6:
  say "The box suddenly goes silent. John shrugs, 'Guess it's just another black box mystery.'";
  end the story saying "Some things remain unexplained.";

After examining the Chinese Room:
 end the story saying "The satire is palpable. Machine translation, black boxes, explainability... AI sure has a long way to go.";

In John Searle’s famous thought experiment, The Chinese Room, he poses the following situation:

Imagine a native English speaker who knows no Chinese locked in a room full of boxes of Chinese symbols (a data base) together with a book of instructions for manipulating the symbols (the program). Imagine that people outside the room send in other Chinese symbols which, unknown to the person in the room, are questions in Chinese (the input). And imagine that by following the instructions in the program the man in the room is able to pass out Chinese symbols which are correct answers to the questions (the output). The program enables the person in the room to pass the Turing Test for understanding Chinese but he does not understand a word of Chinese.

In the land of LLMs, we’re all standing outside that Chinese Room, slipping in our little prompts and taking whatever comes out the other end. Except instead of producing perfect Chinese, the man in the box (or the LLM) sends out mostly valid Chinese content. The man is not answering questions so much as replying to the Chinese with amalgamations of Chinese characters that typically follow the ones we passed in. No one would say the man in the room understands Chinese. And, what’s more problematic, we, the ones passing notes in, may not understand Chinese either. Or, in the case of this example, computer source code.

When we ask LLMs to produce computer source code, we now have two Chinese Rooms. One is the LLM that produces computer code when prompted. The other is the computer or compiler (if needed) that processes the code the LLM produced. The prompter, as it turns out, may be just as in the dark as the person in the room producing the code without comprehension. How could this go wrong?

It struck me that Searle’s thought experiment would make a fun interactive fiction piece and that it would be a further ironic twist if I asked an LLM to write the piece for me in a programming language that I have only a tenuous grip on: Inform 7.

If you haven’t encountered it, Inform 7 is (the 7th version of) an English-like programming language created by Graham Nelson. Its syntax reads more like English than that of perhaps any other programming language. It will take:

The Chinese Room is a room.

But it will not take:

The Chinese Room is a big room.

That similarity to English makes Inform 7 fairly easy to read, but only deceptively easy to write. For no sooner does a novice programmer learn they can program in the vernacular than they try a familiar word, only to get the reply, in the parlance of interactive fiction, “That is not a word the language can recognize.”
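To give a sense of what a fix looks like, the sentence about the “big room” would compile only once the adjective has been declared; a minimal sketch (my own addition, not part of the generated game):

A room can be big or small. The Chinese Room is a big room.

Until that first declaration exists, “big” is, as far as Inform is concerned, just another word the language cannot recognize.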

As my thoughts circled this conundrum, I asked ChatGPT to write a work of interactive fiction about the Chinese Room, and to my surprise, it created Inform 7 code. Not valid Inform 7, but Inform 7.

Starter Questions:
What does this code-generating experience tell us about naive (as in, from the standpoint of not knowing the programming language) uses of ChatGPT or another LLM to write code?
Are we in the Chinese Room situation or something else?
Whichever Room we are in, what are the implications for readers of code written via prompts?

Comments

  • I see Searle's thought experiment as being about a problem of scope: where, in a system, does understanding "occur"? If you look too closely, it's clear that the individual parts (Searle, trapped in his own experiment!) don't understand anything by themselves, but they're still an integral part of how understanding is performed by the system.

    I think this ChatGPT take on the experiment reveals another problem in Searle's setup, one that Searle himself doesn't discuss much: the knowledge of whoever (or whatever) is evaluating the output of the Chinese Room. As we're all seeing with ChatGPT now—and as has been a common problem with hyperbolic historical claims about chatbot performance—it is much easier for a computational system to produce a plausible output that looks right to someone unfamiliar with a language but that falls apart the moment it's subjected to real rigor. This isn't unique to machine translation either—the deformations that speakers often make to loanwords taken from another language are a great example. The framing of Searle's experiment presupposes a knowing evaluator, but I think that to consider the social problems posed by LLMs, we have to look outside the room and cancel that presupposition.

  • Yes, @sdibella, excellent point. I suppose in our LLM analogy, we may be not only the ones passing messages in to the person in the room, but also additional people in rooms passing messages out of the... building(?) to another receiver, the machine that processes Chinese, or in this case code. So there may be two layers of non-understanding -- or ignorance -- or perhaps more: first, the lack of understanding of the person inside the room (or the LLM), who has a passing fluency without comprehension; then the variable understanding of the person who requested and then receives the code; and finally the machine that is in some ways 100 percent fluent in code and at the same time altogether ignorant of its meaning.

    Does a wire know what electricity means? Do I know what food means when it enters my body?

    Are there other signs of fluency, cleverness (accidental or not), or ignorance in the Inform 7 code?
