
Tracing “Toxicity” Through Code: Towards a Method of Explainability and Interpretability in Software

Abstract: The ubiquity of digital technologies in citizens’ lives marks a major qualitative shift in which automated decisions taken by algorithms deeply affect the lived experience of ordinary people. But this is not just an action-oriented change, as computational systems can also introduce epistemological transformations in the constitution of concepts and ideas. However, a lack of public understanding of how algorithms work also makes them a source of distrust, especially concerning the way in which they can be used to create frames or channels for social and individual behaviour. This public concern has been magnified by election hacking, social media disinformation, data extractivism, and a sense that Silicon Valley companies are out of control. The wide adoption of algorithms into so many aspects of people’s lives, often without public debate, has meant that algorithms are increasingly seen as mysterious and opaque, when they are not seen as inequitable or biased. Until recently it has been difficult to challenge algorithms or to question their functioning, especially given the wide acceptance that software’s inner workings were incomprehensible, proprietary or secret (cf. open source). Asking why an algorithm did what it did was often not thought particularly interesting outside of a strictly programming context. This has meant a widening explanatory gap in relation to understanding algorithms and their effect on people’s lived experiences. This paper argues that Critical Code Studies offers a novel field for developing theoretical and code-epistemological practices to reflect on the explanatory deficit in modern societies arising from their reliance on information technologies. The challenge of new forms of social obscurity introduced by technical systems is heightened by the example of machine learning systems that have emerged in the past decade. A key methodological contribution of this paper is to show how concept formation, in this case of the notion of “toxicity,” can be traced through key categories and classifications deployed in code structures (e.g. modularity and the layering of software), but also how these classifications can appear more stable than they actually are through the tendency of software layers to obscure even as they reveal. How a concept such as “toxicity” can be constituted through code and discourse and then used unproblematically is revealing both in relation to its technical deployment and for a possible computational sociology of knowledge. By developing a broadened notion of explainability, this paper argues that critical code studies can make important theoretical, code-epistemological and methodological contributions to digital humanities, computer science and related disciplines.

Read the paper here

Questions:

  1. This paper sets out a new research programme in digital humanities and critical code studies concerned with concept formation and with how extra-discursive, prescriptive elements from software (and para-code), rather than purely discursive ones, can be mobilised to stabilise it. What other examples of similar approaches are seen in the literature or in technical work?
  2. Tracing as a method in critical code studies seems to me to have great potential for traversing the code-domain and the social (the socio-code) in order to understand the importance of the interrelation of the two. What other tools can we bring to examining the socio-code for this type of code/concept analysis that moves between social and algorithmic levels of description? (Here I am thinking of techniques such as Entity Relationship Diagrams, UML, and other techniques; a minimal sketch of one possible form follows this list.)
  3. One of the problems with this method is that it requires the management of multiple levels of discourse and code and their interdependence. As this was a relatively scoped analysis, juggling the levels was relatively straightforward, but it would be good to be able to use some form of digital asset management, code management tool, IDE or other technique to assist with the process. Perhaps something like NVivo might assist with this? Do people have experience of managing multiple, inter-layered and inter-textual document analysis that might help with this?
  4. The Concept Lab https://concept-lab.lib.cam.ac.uk (University of Cambridge) studied the architectures of conceptual forms in discourse. Are people familiar with this approach to concept mapping, and how can approaches such as these be incorporated into critical code studies?
  5. This approach to code/concept critique is clearly much more easily undertaken in open source work, but what are the methods we can use in proprietary systems?
  6. Does this approach result in a tendency towards a pragmatic mode of analysis within a specific problem situation, one which may just reflect an instrumental approach to software rather than genuine insight? How does one connect these case studies to more generalisable situations, perhaps to questions of common-sense assumptions and/or embedded values and norms?
  7. There are obvious links to questions raised by the notions of explainability and interpretability in software, AI and automated decision systems. Does critical code studies have implications for the debates over explainability vs interpretability and what might the assumptions built into explainability mean for critical code studies?
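
(As a minimal sketch of what question 2 has in mind, and not something taken from the paper: the following shows one way a socio-code trace might be prototyped in Python with the networkx library. Every node name, attribute and relation below is a hypothetical illustration.)

```python
# Sketch of a "socio-code" traceability graph: nodes are discursive sources,
# para-code or code artifacts; edges record how a concept such as "toxicity"
# is mobilised and stabilised across levels of description.
# All names here are hypothetical illustrations, not taken from any codebase.
import networkx as nx

G = nx.DiGraph()

# Discursive and para-code level: where the concept is articulated in prose.
G.add_node("guidelines:community-standards", level="discourse")
G.add_node("docs:api-reference", level="para-code")

# Code and data level: where the concept is hardened into classifications.
G.add_node("data:labels.csv['toxic']", level="data")
G.add_node("const:TOXICITY_THRESHOLD = 0.7", level="code")
G.add_node("class:ToxicityClassifier", level="code")

# Edges trace how one artifact operationalises or parameterises another.
G.add_edge("guidelines:community-standards", "data:labels.csv['toxic']",
           relation="operationalised-as")
G.add_edge("data:labels.csv['toxic']", "class:ToxicityClassifier",
           relation="trained-on")
G.add_edge("docs:api-reference", "const:TOXICITY_THRESHOLD = 0.7",
           relation="documents")
G.add_edge("const:TOXICITY_THRESHOLD = 0.7", "class:ToxicityClassifier",
           relation="parameterises")

# Walk backwards from a code artifact to everything upstream that shaped it.
for upstream in nx.ancestors(G, "class:ToxicityClassifier"):
    print(upstream, "->", "class:ToxicityClassifier")
```

Exported to DOT/Graphviz, a graph like this reads much like an informal ERD, but with the discursive and para-code levels included alongside the code.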

Comments

  • This is a response to your Points 1 and 2. Unfortunately, I have never gone as far as developing the critique of code you develop in your paper, which I think is exemplary.
    However, if I have understood your paper correctly (and @davidmberry please correct me if I am wrong!), I agree with your argument that to understand AI/ML both interpretability and explainability need to be considered.
    In more ‘classical’ CCS, the focus was on the interpretation of code not in a functional way (as a programmer would understand it) but in a socio-cultural way (closer to the humanities). That at least was the premise I operated upon. AI/ML seems to have destabilised this distinction, because a big part of understanding AI/ML systems demands an understanding of why/how a system does what it does. Explaining this is not separable from the discussion (and the challenging) of the social and political aspects of code (most importantly, its social interventions and the harms it causes).
    I also share your wariness toward XAI, as XAI is just another layer of mediation; and although our access to code is always unavoidably mediated, XAI can easily provide another level of (machine-generated) opacity.
    Finally, I would also emphasize that, perhaps more than ever before, it is important to have an expanded notion of AI/ML systems, since they do function in a networked way. Development and deployment (the latter in particular) cannot be disregarded when understanding code. They are part and parcel of how code works and of its social functioning. I might be wrong, but the examples I have analysed recently were impossible to read (proprietary algorithms), and yet observation of the observable (in my case, some technical documents in the form of patents acquired by the developer, which were not intended to make the system explainable but simply to establish its ownership, plus some deployment scenarios) sheds light both on their functioning and on their politics. In other words, the social and the technical seem to be less separable than ever in ML.
    For example, in the case of the predictive policing system developed by Dataminr, a patent submitted by the company in 2016, ACLU documents released between 2016 and 2022, and blog/promotional materials published by Dataminr itself functioned as explanatory texts by proxy (what you would name paratextual elements, as well as elements to be found upstream, or downstream, from code?)
    https://www.aclu.org/issues/racial-justice/protectblackdissent-campaign-end-surveillance-black-activists, and
    https://patentimages.storage.googleapis.com/8b/11/5f/685bb0f2d18b2f/US9323826.pdf. I could not go beyond this.
    This is not to say that close reading of code should not be attempted, along with the analysis of weights, the analysis of the labelling of datasets, etc. Quite the contrary. It just goes to show how inextricable explanation and interpretation have become in ML. So I think it reinforces the premises of your paper and what you say in this post.
    If I try to derive from your paper a notion of “toxicity” that works for this example, I think it is to be found in the way the system performs when it is trained for and deployed by law enforcement. The developer refuses to demo the system to clients unless there is a basic contract in place (by which the client commits to using the system for a year if it does what they require it to do). A basically trained ML is then trained for that client, allowing the client to write their biases into the ML, so to speak. This process can well last a year, so that training merges into deployment through a process of prompt engineering, data gathering via “live” systems (sensors, body cams, CCTVs, IoT, social media posts and other sources we do not know about), reading of system output as acceptable/actionable, actioning of those outputs, and feeding the results of the action back into the system. This does not mean that the client knows how the ML works, but they acquire a basic working knowledge of how to interact with its ultimate output (say, a series of alerts that require dispatching the fire brigade or the police to certain locations). The toxicity of code is embedded at every level of this process. This could be a method to get “as close as possible” to reading code when close reading is impossible. Perhaps we should develop notions of “closer reading” and have different degrees of “closeness” of reading.
    I think that your analysis of TTMC in this paper is super important because it shows how to detect the process of construction of “toxicity” in specific (open) source code. For me, it unpacks the idea that “bias” is something identifiable that can simply be detected and extracted. On the contrary, it shows how an entirely contingent conceptual formation of what is deemed “toxic” is “hardened” by code into something that can be treated as factual and used to reinforce technical solutionism (which is how the industry operates).

  • I found this approach of going beyond "bite-sized" code, into larger, more operational code-bases and products super interesting!

    In general, one issue I've had in my work has been to disentangle what is the software engineer's responsibility and what is the product manager's responsibility, and how we can find traces of one in the other's work. The problem I've found is that such influences are rarely unidirectional, as they mutually inform each other in usually subtle ways (my experience of programming applications for artists often results in a cobbling together of artistic desires and technical limitations). As implementation occurs, the ideal concept is always altered.

    To your second point, I really enjoyed this paper on the concept of the human and the object in The Sims (link). Without having access to the source code (which has since been made available), the author combines comparative studies, discourse analysis and technical concept mapping to reverse engineer the ontology of the individual in that game. I'd be curious to see what a code study of The Sims could add to complement such research.

    Thinking of reverse-engineering, maybe reverse-conceptualization might be an interesting way to think about it? Using some of the same techniques to highlight the what of a system, rather than just its how?

    As for tools, I've heard of people using Open Semantic Search to dig through code bases. I personally find that an IDE such as VS Code can be incredibly helpful, but can also be hindered by the sheer size of some codebases, so it might be interesting to think about different kinds of tools operating at different kinds of levels? But I'm still unsure where and how one establishes points of relation/articulation between those different layers.

    Re: the concept lab, I heard of this a lot in the context of computational social sciences/political sciences to map out the landscape of public discourse. In this field, they're all super fans of graphs! But I wonder how well such a technique can apply to a code base, which has much more non-linear execution and unintuitive connections between routines and subroutines. As per above, it's probably an interesting tool to make a first pass, or high-level map, of how files/modules/packages are interconnected within the same software. But in terms of concept mapping, it seems to me a lot more important to find out which parts of the code are the classes/type definitions/data modelling in order to then deduce such a graph, rather than, say, just a backtrace of function calls.
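
    To make that first pass concrete, here is a rough sketch of my own (not a reference implementation) that pulls class definitions and simple name assignments out of a Python codebase so they can seed a concept graph, rather than a backtrace of function calls. The repository path and the search term are placeholders.

    ```python
    # Extract class definitions and simple name assignments so that the
    # data-modelling parts of a codebase can seed a concept map.
    import ast
    import pathlib

    def definitions(repo_root: str):
        """Yield (file, kind, name) for class defs and simple name assignments."""
        for path in pathlib.Path(repo_root).rglob("*.py"):
            try:
                tree = ast.parse(path.read_text(encoding="utf-8"))
            except (SyntaxError, UnicodeDecodeError):
                continue  # skip files that will not parse
            for node in ast.walk(tree):
                if isinstance(node, ast.ClassDef):
                    yield (str(path), "class", node.name)
                elif isinstance(node, ast.Assign):
                    for target in node.targets:
                        if isinstance(target, ast.Name):
                            yield (str(path), "assignment", target.id)

    # Example: list every definition whose name mentions a concept of interest.
    for file, kind, name in definitions("."):
        if "toxic" in name.lower():
            print(kind, name, "in", file)
    ```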

    And for #6, it would be super interesting to develop a taxonomy/classification of things that are given particular effect or presence through their implementation as algorithms, with a particular degree of "intensity" ranging from "simple quantitative automation" to "definite qualitative leap"; in a sense, a bit like an "art history" or "reference book" of a mechanology of software, like Bernhard Rieder's work.

    My own question about proprietary/open source is really about scale. How do we even begin to audit systems that might be gigabytes of text files? What's the practical methodology for this?

  • edited February 28

    @davidmberry thanks for these great questions regarding your article on tracing a concept through its algorithmic circulations. In the context of this working group, I was struck by part of #2:

    What other tools can we bring to examining the socio-code for this type of code/concept analysis that moves between social and algorithmic levels of description? (Here I am thinking of techniques such as Entity Relationship Diagrams, UML, and other techniques).

    I would be curious to learn what you think about the code critique from @samgoree on What makes an image AI-generated? Patent diagrams for a computational Bokeh effect (code critique) and its use of patent diagrams. How might this kind of diagrammatic tracing of algorithms in a common cultural form like the patent either fit in with or contrast with your thoughts about tracing using e.g. ERDs or UML? Is this also part of a family of approaches that might fit with your question #5 about "methods we can use in proprietary systems"?

  • edited March 1

    @jeremydouglass these are interesting questions and I'll have a think about it.

    One point that strikes me immediately is that patent diagrams are often designed to be subtly "wrong" to prevent competitors from getting a leg up on the innovation (so-called strategic obfuscation).

    Although not illegal, most patents avoid detailing the core invention: what really is the improvement over the prior art. Likewise, some patent attorneys will have walked the tightrope between the patentability disclosure requirement and client interest in keeping certain trade secrets. https://patentlyo.com/patent/2021/05/obfuscation-and-patenting.html

    I suspect that this would also be the case here, and it would be quite an amazing code critique if such a subtle misdirection technique in the diagrams were detected :smile:

  • edited March 1

    @pierre_d_ thanks for your comments.

    My own question about proprietary/open source is really about scale. How do we even begin to audit systems that might be gigabytes of text files? What's the practical methodology for this?

    I suspect that (as documented elsewhere on this forum) there will be a growing use of LLMs to map this text so that it can be negotiated like a maze via text prompts or some form of simplified explainability-like visual interface. See https://wg.criticalcodestudies.com/index.php?p=/discussion/157/ai-and-critical-code-studies-main-thread#latest

    The restriction of course is the context limit of 32,768 tokens (GPT-4). Anthropic has announced a 100k context window (around 75K words) https://anthropic.com/news/100k-context-windows and there are rumours that GPT-5 will have a 128k context window.

    Still not quite 1 gigabyte, but perhaps some form of sharding of the dataset will be developed so it could be distributed somehow over the GPTs?
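
    As a purely speculative sketch of what that sharding might look like: chunk the source files to fit a context window, summarise each chunk, then summarise the summaries (a map-reduce pattern). The summarise() function below is a stand-in for whatever model call is actually used, and the characters-per-token figure is only a rough heuristic.

    ```python
    # Speculative sketch: shard a large codebase into context-window-sized
    # chunks, then summarise hierarchically. Not tied to any particular model.
    import pathlib

    CONTEXT_TOKENS = 32_768            # e.g. the GPT-4 limit mentioned above
    CHARS_PER_TOKEN = 4                # crude approximation
    BUDGET = CONTEXT_TOKENS * CHARS_PER_TOKEN // 2   # leave room for the prompt

    def summarise(text: str) -> str:
        """Stand-in for an LLM call; replace with whatever model/API is used."""
        return text[:200]              # placeholder so the sketch runs end-to-end

    def shards(repo_root: str):
        """Group source files into chunks that fit the character budget."""
        chunk, size = [], 0
        for path in sorted(pathlib.Path(repo_root).rglob("*.py")):
            text = path.read_text(encoding="utf-8", errors="ignore")
            if size + len(text) > BUDGET and chunk:
                yield "\n".join(chunk)
                chunk, size = [], 0
            chunk.append(f"# FILE: {path}\n{text}")
            size += len(text)
        if chunk:
            yield "\n".join(chunk)

    # Map-reduce style: per-shard summaries, then a summary of the summaries.
    shard_summaries = [summarise(s) for s in shards(".")]
    overview = summarise("\n\n".join(shard_summaries))
    print(overview)
    ```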

    One other option is to train Mistral or a similar model on the code-base? Although that would be a costly, slow way to do a code critique and probably wouldn't necessarily help with this problem.

  • @davidmberry

    Hmm, yes the LLM approach seems to be the most convenient one, but I immediately wonder about the methodological/epistemological questions that arise from such a use. I guess that including it as a mixed-method approach would be the most reliable for now.

    Have you seen examples of LLMs trained on very specific corpora in an academic context? Or is it mostly about fine-tuning existing models?
