Abstract: Code is an epistemic system predicated on the repression of state, but with the rise of global optimization and machine learning algorithms, code functions just as much to obscure knowledge as to reveal it. Code is constructed in response to two characteristics of the twentieth century episteme. First, knowledge is represented as a process. Second, this representation must be sufficient, such that its meaning is constituted by the representational form itself. In attempting to meet these requirements, process is separated into an essential part, code, and an inessential part, state. Although code has a relationship with state, in order to construct code as an epistemic object, state is limited and suppressed. This construction begins with the first formation of code in the 1940s and reaches its modern form in the structured programming movement of the later 1960s. But now, with the growing prominence of global optimization and machine learning algorithms, it is becoming apparent that state is vitally important, yet our tools for understanding it are inadequate. This inadequacy nevertheless serves certain interests, which make use of this unclarity to act irresponsibly while obscuring their ability to determine the behavior of their software.
Question: The bulk of the article is about the evolution of code and the way this evolution was motivated by the suppression of state, in order to make code a better epistemic object. It ends on a bit of a dismal note: with machine learning algorithms, we get a peculiar epistemic inversion, where the state of the machine (all the learned weights on the neural network, for example) holds pretty much all the interesting parts, but state is precisely what has been strategically ignored in order to develop the very powerful epistemic tool that is code. The big million dollar question: now that we've spent 70 years developing our epistemic tools in the exact opposite direction, can we make heads or tails of all of these weights? If we construe critical code studies very narrowly, this is not a critical code studies question—weights are not code, they're not written by humans, and they're not even really read by humans (yet!). But on the other hand, this thing that we do—look at a technical symbolic language and extract a more complete meaning than a compiler does, than many coders are aware of, etc.—this thing looks kinda similar to the problem of making sense of a trained system. Further, trying to read weights looks a lot like Mark Marino's original proposal for CCS: software criticism is great, but we can't ignore the source code -> LLM criticism or training data criticism is great, but we can't ignore the weights.
So my big question for the group is: what now? When the weights and other state data might be more important than the code, what is the role of reading code? Should we/could we expand code studies methods to look at the state of machine learning algorithms, the weights or other data that make a given algorithm work?
Comments
Maybe this is a newb question, but... wouldn't we get into trouble if we tried to include weights more in our reading? Like, those tend to be blackboxed, right? Seems like anyone who gets working knowledge of that stuff (at least for the big/interesting sites) has to sign a non-disclosure...
This has me thinking about chapter 8 in Andrew Kennis's book, Digital-Age Resistance... It's kinda wild how net neutrality has fallen in so many countries around the world.
In countries that still have net neutrality, algorithm weights seem to be used as a "soft" way to overcome (without overturning) net neutrality. I wonder what a social movement for open algorithms might look like?
Black boxing is definitely one of the issues to think about. When critical code studies was just getting started, there were a lot of similar concerns about what we do. The vast majority of end-user software is closed source. As time has gone on, seems like the issue has all but evaporated. I mean, yeah, nobody can do research using an unavailable object, but that's true everywhere. You have to use available objects and make inferences. In the case of code, a lot of internet and web code is open source, and this has definitely helped out. But also, I think a lot of our readings of code have ended up telling us a lot about code and coding in general, which then gives us insight into closed-source programs as well. One would imagine that trying to get some kind of insight from "reading" the weights and such of an ML model would be just as translatable.
So I'm fairly confident that this can be overcome. But it's still definitely a question that's worth addressing in this specific domain. There are some open-source already-trained models, for sure. But maybe they are so much worse as to be different in kind, rather than just degree, from closed models like ChatGPT. There's also the issue that OpenAI is promoting an AI-as-service model, and this centralization of a closed model might make it disproportionately important.
But overall, my hunch is that there are plenty of interesting open models, especially given the seemingly more integral role that academic research is playing in the development of AI at this stage.
Considering that code is always essentially mediated, and that the representation of state (I am assuming you mean state in terms of a sort of internalised finite-state machine) was always already obscured by the action of a program in execution, unless a debugger or some other mechanism was used to make state visible, is this really any different for machine learning?
Perhaps we just lack the right kind of "debugger" to translate internal weights into human-readable state? There has been some work on relating word vectors to a representational form from which "state" in some sense can be read from the model and I suspect that many other new techniques will emerge (e.g. heat maps are already being deployed).
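To make the "debugger" analogy concrete, here is a minimal sketch, with everything in it invented for illustration: a toy bag-of-words scorer whose learned weights can be read back, word by word, as a crude heat map over the input.

```python
# A toy "weights debugger," not any real tool: the model is a
# bag-of-words scorer, so each learned weight can be read back
# as a per-word contribution to the output.
import numpy as np

vocab = ["good", "bad", "service", "slow"]   # hypothetical vocabulary
weights = np.array([1.8, -2.1, 0.2, -0.9])   # hypothetical learned state

def explain(text):
    counts = np.array([text.split().count(w) for w in vocab])
    contributions = counts * weights          # a crude per-word heat map
    for word, c in zip(vocab, contributions):
        if c != 0:
            print(f"{word:>8}: {c:+.2f}")
    print(f"   total: {contributions.sum():+.2f}")

explain("good service but slow slow")
```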
So I'm very interested in what you mean by epistemology, presuming that epistemology is linked to a knowing subject (Popper, of course, argues for epistemology without a knowing subject).
@davidmberry Yes, code and state are not really related differently for machine learning, at least not in terms of the way that code obscures state. What seems different to me is just the relative importance. The idea that code contains the entire process within its text was always pure fantasy—people have been pointing that out for decades. But I think pre-ML this fantasy has been very epistemically productive—it helped us understand and theorize about the task of computation. With ML, it's not that code is exactly _un_productive, but there's a huge anxiety about the computational behavior of things that are not code. Another way we might think about what's different is just in terms of what the epistemic actors want. Pre-ML, computer science moved to construct code as an epistemic object, and to abstract away as much state as possible. Post-ML, CS is deeply concerned about how to theorize what all these different weights and models are doing. Part of what this paper is doing is just explaining how we got to that conjuncture, how we ended up in a place where it seems like the epistemic tools we have aren't well matched to the problems we face.
The debugger analogy seems extremely useful. I guess I would want to expand that, asking something like: what would a debugger look like if programmers made things happen with debuggers more often than with code? Like, let's suppose that we had a better debugger that let us "read" the weights of a model, or the way those weights took effect during an action, or something like that. Well, if we can understand and access them, probably we might write as well as read? And what would this whole code-state business look like if we wrote more state than we wrote code? Seems to me like that distinction is likely to reconfigure itself, although other things are possible. I mean, how long until we have ML analyzing ML algorithms and producing a weighted network that takes on the learning task itself? At that point there might be no code at all.
Re: epistemology, I think I'm thinking about social subjects more than individual subjects. There is definitely something interesting there—the amount of time an individual programmer uses a debugger is likely completely disproportionate to the amount of time computer science as a whole has spent on debuggers.
This will sound like a very marginal comment, but thank you @ebuswell for this contribution. I am attracted to one term you use: this conjuncture, "how we ended up in a place where it seems like the epistemic tools we have aren't well matched to the problems we face."
It is indeed a very specific conjuncture we have arrived at with AI/ML (in Stuart Hall's sense, i.e. in a cultural studies sense). Perhaps readings of code need to always be conjunctural, i.e. related to a specific conjuncture of state, history, power, and society at that moment. This moment seems to open up a crisis and an opportunity, as in S. Hall's definition of conjuncture. How we respond to this crisis will undoubtedly have strong political implications.
I'm probably on the wrong path but I'm just trying to understand...
I think of models and weights as data. Data are not different from code (code is data for the interpreter). In logic programming it is difficult to tell what is code (rules) and what is data (facts). So, from this point of view, CCS should take models as an object of analysis.
But "state" makes me think to functional programming, in which running the program (possibly) do not change the final internal state; or, better said, programmers try to not use states properties so as to be sure that every run gives exactly the same result, like in mathematics. But while functional programming is a real paradigm; while pure functional languages exist, while some programmers always try to adopt a functional style, this clear separation is just an ideal, since we abstract from resources availability, network collision, bugs, hackers saying hello, and so on. We could ensure that internal state never changes, but external state does continuously and our programs live in the middle.
These constraints (in short: the world outside) are also the true reason why we are interested in writing code. Would we even write code in heaven, without friction?
So state is a limit-concept, but a crucial one. It is true that programmers spend a lot of their daytime thinking of process as separated from state; but at night they know that state is always part of the game. Any experienced programmer has learnt to ask "what if..?" and tries to imagine every possible exception to throw.
Let's take it to the epistemological level. Code as a knowledge representation can't be thought of as something pure, immaterial, immutable, having all of its meaning inside itself. Neither science nor literature is separated from (their) states, and the meaning of concepts is to be found in the operations by which those concepts are applied, in real situations.
So: yes, also from this point of view, CCS should try to take weights into account.
Am I wrong?
@ranjodh the relation between code/state (on a computer) and code/state (in a government) is definitely the sort of productive metaphor I would expect to come from you—btw a lot of this ends up looking at the address within code in a parallel but sympathetic way to your work; most of the places where state refuses to be banished from code are addresses, in the evolving meaning of that word. Anyway, I am intrigued by the prospect of code(law)/state, but also wary. There are usually things we can learn from the same/similar concepts in radically different domains, but I'm not entirely sure that "state" is even the same concept in both these places, and not just the same set of letters and sounds. However, I think code is, if not quite the same, a very similar concept, and this is part of the background that didn't quite make it into this article. The epistemology of code does not come into being just because we like processes or because processes are inherently easier to think about, but because our social relationships starting around the 1930s were increasingly enacted (and therefore conceived) as a system which was justified by procedure, rather than by abstract structure, personal relations, etc. That is, of all the many different ways that lawlike interpersonal relationships have been historically justified, we started doing it through a sort of last-instance procedurality: the answer will eventually be right because the system does the right things. This manifests in computer science as code, in linguistics as Chomskyism, etc. In computer science, I tend to think of state as a pure negation—everything that isn't code. I do think there's a relation between code as law and code in computation, but I'm not sure that I think state (government) has the same kind of negative relationship to law.
@Federica.Frabetti not at all marginal. I think this is absolutely a conjunctural moment. What that means for politics is a huge question. I think we have a tendency to think of moments of transition as less externally defined than more steady-state moments. I'm neither sure that's correct, nor do I think it's necessarily incorrect. In the abstract, a dialectical theory of history has the content of a transition as no less determined than a static relationship. In this particular case, it really does feel like there's this gigantic empty epistemic space, and the establishment might accept all sorts of useful things we create to throw at it. Something like a weakened immune system. But at the same time, I think we'd better consider the structure of the transition itself in order to be effective, and this is something much harder to see, something which we don't have as many conceptual tools to deal with. Part of the reason for that is the way transitions do (or rather don't) figure in all the students and grandstudents of Althusser, which makes cultural thinkers like Hall who work outside of that paradigm a great place to start.
@Stefano This sort of thing always comes up when looking at ideology, since almost inevitably the author (me in this case) gets a little too loose about clarifying what is really happening vs what people think about what's happening, what I think is good for knowledge vs how knowledge is actually getting made, etc. So thank you for the question, which I think will help me articulate this. I think we're mostly on the same page.
First, yes, absolutely, there is no code without state. And in an Eckert/Von Neumann style architecture, code and data are interchangeable. Even in a modern Harvard style architecture, where code and data come from different buses, pieces of code refer to each other and when they do this, there is a certain interchangeability between code and state. So code might "be" data, and state might "be" data, and by the transitive property code and state would be the same. But this doesn't really reflect how we think about them, and definitely doesn't reflect how we build our programming languages.
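A toy illustration of that interchangeability, with an instruction set made up for the occasion: the "program" is an ordinary Python list (data), and one of its instructions rewrites another before it runs.

```python
# Sketch of code/data interchangeability: the program below is just a
# list, and the interpreter happily rewrites it while running it.
program = [
    ("load", 10),          # 0: acc = 10
    ("store_instr", 2),    # 1: overwrite instruction 2 with ("add", acc)
    ("halt", None),        # 2: replaced at runtime, never executes as written
    ("print", None),       # 3: print acc
]

acc = 0
for pc in range(len(program)):
    op, arg = program[pc]
    if op == "load":
        acc = arg
    elif op == "add":
        acc += arg
    elif op == "store_instr":
        program[arg] = ("add", acc)   # writing code exactly as one writes data
    elif op == "halt":
        break
    elif op == "print":
        print(acc)                    # prints 20, which the program text alone never says
```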
This coexistence of code and state ends up being a big problem for computer science. On the one hand, we have this idea that there's this thing called code, and when you write out the code, you are specifying exactly what's going to happen. But on the other hand, any sufficiently complex code makes use of a temporally evolving state which, although of course it comes into being largely through the execution of that code, is not actually represented in that code. Not being represented, it is not part of the epistemology of code. So on the one hand, there's a pressure for code to represent more, so that we can more clearly reason about this mutating state. On the other hand, the less state that code represents, the more code-like (and the more abstract) code is able to be. That is, representing less state helps us reason about code as code. Both these forces end up shaping what code is. State is simultaneously incorporated and hidden. To make it clear, this whole paragraph is about how calculation in the abstract (if there is such a thing) is actually represented in our particular social world in order to give us knowledge about calculation. We don't have to think about it this way, and I think, as you point out, there's a sort of knack that programmers have that doesn't always think about it in exactly this way. But, then, the relationship between knackiness and knowledge is a whole fascinating can of worms.
Since you bring it up, "pure" functional programming languages are a great place to see this in action. On the one hand, there's an anxiety—mutable state is mucking up our ability to think about code as code, and if we could just get rid of it we could reason about code better. Solution: get rid of state, by getting rid of mutable data. Well, it turns out that actually the key place that state hides in code is in the flow of instructions, and functional languages hide and obscure this flow of control even more, without in any way removing it. Lexical relationships try to substitute for control flow relationships, but they never completely succeed in doing so.
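A sketch of where the state goes in that substitution (the counting example is invented): the mutable variable disappears from the text, but the same evolving value is still carried along, now in the argument position of each call.

```python
# Sketch: "pure" functional style does not remove the accumulator,
# it relocates it from a variable into the flow of calls.
from functools import reduce

def count_mutable(items):
    total = 0
    for x in items:       # state: total changes on every iteration
        total += x
    return total

def count_pure(items, total=0):
    if not items:
        return total
    # No assignment anywhere, yet "total" still evolves step by step;
    # the state now lives in the sequence of calls rather than in a variable.
    return count_pure(items[1:], total + items[0])

print(count_mutable([1, 2, 3]), count_pure([1, 2, 3]))
print(reduce(lambda acc, x: acc + x, [1, 2, 3], 0))  # same hidden accumulator
```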
@ranjodh I also read this as state and 'the state' - as a form of control that is in service of the state, not necessarily the human or the citizen (been reading a lot of Bookchin recently) - and state, especially as a matrix of weights that represents a neural network, is a form of control - how is it a form of control that is different from the instruction set of more traditional (heuristic or decision tree) algorithms?
The thing that strikes me about state is that it is history - code becomes historical - perhaps this brings it into all kinds of Hegelian/Marxist frameworks ... But what I think of is Benjamin's Theses on History - there is a state after which everything will change, we have the model, the messiah comes.
From state to state the regime changes, perhaps the episteme changes - state is perhaps violence - or the origin of state -
@ebuswell I'm a bit of a newcomer to this group and this topic so please forgive any naïveté. My thinking about the topic you raise is to first clarify how I understand the terms code and state, and hopefully in the way you intend them. I hope I am addressing the topics you raise. At the very least, thank you for provoking my own thoughts.
Most (all?) industrial code is an attempt to model a human process or set of processes (e.g. depositing money; editing an email) with a set of formulas that are a kind of compromise between or synthesis of the following entities and relations:
The program begins as a set of natural language requirements or, even less formally, as cocktail napkin wishes, goes through an implementation that meets those requirements, and "ends up" as the system that executes them and the user manual describing them. There is a process of realization that, no matter how messy, is rightly lived as a challenge fulfilled.
The state that passes through these models is like the set of Real number values that can inhabit the variables of an equation. Except that the values taken on by the variables are more often than not (empty) symbols (enums) as stand-ins for 'balanced', 'happy', 'sad' etc., or whatever other states, now understood as 'states of mind', i.e. the socio-psychological states that our culture 'provisions' us with and that make up our psycho-social reality and identity. The formulas of the program are the model of the particular actualization of the cultural process (withdrawing money, liking a post, sending an email).
Machine learning and LLMs dissolve the synthesis and compromises of the models and modeling into stochastic relationships between sub-atomic ngrams. These are the weights you refer to, yes? The weights are links of a vast network that embody at once the thoughtlessness of stupidity and the mechanicalness of the most formulaic truths and systems of proposition. The most obviously pernicious aspect of these weights are the biases (e.g. racism and sexism) they embody. Please let us also acknowledge that the 'art' it produces (at least what I've seen) is just plain vulgar and ugly. The weights would also include -- since sucked up from the web -- all forms of false consciousness. Actually, its medium of weights is probably false consciousness itself.
I find this proposal really fascinating. At the risk of understanding incorrectly, and as a provocation, I would like to challenge some of the premises to try to answer your question. First I would like to challenge the idea that "code is predicated on the repression of state." If I understand correctly, by state you mean a particular arrangement of elements within a specific moment in time, which, in terms of the code, would only exist during runtime. In this sense, I would argue that, although the state is concretized only during execution, it is exactly its prefiguration that gives structure, to a large extent, to the code. Thus, for example, beyond the entry point of any language, execution time is always regulated by the different state machines that are built on the conditional or iterative structures of the program, and that are prefigured as possible scenarios within the code.
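For instance, a minimal sketch of such a prefigured state machine (the states and events are invented): every state the program can be in at runtime is already written down as a branch of the code.

```python
# Sketch: the states only exist during execution, but the code
# enumerates them all in advance as conditional branches.
def run(events):
    state = "logged_out"
    for event in events:
        if state == "logged_out" and event == "login":
            state = "logged_in"
        elif state == "logged_in" and event == "logout":
            state = "logged_out"
        elif state == "logged_in" and event == "purchase":
            state = "checkout"
        elif state == "checkout" and event == "pay":
            state = "logged_in"
        # any other event: state unchanged
    return state

print(run(["login", "purchase", "pay", "logout"]))  # -> "logged_out"
```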
The second premise I want to challenge is the notion that "code is an epistemic system," and in this case I would go further and claim that it is an ontological system. In this sense, what I am trying to argue is that before even framing and defining knowledge, code itself proposes an ontological relationship with time, space, and objects. Two clear examples are runtime and class construction. In the case of the first, as I argued in the previous paragraph, at least in structured programming paradigms, time is represented as a succession of events. It is from these events and state changes that we can understand the flow of the program and verify its execution. In the case of classes, clearly the very notions of object, property, and method already point to a particular ontological interpretation that stabilizes the world into specific categories and that creates hierarchies and evolutionary lineages around those categories. Thus, the epistemological dimension of code, the framing of knowledge, is nothing more than the effect of an ontological arrangement. And I would further assert that it is these ontological effects that have the most profound effect on the real world.
Finally, answering your question about the weights: I would say that your question hits the nail on the head of the difference between two paradigms around the creation of artificial intelligence. The first sought to emulate higher cognitive functions through logical structures and objects sufficient for the machine to "understand the world," and the second sought to emulate basic cognitive functions through adaptation processes. The change between the two paradigms is not only epistemological, it is ontological. While in the first we could say that we see a Bergsonian time, which is built by the succession of events, in the second we see a time that is understood fluidly and that can be stopped at any step. On the other hand, it is an ontological change in that, at least ideally, there is no attempt to define the world around predefined categories; rather, the world is understood as a flow of information from which the machine has to learn to infer categories by itself, and from which its generative capacity is derived. Even the difference between machine learning techniques that depend on labels and new forms of deep learning where inferences are based on less structured information is a profound ontological leap in this regard.
In this sense, contrary to what @Stefano proposes, I would not say that weights are data, since in fact they are the result of massive information processing, but I would not say either that weights are code. Instead, I would say that weights are snapshots of the state of a machine that is defined by its evolutionary adaptation to the information it receives, where the code does not seek to give structure to the machine, but only defines the way in which information must be processed. In this sense, this notion of code is closer to the metaphor of DNA as a code (in itself very problematic), since it defines the metabolic processes of the organism, but not the individual concretion at each moment in time. To get the metaphors out of the way: while AI is nothing more than a sophisticated and complex statistical equation, the code defines the terms of the equation, and the weights are the numerical representation of the resulting statistical curve, according to the information processed at a given moment.
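To put that last formulation in the plainest terms, a sketch with toy data and a toy model: the code fixes the form of the equation, and the weights are whatever snapshot falls out of fitting it to the information it receives.

```python
# Sketch of the equation/weights split: the code fixes the form of
# the model (a line, y = w*x + b); the weights w and b are a snapshot
# of state produced by fitting to whatever data came in.
import numpy as np

def predict(x, w, b):        # "code": the terms of the equation
    return w * x + b

x = np.array([0.0, 1.0, 2.0, 3.0])   # toy data the model adapts to
y = np.array([1.0, 3.0, 5.0, 7.0])

w, b = 0.0, 0.0              # "state": the weights, before learning
for _ in range(2000):        # crude gradient descent
    err = predict(x, w, b) - y
    w -= 0.01 * (err * x).mean()
    b -= 0.01 * err.mean()

print(round(w, 2), round(b, 2))   # roughly 2.0 and 1.0: a snapshot of the data
```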
@leonardoaranda you are challenging more than just these premises. I'm very intrigued by your reflections.
If I understand you well, you are saying that:
1. there are data, which are all the information accessible to the machine (at a certain moment and in a certain respect, I would like to add)
2. there is code, which is a definition of the way in which data should be processed
3. there is state (weights), which is a snapshot of the information being processed at a certain time.
But since weights are used to change the future of the process, they act as new data. In other words, they are data coming from inside and not from outside. They share with classic data the dependency on time (whilst the code, except for genetic algorithms, stays unchanged).
To come back to @ebuswell's provocation, the epistemic status (sorry) of code should be that of staying unchanged during the evolution of the machine. Code has intentionality, data doesn't; weights are the next generation of data (to speak in evolutionary terms), better suited to the environment. Code is form; data is matter, and it's subject to time.
This wonderful story always has a crack, in my view: this is what this society is narrating to us, but sometimes we succeed in seeing through the crack. Epistemology is a (historically determined) way to construct objects, not to reveal them.
CCS is a good practice for helping enlarge the crack and trying to see what the intents of the coders were, and of the coders' masters, their constraints, their machines, what was happening in the adjacent room, and so on. They could imagine giving code a certain identity, a gender, a subjectivity, intents and dislikes (I do it all the time), and that is typical of our nature as humans telling stories. The meaning of the "intentionality of the code" stays there, in those operations, since generally speaking concepts and meanings are not properties, but the results of operations by someone, somewhere, at some time.
Here I can hear @ebuswell saying again
Lots of great thoughts and provocations here—I'm not going to do justice to all of them, apologies.
@leonardoaranda I'm not entirely sure what you mean by "ontology." Do you mean that code is coming from a certain set of ontological commitments on the part of the authors and readers of code? I'm in agreement there—in fact I'd say that all epistemic stances intertwine with some kind of ontological stance, whether we believe in ontology or not. However, that doesn't necessarily mean that the best way to study an epistemological commitment is through the ontological stance that accompanies it. And we have to remember that logical priority (if ontology is even logically prior to epistemology) does not imply causal priority. In the case of computational time, I think this has less of an effect on the epistemic construction of code than other parameters, such as the ideal of representational completeness, or even the concept of an epistemic process itself.
I think this bit ended up getting cut from the article for space concerns, but in the 1945 report on the EDVAC, Eckert gives pretty clear reasons for the creation of the sequential paradigm for code: by spreading out a program in time, instead of space, you can reuse the very same vacuum tubes to do multiple different kinds of calculation. This was the only way to make computation cost effective, and the second generation computers used about 1/6 the number of tubes as the ENIAC. So the sequencing in code comes into being through a technoeconomic rather than an ideological concern. But the connection between the temporal sequencing in the machine and the lexical sequencing of code is something that only gets established through the course of history, and that's where the epistemic motives show up. E.g. some of the early machines and machine proposals had a next address field for every instruction. But as things developed, this temporal machine nature was abstracted such that it showed up in the code itself, as lexical sequence, then as conditional branch, block statements, etc. So in the present, I think absolutely there's a philosophical question about the ontology of computational time in the way that computer science thinks abstractly about that time. But in the actual development of code, computational time was really just time plus vacuum tubes, and this only became abstract much later, through the subordination of computational time to the epistemological requirements of code. Note, btw, that neither Church nor Turing have the same kind of time in their algorithmic processes.
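A schematic sketch of the two sequencing conventions (the instruction formats are invented, not any particular machine's): in the first program every instruction carries its own next-address field, so control flow is data; in the second, order is simply the lexical order of the list.

```python
# Sketch: explicit next-address sequencing vs. implicit lexical sequencing.
explicit = {
    0: ("load", 7, 2),     # (op, arg, next_address)
    2: ("add", 3, 1),
    1: ("print", None, None),
}

addr, acc = 0, 0
while addr is not None:
    op, arg, nxt = explicit[addr]
    if op == "load":
        acc = arg
    elif op == "add":
        acc += arg
    elif op == "print":
        print(acc)
    addr = nxt             # control flow is carried in the instruction itself

implicit = [("load", 7), ("add", 3), ("print", None)]
acc = 0
for op, arg in implicit:   # control flow is just "the next line"
    if op == "load":
        acc = arg
    elif op == "add":
        acc += arg
    elif op == "print":
        print(acc)
```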
As far as code being predicated on the repression of state, my argument is about epistemology, so it takes an abstractly logical form. But I'm not arguing that code has to repress state—this is historically contingent (contingent on the epistemic outlook which preexisted and shaped code, anyway). So I don't think any argument I give here will really be satisfactory; the empirical work (reading a bunch of programming languages) really makes up the majority of the article. However, I will say that I use the term "repress" very deliberately. State is everywhere intimately connected with code, and it is not possible to banish it. But decisions are continually made to shape code in such a way that the people who are writing code have to think about state less and less—hence "repression," which I think is pretty descriptive, though we shouldn't get confused by the metaphor. People are still aware of state, just less functionally aware.
For a specific example of the muddling of code and state, I wanted to bring up Chris Domas's reductio ad absurdum. Reductio is a "Turing complete" program whose behavior is entirely determined by the state of outside data when the program begins. Domas is a security researcher, and this program was created as an experiment to evade algorithmic analysis. In effect, it can execute any algorithm a conventional computer can run, while "the instructions executed by the processor become the same for every program." As he puts it, "the exact same code can actually do pretty much anything ... If the code that we write makes no difference, what does it mean to program something?"
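This is not Domas's construction, just a toy sketch of the same inversion: the loop below never changes, and what it computes is determined entirely by the table of state it is handed before it starts.

```python
# The only "code" is this lookup loop; everything interesting lives
# in the table (state/data) passed in before execution begins.
def run(table, x):
    state = "start"
    while state != "done":
        state, x = table[state](x)
    return x

double_then_stop = {
    "start": lambda x: ("done", x * 2),
}
add_ten_then_stop = {
    "start": lambda x: ("more", x + 3),
    "more":  lambda x: ("done", x + 7),
}

print(run(double_then_stop, 5))    # 10
print(run(add_ten_then_stop, 5))   # 15: same code, different data, different program
```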
@ebuswell This is a fantastic idea! I'm reminded of an article I read a while ago, which claims that in ML, data always wins. I think it was related to computer vision, and it stated that there was a change in the field from hand-engineered features or filters to more data-driven processes like AlexNet, where the convolutions are learned from training data. Unfortunately I don't remember what the title and author were, but I do think that it is somewhat of a common sentiment among ML engineers to orient away from handcrafted architecture components and towards more quality and diversity of data. I wonder if this could be related to the idea that you propose, where state is so important currently in ML, while we somewhat lack theoretical and practical developments for understanding and modifying it.
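Something like the shift that article describes, in miniature (the data and training loop here are invented): a hand-written Sobel kernel next to a kernel that is nothing but numbers fitted to data, and legible only through that data.

```python
# Sketch: hand-engineered filter vs. a kernel fit from data (here the
# "training data" is generated by the hand-written filter itself, so
# we can check that the learned numbers converge to it).
import numpy as np

sobel_x = np.array([[-1., 0., 1.],
                    [-2., 0., 2.],
                    [-1., 0., 1.]])      # hand-crafted: legible as "horizontal edges"

rng = np.random.default_rng(0)
patches = rng.normal(size=(500, 3, 3))   # toy training data
targets = (patches * sobel_x).sum(axis=(1, 2))

kernel = np.zeros((3, 3))                # the "weights": pure state
for _ in range(200):                     # crude gradient descent
    preds = (patches * kernel).sum(axis=(1, 2))
    grad = ((preds - targets)[:, None, None] * patches).mean(axis=0)
    kernel -= 0.1 * grad

print(kernel.round(2))   # ends up close to sobel_x, but only because the
                         # data said so; nothing in the code names "edges"
```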
I propose this humorously, but maybe this isn't an opportunity to un-repress state in programming, but instead one to actually repress code too. Maybe one day we would consider neither code nor state, but would think of software and write software by its effects and outcomes, by requirements and expectations, with code and state both repressed in favor of interpretation or interface. Maybe developments like coding LLMs could be seen as the start of this leap.
@eleazhong maybe that's a "ha ha, only serious" kind of humorous? I think this actually brings up a really important point: what's next isn't what's currently happening, whether or not we do anything. That is, we have this epistemic tool, code, that lets us understand part of the process, but kind of muddles this other part, state. But with ML there are really three parts, the third being the training data. Currently, the dominant method for both humanities and computer science to understand what goes right and wrong in ML processes is through examining the data set. For example, "Why is xyz ML process racist?" gets the answer "Because the data's from a racist society." This is probably true enough, but data isn't an algorithm, not on its own. There's a kind of smokescreen of supposed inevitability here, and behind that smokescreen companies get away with a lot of BS. What worries me is that if we don't create epistemic tools to deal with ML, we will, as you say, end up losing both code and state.
In fact, this seems the most likely outcome. Actual optimization has problems: optimization algorithms are not effective independently of the characteristics of a data set (no free lunch); and optimization has no actual meaning apart from a particular objective function. But the enthusiasm for optimization treats optimization processes as universal, dependent only on the training data, computing resources, and environment, eventually and inevitably self-organizing into perfection. If enthusiasm wins here, and it often does, this means that there's not actually any reason to understand either code or weights—these are just particular ciphers for a universal and singular process: optimization. One of our possible futures replaces our current, hard-coded optimization algorithms with optimization algorithms that have been generated by other optimization algorithms, no code involved. Then, the only thing we would "see"—that is, the only thing we have epistemic tools to deal with—is data.
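A small sketch of the no-free-lunch point (toy data, toy objectives): the optimization code below never changes, but swapping the objective function changes what "optimal" even means for the same data.

```python
# Sketch: identical optimization loop, identical data, two objectives,
# two different "optimal" answers; nothing here is universal on its own.
import numpy as np

data = np.array([1.0, 2.0, 3.0, 10.0])     # note the outlier

def optimize(grad, steps=5000, lr=0.01):
    w = 0.0
    for _ in range(steps):
        w -= lr * grad(w)
    return w

squared_error = lambda w: 2 * (w - data).mean()      # pulls toward the mean
absolute_error = lambda w: np.sign(w - data).mean()  # pulls toward the median

print(round(optimize(squared_error), 1))    # about 4.0: the outlier dominates
print(round(optimize(absolute_error), 1))   # about 2.0: the outlier is shrugged off
```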