2026 Participants: Martin Bartelmus * David M. Berry * Alan Blackwell * Gregory Bringman * David Cao * Claire Carroll * Sean Cho Ayres * Hunmin Choi * Jongchan Choi * Lyr Colin * Dan Cox * Christina Cuneo * Orla Delaney * Adrian Demleitner * Pierre Depaz * Mehulkumar Desai * Ranjodh Singh Dhaliwal * Koundinya Dhulipalla * Kevin Driscoll * Iain Emsley * Michael Falk * Leonardo Flores * Jordan Freitas * Aide Violeta Fuentes Barron * Erika Fülöp * Tiffany Fung * Sarah Groff Hennigh-Palermo * Gregor Große-Bölting * Zachary Horton * Dennis Jerz * Joey Jones * Titaÿna Kauffmann * Haley Kinsler * Todd Millstein * Charu Maithani * Judy Malloy * Eon Meridian * Luis Navarro * Collier Nogues * Stefano Penge * Marta Perez-Campos * Arpita Rathod * Abby Rinaldi * Ari Schlesinger * Carly Schnitzler * Arthur Schwarz * Haerin Shin * Jongbeen Song * Harlin/Hayley Steele * Daniel Temkin * Zach Whalen * Zijian Xia * Waliya Yohanna * Zachary Mann
CCSWG 2026 is coordinated by Lyr Colin-Pacheco (USC), Jeremy Douglass (UCSB), and Mark C. Marino (USC). Sponsored by the Humanities and Critical Code Studies Lab (USC), the Transcriptions Lab (UCSB), and the Digital Arts and Humanities Commons (UCSB).

[Code Critique] getCrimeaStatusCookie, Yandex and very large codebases

Yandex Code Critique


Title: Yandex Maps
Author/s: Yandex Corporation
Language/s: JavaScript
Year/s of development: 2021
Software/hardware requirements (if applicable): Web


Code

/**
 * Возвращает куку, отвечающую за статус Крыма.
 *
 * @see https://st.yandex-team.ru/MAPSUI-720
 */
function getCrimeaStatusCookie(cookies: Record<string, string>): string | undefined {
    if (!cookies.yp) {
        return;
    }
    const values = yandexYCookie.parseYpCookie(cookies.yp);
    return values.cr && values.cr.value;
}

(The Russian comment translates as: "Returns the cookie responsible for the status of Crimea.")

Context

In 2023, the source code of Yandex, the equivalent of Google on the Russophone internet, was leaked. Given the ties of Yandex engineers to their Western counterparts, and the ties of Yandex management to the government of the Russian Federation, this is quite a unique corpus, as it inscribes both corporate and governmental power. It is also an incredible challenge to make sense of it.

I attempted to do that in a recently published paper, "createPoliticsResponse: the political computation of state borders in Yandex maps" (edited by @orladelaney9 and @davidmberry). Most of the article traces how Yandex Maps decides which borders to show, to whom, and under which conditions. In this sense, it is a material testimony of what is already assumed but hard to prove at the interface level, and in this case it shows the specific contribution of CCS to platform studies.

One code snippet that I looked at in the paper was the function above, from the frontend part of Yandex.Maps, which seems to extract a value about Crimea's status from a client cookie. On the one hand, it is quite obvious that a Kremlin-linked Yandex wants to treat one of the most contested geopolitical areas of Europe as an edge case. On the other hand, I have found it particularly hard to show how exactly it is treated, and as which kind of edge case. This is a big limitation of critically studying this function: it only allows us to study the reading of a value, not its writing, and hence tells only half of the story.
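
To make the "reading only" point concrete, here is a minimal sketch of what the function appears to do, with a hypothetical stand-in for yandexYCookie.parseYpCookie (the real parser is not part of the snippet). It assumes the yp cookie is a '#'-separated list of "expires.name.value" entries; the names parseYpCookieSketch and getCrimeaStatusCookieSketch and the cookie values used below are invented for illustration and are not taken from the leak.

```typescript
// One parsed entry of the yp cookie (shape is an assumption).
interface YpEntry {
    expires: number;
    value: string;
}

// Hypothetical stand-in for yandexYCookie.parseYpCookie.
// ASSUMPTION: entries look like "expires.name.value" and are joined by '#'.
function parseYpCookieSketch(raw: string): Record<string, YpEntry | undefined> {
    const result: Record<string, YpEntry | undefined> = {};
    for (const chunk of raw.split('#')) {
        const [expires, name, ...rest] = chunk.split('.');
        if (!expires || !name || rest.length === 0) {
            continue; // skip malformed entries
        }
        // The value itself may contain dots, so re-join the remainder.
        result[name] = {expires: Number(expires), value: rest.join('.')};
    }
    return result;
}

// Same control flow as the original function: it only reads the "cr" entry.
function getCrimeaStatusCookieSketch(cookies: Record<string, string>): string | undefined {
    if (!cookies.yp) {
        return;
    }
    const values = parseYpCookieSketch(cookies.yp);
    return values.cr && values.cr.value;
}

// Placeholder values only; the real contents of "cr" are unknown here.
const status = getCrimeaStatusCookieSketch({
    yp: '1700000000.cr.placeholder#1700000000.other.x',
});
console.log(status);
```

Whatever the real format turns out to be, the sketch shows the same limitation: nothing in this path decides what the "cr" entry contains, so the code that writes it has to be found elsewhere.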

One reason for this is that the Yandex codebase is orders of magnitude larger than the snippets that usually constitute most of the corpus of CCS: the whole leaked codebase clocks in at upwards of 44 GB, and the maps module at more than 4 GB; both are mostly composed of plaintext files (see the resources section for a repo containing the maps section of Yandex's source code). This shift in quantity seems to me to be a shift in quality, and raises new questions for the methods of CCS, some of which I've sketched out below.


Questions

  • Quite an uncritical start, but where is the value of the crimeaStatus cookie field set? How do we handle variable names changing as they get passed as arguments/references/assignments?

  • How does one go about reading 44 GB of code? The default means of search in textual software (matching patterns of characters) is heavily biased towards a syntactic approach, rather than a semantic approach. Could tools that focus on the structure of code (e.g. class relationships, function definitions and references, data structuring, argument passing) rather than on its surface help here? If we do CCS on large corpora with only tools that enable such lexical analysis, rather than tools that do static structural analysis, what are we missing?

  • Is it enough to focus on the name of a function (e.g. getCrimeaStatusCookie) as an argument to critique the relationship between a private corporation and the (imperial) policy of a nation-state, without knowing exactly what the function does? In other words, what is the relationship between lexical choices and semantic choices as epistemic building blocks in a critical code study? Is there a critique of the data structuring that is independent of how data structures are named?

  • Thinking of structure, how much can/should CCS draw on existing CS entities and denominations? I'm thinking here of design patterns, best practices, testing strategies, application architectures or language features. Specifically here, what parts of a CCS grammar could something like middlewares or localizations be (getCrimeaStatusCookie being both of these)?

  • The nature of leaked code seems to always imply a lack. In this case, the documentation and specifications are missing, as are all the ML components of Yandex. So how does one investigate incomplete code? How does one account for the part that is lacking, and how can one make extrapolations about it? What kind of forensics is this?
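
As a rough illustration of the first two questions, the sketch below contrasts a lexical search with a structural one on a toy program reduced to data-flow edges. It is not a real static analyzer: in practice the edges would have to be extracted by an actual parser, and every name here except getCrimeaStatusCookie is invented for the example.

```typescript
// One edge of a toy data-flow graph: the value of `from` flows into `to`.
interface Flow {
    from: string;
    to: string;
}

// Hypothetical flows; only getCrimeaStatusCookie comes from the snippet above.
const flows: Flow[] = [
    {from: 'getCrimeaStatusCookie()', to: 'cr'},
    {from: 'cr', to: 'options.region'},
    {from: 'options.region', to: 'politicsParam'},
    {from: 'userLocale', to: 'lang'},
];

// Lexical search: keep every name whose characters match the pattern.
function lexicalMatches(edges: Flow[], pattern: RegExp): Set<string> {
    const hits = new Set<string>();
    for (const {from, to} of edges) {
        if (pattern.test(from)) hits.add(from);
        if (pattern.test(to)) hits.add(to);
    }
    return hits;
}

// Structural search: follow the flows transitively from a seed name
// (breadth-first), regardless of how the value is renamed along the way.
function reachableFrom(edges: Flow[], seed: string): Set<string> {
    const seen = new Set<string>([seed]);
    const queue: string[] = [seed];
    while (queue.length > 0) {
        const current = queue.shift() as string;
        for (const {from, to} of edges) {
            if (from === current && !seen.has(to)) {
                seen.add(to);
                queue.push(to);
            }
        }
    }
    return seen;
}

const byName = lexicalMatches(flows, /crimea/i);
const byFlow = reachableFrom(flows, 'getCrimeaStatusCookie()');
console.log(byName.size, byFlow.size);
```

In this toy case the lexical search only finds the one name that contains "crimea", while the structural search also reaches cr, options.region and politicsParam, which is the kind of renaming that a character-matching search loses track of.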


Resources

Comments

  • Hello, pleasure to meet you. My name is Brian Arechiga. I’m currently a fifth-year student in USC’s English PhD program. This project looks great. I am a bit of a newbie when it comes to reading code, so I will have to dodge the technical questions you posed. However, I am interested in the last question you posed, regarding leaked code, since I am also going to be working with leaked code in my future research. In my own attempt to justify this practice, I have found some solace in literary theory. I find these texts analogous to studying an unfinished book. Jerome McGann’s works, like The Textual Condition or Radiant Textuality, provide a framework for doing this. He writes that

    Every text enters the world under determinate sociohistorical conditions, and while these conditions may and should be variously defined and imagined, they establish the horizon within which the life history of different texts can play themselves out. The law of change declares that these histories will exhibit a ceaseless process of textual development and mutation […] To study texts and textualities, then, we have to study these complex (and open-ended) histories of textual change and variance (The Textual Condition, 9)

    For instance, reading David Foster Wallace's unfinished novel, The Pale King, comes with the baggage of the 14-year gap from his last novel, Infinite Jest, along with the circumstances of his untimely death (and the manuscript's presence in the room where he passed). In other words, there is a precedent for making the cultural and textual condition of the work a part of the analysis. For example, we can think about the proprietary nature of the code as a part of its socio-historical context, and perhaps as an entry point to discuss other relevant themes. I am studying 4chan’s source code at the moment. I plan on arguing that the proprietary nature of the code highlights an important development in the commercialization of the site, since 4chan itself was originally built from open source code. In addition, there is much to be said about the cultural conditions that led to the code leaking in the first place. Maybe the proprietary nature of the Yandex code could lend itself to a discussion of international politics, capitalism, etc…

    Looking forward to reading more about this project. I am curious what others have to say on this issue as well.
