Title: Neural network example in Keras
Author/s: Keras docs (not sure exactly who wrote it)
Language/s: Python
Year/s of development: 2019
Software/hardware requirements (if applicable): Anything ranging from just your computer to 64 GPUs
Below is a snippet of code showing how to set up a small single-input neural network that performs binary classification in Keras (a popular deep learning framework in Python):
# For a single-input model with 2 classes (binary classification):
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(32, activation='relu', input_dim=100))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy'])

# Generate dummy data
import numpy as np
data = np.random.random((1000, 100))
labels = np.random.randint(2, size=(1000, 1))

# Train the model, iterating on the data in batches of 32 samples
model.fit(data, labels, epochs=10, batch_size=32)
This example code comes from a guide in the Keras docs. The chunk of code under the first comment creates a sequential model with two Dense layers and compiles it. The chunk of code under the second comment generates dummy training data and labels. Finally, the last line trains the model by passing the training data and labels to a method called “fit”.
I want to focus specifically on the seemingly simple “fit” method. Many other machine learning frameworks are similar: once you have set up the model architecture and the training data, all you have to do is call “fit” to actually train the model, and the framework takes care of the rest.
On one hand, this abstraction is really helpful. You don’t need to backpropagate millions of weights by hand. You don’t even need to know what backpropagation means. As long as you define a model architecture and feed in some training data and labels, you can train a model. This lowers the barrier to entry for training a neural network.
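To make the abstraction concrete, here is a rough sketch of my own (not from the Keras docs, and assuming the TensorFlow 2 / tf.keras flavor of the API) of roughly what “fit” is doing behind the scenes, written as an explicit training loop over the same tiny model:

import numpy as np
import tensorflow as tf

# The same tiny model as above, built with the tf.keras flavor of the API.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation='relu', input_shape=(100,)),
    tf.keras.layers.Dense(1, activation='sigmoid'),
])
loss_fn = tf.keras.losses.BinaryCrossentropy()
optimizer = tf.keras.optimizers.RMSprop()

# Dummy data, as in the docs example
data = np.random.random((1000, 100)).astype('float32')
labels = np.random.randint(2, size=(1000, 1)).astype('float32')

# Roughly what fit(..., epochs=10, batch_size=32) does for you:
for epoch in range(10):
    for start in range(0, len(data), 32):
        x_batch = data[start:start + 32]
        y_batch = labels[start:start + 32]
        with tf.GradientTape() as tape:
            predictions = model(x_batch, training=True)  # forward pass
            loss = loss_fn(y_batch, predictions)         # how wrong are we?
        # Backpropagation: gradients of the loss w.r.t. every trainable weight...
        gradients = tape.gradient(loss, model.trainable_weights)
        # ...then nudge each weight a little in the direction that reduces the loss.
        optimizer.apply_gradients(zip(gradients, model.trainable_weights))

Every one of those forward passes, gradient computations, and weight updates is work a GPU (or your laptop’s CPU) has to physically do; “fit” just keeps that work out of sight.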
But on the other hand, we pay an environmental price for this abstraction that we don’t immediately internalize. The sample I posted here is a toy example. But imagine if we had a much larger network like BERT, a state-of-the-art neural network developed by Google and used for a variety of NLP (natural language processing) tasks. The smaller model (BERT base) has 110 million total parameters and the larger model (BERT large) has 340 million. A paper published last year by researchers at the University of Massachusetts Amherst detailing the energy usage of deep learning in NLP shows that “[...] training BERT on GPU is roughly equivalent to a trans-American flight” (see Table 3 on page 4 of the paper). The amount of computational resources needed to train BERT is enormous: “NVIDIA reports that they can train a BERT model in 3.3 days (79.2 hours) using 4 DGX-2H servers, totaling 64 Tesla V100 GPUs”.
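To put that NVIDIA figure in perspective, here is a back-of-the-envelope calculation of my own (the ~300 W per V100 is an assumed nominal draw, and it ignores CPUs, cooling, and other datacenter overhead, so the true energy cost is higher):

# Rough estimate of the raw GPU time behind NVIDIA's reported BERT training run.
gpus = 64
hours = 79.2
watts_per_gpu = 300  # assumption, not a measured figure

gpu_hours = gpus * hours                        # 5,068.8 GPU-hours
energy_kwh = gpu_hours * watts_per_gpu / 1000   # roughly 1,500 kWh from the GPUs alone

print(f"{gpu_hours:.0f} GPU-hours, ~{energy_kwh:.0f} kWh")

That is thousands of GPU-hours for a single training run, before counting the many runs it takes to tune a model in practice.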
There are even more abstractions that make training models easier but also distance us further from the physical impact of that training. Don’t own 64 GPUs? No problem. You can pay AWS (Amazon Web Services) or Google Cloud to use their computing resources. It is common practice now to train models “in the cloud”. Now there really is no immediate physical indicator of how much computing power you’re using. It’s all just ~ in the cloud ~
Abstraction is a holy tenet of software engineering. In my computer science classes, I was taught to hide all unnecessary complexities behind clearly defined, easy-to-use APIs. I never considered that a side effect of all this abstraction was a sense of disembodiment. The idea that there are physical limitations is as obvious as saying “water is wet”, and yet it is something I rarely think about as a software engineer. I just always assume that I will have the computing resources I need.
Grappling with the disembodiment of abstraction and virtual space reminds me of a beautiful essay published by the MIT Press called “Making Kin with the Machines” (Disclaimer: this was published as a critique/response to a manifesto Joi Ito wrote while he was still at MIT. However, he is not an author of the essay). This paragraph sums up what I hope to learn and discuss through the lens of Indigenous programming:
“One of the challenges for Indigenous epistemology in the age of the virtual is to understand how the archipelago of websites, social media platforms, shared virtual environments, corporate data stores, multiplayer video games, smart devices, and intelligent machines that compose cyberspace is situated within, throughout and/or alongside the terrestrial spaces Indigenous peoples claim as their territory. In other words, how do we as Indigenous people reconcile the fully embodied experience of being on the land with the generally disembodied experience of virtual spaces? How do we come to understand this new territory, knit it into our existing understanding of our lives lived in real space, and claim it as our own?”
As someone who is currently living on stolen Duwamish lands, I am not the right person to answer these questions. So I would really love to learn from people who are, or from anyone who can expand on these topics:
1. [From the paragraph above]: How do we as Indigenous people reconcile the fully embodied experience of being on the land with the generally disembodied experience of virtual spaces? How do we come to understand this new territory, knit it into our existing understanding of our lives lived in real space, and claim it as our own?
2. We are currently in a climate crisis that is only going to get worse. Not only is there an environmental impact from machine learning, but there are also huge impacts from data storage, networks, and the mining of minerals for devices, to name a few. In the essay, the authors (who are Indigenous peoples themselves) state: “We undertake this project not to ‘diversify’ the conversation. We do it because we believe that Indigenous epistemologies are much better at respectfully accommodating the non-human.” And I wholly agree. What are different Indigenous lenses we can look through to incorporate technology respectfully with the environment? What about other lenses?
Comments
It looks like François Chollet, along with a group of ~16 other contributors, wrote it; the source is here:
https://github.com/keras-team/keras/blob/8a8ef43ffcf8d95d2880da073bbcee73e51aad48/docs/templates/getting-started/sequential-model-guide.md