  Few at the time recognized the gender bias that was being established at the foundation of tech culture. Levy described hackers as so obsessed with programming computers that they would ignore women. He wrote, “Not only an obsession and a lusty pleasure, hacking was a mission. You would hack, and you would live by the Hacker Ethic, and you knew that horribly inefficient and wasteful things like women burned too many cycles, occupied too much memory space.”76 One hacker whom Levy quotes uncritically noted, “Women, even today, are considered grossly unpredictable. How can a hacker tolerate such an imperfect being?”77 Instead, hackers gendered computers and experienced them as ideal women whose hardware and software they could interact with at will, control perfectly, and know intimately. As Noam Cohen has explained, “If this all sounds sort of sexual—or like an old-fashioned marriage—well, you aren’t the first to notice.”78 As computer science pioneer John McCarthy noted, “What the user wants is a computer that he can have continuously at his beck and call for long periods of time.”79 The masculine generic in McCarthy’s statement is emblematic of a culture that did not forbid women from participating but made a point of not accommodating or welcoming them into the field, increasingly discouraging women from a field they had dominated during its infancy. This effacement of women’s historic centrality to computation is deeply connected to the myth of meritocracy in Silicon Valley, as predominantly male, libertarian individualists continually perpetuate a narrative in which they arrive at fame and fortune without having had any special privileges or owing anything to anybody.80

  This hacker ethic quickly cemented itself into what others have called “the Californian Ideology,” an aggressive libertarian and narcissistic understanding of society that hides behind the façade of chill nerds who just like to build cool things.81 Whether this belief system is maintained in earnest by all programmers in the Valley is irrelevant. As scholars like Christian Fuchs and Nick Dyer-Witheford have pointed out, programmers are an increasingly precarious class because of their replaceability and are easily controlled by the corporate officers of their companies because of their desire to maintain the perks of their positions—prestige, high wages, utopic office spaces, and the ability to perform labor that they find meaningful.82 Thus, those programmers who might develop an interest in the purposes of their work or find themselves critical of the social impacts their research might have on the world are left with little room to voice these qualms. Instead, the ruling ideology is one in which “progress”—here understood as advancements in practical technologies—is inevitable, and all one can do is try to capitalize on being the first to meet the bleeding edge of the future. This ideology is established on a fundamental heteronormativity that genders and sexualizes the computer as the perfect object for the masculine gaze and control. The narcissistic hubris that it establishes leads men to believe that no one can see the future better than they can, that no one ought to prevent them from realizing their ideas, that any idea will inevitably be made manifest, and that all one is responsible for is hacking together the best operational prototype possible from available resources and patching it as problems emerge in the future. This is precisely the worldview that we will see in Google’s development of SafeSearch and Facebook’s content moderation practices (Facebook has gone so far as to make the address of its campus 1 Hacker Way, Menlo Park, California). Both companies hack together available resources without clear plans or solicitation of outside feedback or criticism. Both companies consider progress to be inevitable and work to be at its cutting edge. And both companies end up embedding heteronormative and sexist bias into the foundations of their platforms that, as we’ll see in the following chapters, can never be fully patched after the code has been hacked together.

  In the next section, I’d like to turn to a closer examination of the datasets and algorithms behind the automation of content moderation online, with a specific focus on Google SafeSearch. While the technical details in the section may be difficult and tedious for some readers, I think they are worth exploring at this level of detail for a number of reasons. First, if we want to make changes to the algorithms and datasets that shape large portions of the internet, we are going to need to be able to engage in discussions with computer scientists, and this necessitates working toward at least a basic command of their discourse. It is my hope that going into this level of detail and demonstrating at least a basic awareness of computer science discourse will help make my arguments more convincing to people at the levers of power. Second, I think that these analyses pay dividends, which readers will see if they persist through some of the denser paragraphs. I’ve done what I can to make things as clear and concise as possible, but the technical literature is dense and difficult to perfectly distill. That said, I’ve tried to distribute new and surprising findings throughout the extended case study that wouldn’t have been possible for me to unearth without diving into this technical literature. Readers can rest assured, though, that the following section of the chapter—and the remainder of the book, for that matter—returns to less technical issues, like the human labor of content moderation in this chapter and the impact that overbroad censorship has on LGBTQIA+ communities in chapters 3 and 4.

  Google SafeSearch and the Cloud Vision API

  The history of SafeSearch is nearly synonymous with the history of Google. At the turn of the millennium, Google was already more focused on obliging potential advertisers by censoring pornography from its search results than it was on Y2K. One of its earliest hires was Matt Cutts, who for nearly twenty years led the department at Google that fights spam and search engine optimizers to protect the integrity of Google’s search results, one of the most important positions at the company. Yet Cutts’s first job there was to develop SafeSearch. He spent his first months crawling web porn, looking for largely text-based classificatory signals that he could use to automate porn filtering, and subsequently trying to recruit colleagues to search for porn that might have evaded his filter system.83 It is worth noting that from the beginning, Google has understood web pornography through the lens of spam. Just like spam, porn has no fixed definition and requires vigilant updates.84 For Google, porn is like a virus, constantly mutating in form and strategy to evade detection and infect the healthy body of search results.

  In its earliest iterations, SafeSearch focused almost exclusively on Boolean textual analysis. Cutts’s web crawlers would analyze the text that appeared on porn sites to aggregate a set of weighted “trigger words” that could indicate the likelihood that any given site was pornography. The viral understanding is evident here, as, for instance, slang and misspellings were considered to be motivated—i.e., deliberate attempts to evade the filter—and thus were programmed to weigh in as indicators of pornographic content.85 Google would then layer behavioral data from its users atop this textual data. By keeping track of what users actually clicked on when they were searching for pornography and how long they visited those links, Cutts was able to establish further patterns about the context and content of websites.86 This was particularly important because, at the time, it was impossible to parse the content of images or videos on the web. One could only simulate an understanding of any given image’s content through an analysis of the textual content it was embedded in and behavioral data on how users interacted with it. Analyses of images thus began with looking at the text and user behavior attached to them, and only later would these analyses become sophisticated enough to examine the pixel values of the images themselves.
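  To make the trigger-word approach concrete, here is a toy sketch of how such weighted scoring might work. The words, weights, and threshold are invented for illustration; Google has never published the actual signals or values Cutts used.

```python
# A toy reconstruction of weighted "trigger word" scoring. All words,
# weights, and the threshold below are hypothetical.

TRIGGER_WEIGHTS = {
    "porn": 1.0,
    "pr0n": 2.0,  # a "motivated" misspelling, so it weighs more heavily
    "xxx": 1.5,
}

def porn_score(page_text: str) -> float:
    """Sum the weights of every trigger word found in the page text."""
    words = page_text.lower().split()
    return sum(TRIGGER_WEIGHTS.get(word, 0.0) for word in words)

def is_likely_porn(page_text: str, threshold: float = 2.0) -> bool:
    """Flag a page once its cumulative trigger-word score crosses a threshold."""
    return porn_score(page_text) >= threshold
```

Note that the deliberate misspelling carries more weight than the standard spelling, mirroring the logic that an attempt to evade the filter is itself evidence of pornographic intent.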

  The visual analysis of images pixel by pixel only started to pick up steam in 2008 as graphics processing units (GPUs) became cheaper and more powerful. The first iterations would index the RGB values of millions of images such that any given image could be correlated with nearly identical versions online. This was the origin of the broader capacity we all enjoy today of using an image as a search query on Google, an innovation brought about by Google’s focus on porn censorship.87 Shortly thereafter (c. 2012), Google began exploring the use of machine learning to train neural networks to detect pornographic content and developed what in April of 2016 it would make available to the public as its Cloud Vision API.88 As Google explains it, “Google Cloud Vision API [application programming interface] enables developers to understand the content of an image by encapsulating powerful machine learning models in an easy to use REST API” (emphasis mine).89 Cloud Vision’s features include not only “Explicit Content Detection” but also “Label Detection,” “Web Detection,” “Face Detection,” “Logo Detection,” “Landmark Detection,” “Image Attributes,” and “Optical Character Recognition.”90 In my experience working with Cloud Vision, a number of these features remain severely limited, but the API’s capacity to detect explicit content is uncannily accurate—provided we understand explicit content as being any and all nudity and that we understand nudity as female-presenting nipples and breasts, genitals, and (sometimes) buttocks.
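  Because the service is exposed over REST, a developer can request explicit content detection alongside any of the other features in a single HTTP call. The following minimal sketch uses Python’s requests library against the public v1 endpoint; the API key and image URL are placeholders.

```python
import requests

API_KEY = "YOUR_API_KEY"  # placeholder; requires a real Google Cloud API key
ENDPOINT = f"https://vision.googleapis.com/v1/images:annotate?key={API_KEY}"

# Request explicit content detection and label detection on one image.
body = {
    "requests": [{
        "image": {"source": {"imageUri": "https://example.com/image.jpg"}},
        "features": [
            {"type": "SAFE_SEARCH_DETECTION"},
            {"type": "LABEL_DETECTION"},
        ],
    }]
}

response = requests.post(ENDPOINT, json=body).json()
# The explicit content verdicts come back under "safeSearchAnnotation".
print(response["responses"][0]["safeSearchAnnotation"])
```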

  Google actually has a much broader definition of explicitness that it has programmed into its Cloud Vision API. Images may be considered explicit based on their participation in any of five separate categories, each of which can be detected and blocked: (1) adult, (2) medical, (3) spoof, (4) violent, and (5) “racy.” Adult images may contain elements such as nudity, pornographic images or cartoons, or sexual activities.91 The category is meant to focus solely on “explicit” or “pornographic” nudity, especially those images that focus on “strategic” parts of the anatomy. However, the system is trained to avoid flagging as adult content any medical, scientific, educational, or artistic nudity, as well as “racy” images that cover said “strategic” parts. Medical content consists of “explicit images of surgery, diseases, or body parts,” and its classifier primarily searches for “graphic photographs of open wounds, genital close-ups, and egregious disease symptoms.” Spoof content primarily looks for memes, which are indicated by the presence of text (often at the top and bottom of images) and typical meme faces, images, and backgrounds. Violent content consists of images flagged as depicting killing, shooting, or blood and gore.92
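  With Google’s official Python client library, these five categories come back as a single annotation whose fields are graded likelihoods (from VERY_UNLIKELY to VERY_LIKELY) rather than binary verdicts; note that the violent category is exposed as the field violence. A minimal sketch, assuming credentials are already configured and using a placeholder image path:

```python
from google.cloud import vision

client = vision.ImageAnnotatorClient()

with open("image.jpg", "rb") as f:  # placeholder image path
    image = vision.Image(content=f.read())

annotation = client.safe_search_detection(image=image).safe_search_annotation

# Each of the five explicitness categories is reported as a graded likelihood.
for category in ("adult", "medical", "spoof", "violence", "racy"):
    likelihood = getattr(annotation, category)
    print(category, vision.Likelihood(likelihood).name)
```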

  The fifth category was added only after launch and remains in a sort of beta state despite being available to any developers using Google’s Cloud Vision API. “Racy” image detection is meant to capture all the content that escapes the adult content filter but might still be risqué enough to be worth censoring. In perhaps the only extant definition of what this content consists of, Google writes, “Racy content includes lewd or provocative poses, sheer or see-through clothing, closeups of sensitive regions, and more.”93 It appears to be most often triggered by images of nudity wherein “strategic” parts are just barely obscured or covered. This is perhaps Google’s most nebulous classifier and demonstrates its orientation toward pornography as a virus needing eradication. In this metaphor, the broadness of the classifier indicates that it is more important to eradicate any viral pathogens than it is to preserve benign organisms. In more practical terms, blocking porn is more important than not blocking nonporn, including art. Take the Venus de Milo, for example. When I ran a Cloud Vision analysis of a standard Wikimedia Commons image of the statue—and keep in mind this is an image Google has certainly indexed, including its surrounding content and context—the API was convinced that it was likely a “racy” image (see figure 2.2).94

  Before moving on to examine some examples of heteronormative biases that are hardcoded into the datasets that these algorithms are trained on, it is worth outlining some rudimentary results that I obtained by running sets of images through the Cloud Vision API to get a sense of how these sorts of heteronormative biases inflect content moderation on Google’s platform. I did a simple Google Image Search for “female breasts” and gathered the first one hundred relevant images—including a large number of pictures of fully clothed women, medical images and diagrams, and artistic renderings—and ran them through Cloud Vision. Of these images, exactly half of them were determined to “very likely” be “racy” images and thus would be censored in many instances through SafeSearch and in apps developed with the Cloud Vision API. Google SafeSearch seems to have learned the shape and texture of the average female-presenting—and lighter-skinned—breast. This was confirmed by running images of “nude paintings,” “nude sculptures,” and “hentai” (Japanese-styled nude and sexual drawings) through the system, all of which were frequently flagged as “racy” content when they contained any semblance of a female-presenting breast, again, even when clothed. Needless to say, this result was not repeated when I ran images of bare male-presenting chests through the system.
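  For readers who want to replicate this kind of tally, the method amounts to a simple batch loop over the collected images. A sketch, assuming the gathered images have been saved to a local folder (the folder name is mine, and credentials are assumed to be configured):

```python
from pathlib import Path
from google.cloud import vision

client = vision.ImageAnnotatorClient()
paths = sorted(Path("search_results").glob("*.jpg"))  # hypothetical local folder

# Count how many images the API rates as "very likely" racy.
flagged = 0
for path in paths:
    image = vision.Image(content=path.read_bytes())
    annotation = client.safe_search_detection(image=image).safe_search_annotation
    if annotation.racy == vision.Likelihood.VERY_LIKELY:
        flagged += 1

print(f"{flagged} of {len(paths)} images rated VERY_LIKELY racy")
```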

  Figure 2.2

  Venus de Milo being run through Google’s Cloud Vision API.

  This betrays a particularly American, heteronormative interpretation of what breast tissue is and what it means. It exacerbates a sexualization of women’s and female-presenting bodies that has plagued internet users who have what platforms deem “female breasts” for decades. For instance, Tarleton Gillespie has excellently documented the decade-long struggle that people have faced in trying to post images of their breastfeeding online.95 This problem is hardcoded into the datasets that algorithms like these are trained on, in the first instance by the decision to assume stable gender binaries. These assumptions have been productively challenged by trans women like Courtney Demone, whose #DoIHaveBoobsNow? campaign on Instagram showcased topless photos at different phases of her hormone therapy to pose the question of when her breasts became a content violation.96 This sexism in the dataset allows for breasts that are coded as “female” to be associated with “pornography,” “adult content,” or “raciness,” thus capturing and reinforcing a culturally singular cisnormative and heteronormative bias. It would take a team of much more capable researchers than me to fully catalogue the many sexual and gender biases in these datasets. While it is beyond the purview of this book to give a full demonstration of all their impacts, I will now turn to tracing some of the other biased sexual concepts that get captured and reinforced in both the primary datasets that image recognition and computer vision algorithms are trained on and tested against.

  At this point, we need to take a detour through how a computer vision algorithm learns to detect adult content so that we can later understand how and where heteronormative biases can be hardcoded into the system. Many machine learning applications require a large dataset with consistent metadata from which they can then analyze and learn patterns to identify and classify new data. In the case of computer vision, this means that large repositories of images must be consistently tagged with appropriate metadata before any algorithms can learn to identify and classify new images. Since 2012, ImageNet has been the gold standard image dataset for training computer vision algorithms. ImageNet began as a conference poster presentation by Princeton University researchers in 2009.97 By 2010, it already contained nearly fifteen million labeled images.98 In that year, ImageNet also launched the annual ImageNet Large Scale Visual Recognition Challenge (ILSVRC), where computer scientists used a specified subset of the images as seed images to train algorithms to automatically identify and classify images not used in the seed set—you train your algorithm on one designated portion of the dataset and then test it on a held-out portion of images that it hasn’t yet analyzed.99 As we’ll see shortly, it was in response to the ILSVRC that the first major breakthrough in the use of convolutional neural networks for computer vision was achieved, and this breakthrough serves as the bedrock for many of Google’s computer vision applications today. Further, Google’s Inception architecture and GoogLeNet algorithm were developed atop ImageNet in 2014, later serving as the foundations for Google Photos and the Cloud Vision API.100
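  The GoogLeNet architecture is now openly available in reimplementations with weights pretrained on the ILSVRC subset of ImageNet, which makes the train-then-classify pipeline easy to see in miniature. The sketch below loads the torchvision reimplementation (not Google’s internal production model) and classifies an image into one of the 1,000 ILSVRC categories; the image path is a placeholder.

```python
import torch
from PIL import Image
from torchvision import models, transforms

# Load GoogLeNet with weights pretrained on ImageNet (ILSVRC 2012 classes).
model = models.googlenet(weights=models.GoogLeNet_Weights.IMAGENET1K_V1)
model.eval()

# Standard ImageNet preprocessing: resize, crop, and normalize.
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

img = preprocess(Image.open("photo.jpg")).unsqueeze(0)  # placeholder image

with torch.no_grad():
    logits = model(img)
# The index points into the 1,000 ILSVRC categories derived from WordNet synsets.
print(logits.argmax(dim=1).item())
```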

  As noted above, each image in ImageNet needs to be consistently labeled with metadata. The metadata that each of these images can be labeled with, and thus the entire structure of the dataset, is extracted from WordNet, “a large lexical database of English.”101 WordNet also originated at Princeton, in 1985, with funding from the US Office of Naval Research, the National Science Foundation, the Defense Advanced Research Projects Agency, and the Disruptive Technology Office (formerly the Advanced Research and Development Activity). The goal of WordNet is to capture all of the distinct concepts in the English language and their interrelations. It does this by collecting all English nouns, verbs, adjectives, and adverbs and grouping them into sets of cognitive synonyms that it refers to as “synsets.” As its site notes, “Synsets are interlinked by means of conceptual-semantic and lexical relations.”102 Take, for example, the WordNet entry for “sex”: WordNet’s understanding of sex is composed of four noun synsets, one for “noun.act” that looks at sex as an action and contains “sexual activity” and “sexual practice,” one for “noun.group” that looks at anatomical sex, one for “noun.feeling” that looks at sex as an urge, and one for “noun.attribute” that looks at gender and sexuality.103 The noun.act synset for sex is embedded within the parent synset for a “noun.process” composed of the terms “bodily process,” “body process,” “body function,” and “activity,” which themselves are contained within the parent synset “organic process” and “biological process.” This latter synset is contained within the “noun.Tops” parent synset of “process” and “physical process,” described as “a sustained phenomenon or one marked by gradual changes through a series of states.”104 It is embedded within two more generic noun.Tops synsets, the first being “physical entity,” which describes “an entity that has physical existence,” and the second being “entity,” which describes “that which is perceived or known or inferred to have its own distinct existence (living or nonliving).”105 In short, WordNet provides the ontology for ImageNet, determining what can exist and how it can be related—with relations existing between parent, child, and sibling concepts.
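  This synset structure is easy to inspect directly, since the Princeton WordNet database ships with Python’s NLTK library. A short sketch that lists the noun synsets for “sex” and then walks the hypernym (parent-concept) chain from the noun.act sense up toward the root “entity” synset:

```python
import nltk
nltk.download("wordnet", quiet=True)  # fetch the WordNet data on first run
from nltk.corpus import wordnet as wn

# List the noun synsets for "sex" along with the lexicographer file
# (noun.act, noun.group, noun.feeling, noun.attribute) each belongs to.
for synset in wn.synsets("sex", pos=wn.NOUN):
    print(synset.name(), synset.lexname(), "-", synset.definition())

# Walk the hypernym chain from the noun.act sense toward the root.
node = wn.synset("sex.n.01")
while node.hypernyms():
    node = node.hypernyms()[0]
    print(node.name())
```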

 

Add Fast Bookmark
Load Fast Bookmark
Turn Navi On
Turn Navi On
Turn Navi On
Scroll Up
Turn Navi On
Scroll
Turn Navi On
183