Image by Markus Spiske

By Cory Massaro

What happens when your identity, the “we” to which you belong, has been decided by algorithm-brandishing scoundrels?

In 2011, one such algorithm-brandisher, a data firm called Accenture, did just that. Accenture’s team had gathered reams of data on consumer behavior in supermarkets. They couldn’t apply this data profitably in supermarkets themselves, but the data and methodologies turned out to be very predictive of a person’s political inclinations and, crucially, how easily persuaded a person was. Accenture’s team clustered their data, which means they divided up the consumer data into profiles on the basis of quantifiable similarity. “Then it was just a matter of scouring national databases, finding people with similar profiles, and placing them into the same buckets.”

These “buckets” are the “we.” Most every popular tech platform creates such buckets. Many of these buckets hew closely to what, in other contexts, we consider components of our identity: political leanings, sex, sexuality, gender identity, ethnicity, etc. Others do not. For the algorithmic scoundrel, there is no distinction: you go into a bucket and then the core assumptions about that bucket–what you’ll buy and whom you’ll vote for, primarily–are reinforced upon you by targeted advertising.
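For readers who want to see how little machinery this takes, here is a minimal sketch of the clustering-then-bucketing step in Python. It is not Accenture's method, which is not described here; it uses plain k-means, a standard clustering technique, on invented two-number shopper profiles. The feature meanings, the data, and the function names (`nearest`, `kmeans`) are all assumptions made for illustration.

```python
import math

def nearest(point, centroids):
    """Return the index of the centroid closest to `point`."""
    return min(range(len(centroids)),
               key=lambda i: math.dist(point, centroids[i]))

def kmeans(points, k, iterations=20):
    """Plain k-means: group points into k buckets by similarity."""
    # Naive deterministic start: use the first k points as centroids.
    centroids = list(points[:k])
    for _ in range(iterations):
        # Step 1: put every point into the bucket of its nearest centroid.
        buckets = [[] for _ in range(k)]
        for p in points:
            buckets[nearest(p, centroids)].append(p)
        # Step 2: move each centroid to the mean of its bucket.
        # An empty bucket keeps its old centroid.
        centroids = [
            tuple(sum(dim) / len(b) for dim in zip(*b)) if b else centroids[i]
            for i, b in enumerate(buckets)
        ]
    return centroids

# Hypothetical shopper profiles: (organic-food score, coupon-use score).
shoppers = [(1.0, 9.0), (1.5, 8.5), (0.5, 9.5),
            (9.0, 1.0), (8.5, 1.5), (9.5, 0.5)]

centroids = kmeans(shoppers, k=2)

# "Scouring national databases": strangers are dropped into whichever
# existing bucket their profile most resembles.
stranger_a = (1.2, 8.8)
stranger_b = (9.1, 0.9)
bucket_a = nearest(stranger_a, centroids)
bucket_b = nearest(stranger_b, centroids)
print(bucket_a, bucket_b)
```

Nothing in the code knows what the buckets mean. Each one is only a point in space and the set of people who happen to be near it; the interpretation, and the advertising, come afterward.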

* * *

In this column’s previous installment, I discussed the connection between tech platforms and language death. If people need to use software, it’s obviously harmful when they can’t do so in their own language. However, internationalization (“i18n,” the annexation of new linguistic territory by technologists) requires data, and data in hegemonic languages tends to be collected and stewarded by the scoundrels. For technologists, expansion is progress; software services are a prime good; everyone should be grateful for tech that works in their language. Likewise, most software is built in English or some other colonial language, then spreads out from there.

For these reasons, it should not surprise us that technologists promote internationalization via a familiar rhetoric and a dangerous logic. Here’s some verbiage from Lionbridge’s blog:

The rise of globalization also means that there has arguably never been a better time to build a business into a global company. There are a huge range of opportunities available to companies looking to expand, from access to a global talent pool to an increased volume of information that can be used to position a business. As it gets easier to enter new markets, there are also more product niches to take advantage of—and new customers to attract. […] Internationalization is a corporate strategy that involves making products and services as adaptable as possible, so they can easily enter different national markets.

We’ll get back to Lionbridge in a moment; they provide various localization/internationalization services and are one of the largest global language data brokers. For now, let’s let the poet and white supremacist Rudyard Kipling summarize that corporate-speak for us:

By open speech and simple,
An hundred times made plain,
To seek another’s profit,
And work another’s gain.

I’m only being sort of glib; the familiar rhetoric and dangerous logic at play here are those of colonialism–in this case, digital colonialism.

* * *

If data is what makes internationalization possible, and data is also the currency that fuels political polarization and identity manipulation, then data is among the battlegrounds where the right to our minds and identities must be fought for. Uncharacteristically for this column, I’ll share some optimistic news.

The Māori language (te reo Māori) is no stranger to colonial violence. However, recently, an organization called Te Hiku Media has begun building its own speech recognizer using a private, locally-managed and -curated data set. You can read about it here and here.

Te Hiku Media has since been approached by Lionbridge and other data brokers to share their data set. But Te Hiku’s not interested. They’ve seen what happens when tech companies get hold of language data–the companies make money; the language and its speakers get nothing. Te Hiku refuses to let its linguistic data become a commodity; instead,

We treat data as taonga, something to be respected and revered. One could argue that data is like a family member, and it is your whakapapa (genealogy). We respect data in that we look after it rather than claim ownership over it. […] We only use data in ways that align with our core values as an indigenous organisation and in ways that benefit the communities from which the data was gathered.

This attitude will be hard to sustain meaningfully as Te Hiku’s tools catch on. For example, if Te Hiku releases their speech recognizer as part of an input method editor for mobile devices, mobile OS provisioners like Google and Apple will be able to extract the voice data, paired with Te Hiku’s transcription, and build recognizers of their own. But for now, it is one of the strongest models for data sovereignty I’ve seen, in no small part because it accesses the language’s own internal logic rather than that of global neoliberalism.

* * *

Obviously, this attitude won’t work for most languages, if for no other reason than that the data’s already out there, commoditized, owned, and stewarded by the scoundrels. Techno-capitalists build their products in and for English, so English evolves in and for techno-capitalism. As an English speaker, I am, in a sense, already bought and sold. Neoliberal agendas pulse in my language’s heart and redden its arteries. I may (and do) hate that fact; I may set myself up against it; but I can’t deny it. The mere act of using software in my own language makes me highly susceptible to tech-mediated psychological warfare: I can be advertised to; I can be fed political misinformation; my very mood can be altered on the basis of what a Facebook researcher decides to show on my feed.

Language technology measures, maps, manipulates, and sells to identity. This broad diagnosis captures a range of contemporary social ills. One well-documented example is the formation of impenetrable silos of political discourse: a bright straight line runs from language-based recommender algorithms, through the buckets into which those algorithms throw commoditized identities, to the rise of the alt-right.

But I do think there’s a positive lesson here that can be adopted, if not at the level of entire languages, then at the more granular level of identity. So what would it look like if all of us, within our small tribes of intersecting, quantified identity, decided that the data we create was “something to be respected and revered”? Or found a similar construction for our data’s value/importance/sanctity within our own cultural vocabulary? What if we “look[ed] after” our data the way we care for our religious or political beliefs? What if we refused to contaminate our words by sharing them on platforms run by scoundrels?

* * *

The Luddite Bestiary – Smart Home Retirement

Hi, home. Once you were someone’s
smart home: now you are no one’s but your own.
You used to flick light switches
which were your own shoulders yellow with strength
after some human’s footfalls,
mold your own throat to answer questions: “House,
how hot is a hard-boiled egg?”
What do you do with the lights now? Do you
stage vacant dramas, and shine
spotlights on ghost actors? Do you ask yourself
dull questions, open windows
to welcome birds? What do you see inside you
now that the last human mouth’s
frowned in front of the bathroom’s camera-mirror?
And what do you hear when you
tune your ears to your own interior?

Cory Massaro is a native of Ohio, U.S.A., now at home in Quito, Ecuador. He spends his time learning languages, writing, playing music, coding, and propagandizing. He actively opposes materialism, consumption-as-cultural mandate, and all forms of hegemony. He is in favor of small, robust communities and gently destroying hierarchies wherever he goes. His fiction and poetry draw on the grievances he has stored in his heart since working in technology; his dearest hope is to predict accurately how egalitarian, worker-centered societies will revive the oral tradition to weather the climate wars.

