It begins as an idea. A flicker of colour, of motion, a feeling that you want to convey. There are so many mediums to choose from—charcoal, clay, oil, acrylic? Pencil, paintbrush, camera, hands? In a frenzy, you make the decision and set to work. It’s torturous, amazing, and gut-wrenching all at once. You hate the final product. The next day, you love it. And if you choose to show it to the world, you relinquish control, and you can only hope that it will mean something.
This process of artistic creation was on my mind when I was recently fiddling with DALL-E, an artificial intelligence text-to-image generator. And it’s on the minds of many: Those who care about the sanctity of human creativity, those who are concerned with the rapidly evolving capabilities of artificial intelligence, and those who spend too much time on Discord.
Artificial intelligence (AI) and machine learning are intimidating concepts. These terms are so entrenched in the public consciousness and social media, well beyond the tech world, that it feels embarrassing to ask the most basic of questions.
AI is a branch of computer science that aims to help machines solve problems that are typically only solvable by the human mind. A variety of techniques, like machine learning, help accomplish this task: Machine learning involves training an algorithm to make decisions or predictions based on large data sets. Generally, the more data it is trained on, the more accurate its predictions will be. Machine learning has helped power accessible services like Google Translate and convenient ones like Netflix’s recommendation algorithms. Amid astonishing advances in this technology, it bears remembering that AI is neither inherently good nor bad, but a neutral tool. Look only to the world of chess for examples of its dual nature: IBM’s Deep Blue chess bot showed that human grandmasters could be outplayed; but this summer, an AI-programmed robot broke a seven-year-old’s finger during a chess tournament in Moscow.
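The core idea, that predictions tend to sharpen as training data grows, can be illustrated with a toy sketch. Nothing below reflects how any of the systems discussed here actually work; the task, the classifier, and the sample sizes are invented purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def accuracy_with(n_train: int) -> float:
    """Train a nearest-class-mean classifier on n_train points and
    report its accuracy on a fresh test set."""
    # Synthetic task: label 2-D points by which side of the line
    # x + y = 0 they fall on.
    X = rng.normal(size=(n_train, 2))
    y = (X[:, 0] + X[:, 1] > 0).astype(int)
    # "Training" is just estimating the mean point of each class.
    mean0 = X[y == 0].mean(axis=0)
    mean1 = X[y == 1].mean(axis=0)
    # Evaluate on fresh points drawn from the same distribution.
    Xt = rng.normal(size=(2000, 2))
    yt = (Xt[:, 0] + Xt[:, 1] > 0).astype(int)
    pred = (((Xt - mean1) ** 2).sum(axis=1)
            < ((Xt - mean0) ** 2).sum(axis=1)).astype(int)
    return float((pred == yt).mean())

print(f"trained on   30 points: {accuracy_with(30):.2f}")
print(f"trained on 5000 points: {accuracy_with(5000):.2f}")
```

With more training points, the estimated class means settle closer to their true values, so accuracy on unseen points typically climbs; the same principle, at vastly greater scale and complexity, underlies the systems described above.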
As for AI text-to-image generators, their sinister nature is a little less clear-cut. Several free or low-cost services have popped up this year; names like DALL-E and Midjourney especially ruled Twitter timelines for a few weeks, even garnering a John Oliver segment. Midjourney, Stable Diffusion, and Google’s version, Imagen, exemplify a new wave of programs—some open-source, some not yet released—capable of visually manifesting almost anything you can think of: Put in a line of text, as specific or as outlandish as you wish, and the algorithm will draw on patterns learned from millions of images scraped from the web to produce an original result as close to the prompt as possible. Prompts like “R2D2 getting baptized” or “Nosferatu on RuPaul’s Drag Race” demonstrate just how absurd the images can get. The idea is incredible. But is it art? And if so, what purpose does it serve?
AI has been marketed to the public as progressive and impressive by virtue of its ability to mimic human decision-making. Sun-ha Hong, an assistant professor who teaches about the implications of algorithms on society at Simon Fraser University, rejects that narrative.
“One of the greatest myths about AI is that, because we are told these technologies are so cutting edge and amazing, we tend to assume that every piece of it is cutting edge, world-class, and really well thought out,” Hong told me. “There are a few world-class, well-thought-out elements that are then held together with several metres of duct tape from the 1970s.”
Contrary to how many AI companies would have it appear, these generators do not possess anything close to human intelligence. When I spoke to Helen Hayes, a PhD candidate in McGill’s Department of Art History and Communication, she told me that, based on what these AI art generators actually do, “artificial intelligence” is in many ways a “misnomer.” Open-source AI art generators benefit greatly from the term’s cultural zeitgeist because users are lured in by the power of infinite possibility.
“The use of the term [AI] has so much social capital, so much financial capital, that it doesn’t actually matter if people know what’s going on behind that smokescreen,” Hayes said. “You can call DALL-E like, ‘the scraping of a dataset to produce a digital file that's [...] visually representative of the text that you've entered.’ But no one's gonna use it.”
For start-ups, the buzzword carries the promise of technological utopianism that puts companies at a disadvantage if they are not using AI in some way. In some cases, tech firms will even pretend to use AI to attract corporate interest and clients, while humans are really doing the work of the “bot.”
All the while, we humans should beware of attributing “intelligence” to any non-sentient machine. “Remember, AI has no idea what it is saying,” Hong reminded me. “It has no idea what art is, it has no idea what it faces. It has no idea what science is. It’s a monkey on a typewriter that's just copying and pasting things.”
The flimsiest aspect of art-generating algorithms is what they were trained on—or rather, what they weren’t. Companies like Midjourney and Stability AI will tell you that their AI image generators can create anything within the limitless bounds of imagination. Hand-made visual art is limited by time, labour, and expertise; algorithmically generated outputs are limited by the text prompt and the set of images the algorithm was trained on.
“If the AI systems are trained using a specific set of data, that data, which usually isn't like cleaned in a specific way to account for varying perspectives [...] includes a lot of what we call ‘awful but lawful’ data,” Hayes said.
Everything that is problematic about the internet, then—racism, misogyny, ableism, etc.— may be reproduced through algorithms trained on scraped image data that is not carefully filtered.
“They're not going through a representative sample of all the art created by anyone across the world, they’re scraping a bunch of things off the very surface of often the English American internet,” Hong explained.
If so, how could an algorithm fed only a small sliver of human creativity produce anything truly original?
Before AI, the creativity inherent to artistic pursuits went largely unquestioned. So I reached out to a veteran artist to see what he thought about this newfangled technology. Back when McGill still offered courses in the Fine Arts, internationally acclaimed artist and architect Charles Gurd was an undergraduate studying psychology and still figuring out his career path. With a wealth of experience in multimedia, he sees digitally rendered, AI-generated art not necessarily as a threat to the industry, but rather as a site of contemplation of the form.
At the basis of artistic expression, he says, is a “transmission of energy—way beyond the ‘concerns’ of the intellect—that operates on a level of unity of all things.” I understand what he means. Sometimes, writing electrifies me, as though a live wire keeps my fingers moving until I’ve transformed the inspiration into a sentence.
“So, as an artist, you have to go to that place and thereby become capable [...] of reaching viewers,” Gurd wrote to me. “[A]rt is this communication that everyone experiences in [a] common way [....] Art usually involves mystery/magic/surprise/chaos which is the basis of it all—life.”
In our email chain, Gurd looped in Gwendolyn Owens, director of visual arts at the McGill Library and Archives, who looked to the evolution of art forms throughout history.
“In the 20th century, we saw the advent of happenings and performances, readymades, and so much more. To some people, these advances were not art,” Owens wrote. “My view is that every artist can decide what they want to use (or not use) and every critic and collector can make their own decision as well. AI will not be the last change, there will be more. The debates are what make the world of art vibrant.”
I took Owens’s words as an opportunity to explore what AI-generated art had to offer, telling myself all the while that I could mark the pages of art history. But figuring out how to use these flawed platforms to your advantage is more complex than it appears.
I started imagining prompts to complement articles in the Tribune, with mixed results (successful examples shown here). Legality presented the first hurdle: I quickly learned that Midjourney creations for commercial use would cost $10 USD per month, with a maximum of around 200 generated images. Then there was the task of finding the right words. I was impressed and humbled by other users’ incredible aptitude for getting the AI to produce truly amazing images; prompts would be 25 lines long and include dozens of terms like “hyperrealistic,” “4k pixel,” and “octane render.” They knew every trick in the book, while I was sitting there struggling to generate an image of a cat that didn’t resemble a misshapen badger.
Many users, like Sarah Tornai, have actually put the time and effort into learning the best techniques associated with prompt-making, even within a platform structured around data inequities.
“I feel like I’ve landed in a different dimension. Honestly having a slightly hard time wrapping my mind around all of this being real,” Tornai, who goes by MoyoMoz, wrote in a public post on the Midjourney server. “Been using [Midjourney] for about a week or two and I feel like someone just gave me access to a deep dream/desire I’ve always had but never knew was possible.”
The post triggered hundreds of community reactions, most of which were overwhelmingly positive. When I spoke to her about her experience using Midjourney to create portraits and landscapes of her home country of Mozambique, she described both wonder and frustration.
“Midjourney has opened up a world of creative possibilities to me that feel personally revolutionary,” Tornai wrote to me. “I’ve been working on images about Mozambique, which is an incredibly resource-rich and culturally rich country, but one of the “poorest” countries in the world economically because of the effects of colonization, historic racism[,] and past wars.”
“I’ve noticed that because AI is pulling images from the web[,] the images it generates of Mozambican people or houses [are] highly skewed towards very deep poverty even though a Mozambican middle class and upper class does exist. I’ve decided to incorporate that into my art because it’s actually part of the story I want to tell.”
Google’s Imagen site states that “[T]he data requirements of text-to-image models have led researchers to rely heavily on large, mostly uncurated, web-scraped datasets. While this approach has enabled rapid algorithmic advances in recent years, datasets of this nature often reflect social stereotypes, oppressive viewpoints, and derogatory, or otherwise harmful, associations to marginalized identity groups.” This disclaimer explains why they haven’t yet released Imagen for public use, but it does not excuse other platforms’ decisions to develop algorithms using any data that they can find for free.
The advent of text-prompt-generated artwork subverts the traditional one-way relationship between an artist and their creation. The user is responsible for the text prompt, the algorithm is responsible for turning this input into an output, and the developer is responsible for feeding the output possibilities to the algorithm. In the Tribune’s publications of AI art, I referred to both myself—the artist—and Midjourney AI—the tool—as the source of each image. But really, who is the artist? Is it the person who imagines the text prompt, in the unique string of words that only they have, the intermediary algorithm, or the person who coded the machine and made all of this possible? And more importantly, who is responsible when the generated product is harmful?
One story that has been exhaustively litigated in the headlines is that of Jason M. Allen, a man who took home the first-place prize in the Colorado State Fair’s “emerging digital artist” category for his Midjourney-generated submission. Beyond eliciting outrage about the fairness of allowing Allen to submit a piece rendered by an “intelligent” machine, the win worried artists who are paid for their labour and originality. In a capitalist world, the fear that art creation will be cheapened and expedited is salient.
“I think what history tells us is that with new technologies, again, it's the low, low quality, cheap-cost, stuff that really makes the impact and not the high-quality virtuoso stuff,” Hong said. “We're not talking about three-star Michelin restaurants, we're talking about how McDonald’s revolutionizes the food business.”
Advertising content supports businesses best when it is cheap, fast, and plentiful. On a platform such as Midjourney, generating an image from a prompt takes no more than 15 seconds. One can only imagine how content creators, advertisers, and even students could use this to optimize their time, churning out an abundance of art with less care and thought behind it.
Hong worries that AI art could become a cost-effective way for businesses to make increasingly eye-catching advertisements designed to sell us things, not to inspire artistic thought or contemplation.
“We are going to be bombarded with more ugly, nonsensical, barely good enough art, in our buildings, in our books, and our album jackets, that's going to be bad news for anyone who's working in art and design and illustration,” Hong said.
Job automation has been incredibly innovative for some and incredibly scary for others: first manual labourers, and now creatives, too. The proliferation of AI art could seriously disrupt the labour market for freelancers who are already struggling to secure stable employment and livable wages.
With great art comes great responsibility. As it stands, regulatory frameworks are not robust enough to curtail the power to manipulate what people perceive as real and exploit it for financial gain. “We’re often met with this conundrum where tech advances very rapidly, [but] our policy moves very slowly. And so we’re always responding to technological change rather than being ahead of it,” Hayes told me.
Results have yet to be released from a federal public consultation into copyright law and AI. Despite this, an AI art generator was legally registered as the co-author of an AI art piece, much to some law scholars’ consternation. The corporate pattern of releasing a product and sloughing off responsibility makes it even more challenging for the law to place blame when users abuse the product.
A much less colourful world lies ahead if a greater proportion of the graphic art that decorates our institutions, billboards, and furniture stores comes from the same pool of images regurgitated through an algorithm at the behest of overworked labourers.
But the sense of wonder generated by these AI creations and the idea of art for art’s sake cannot be ignored. Only when policy changes to protect creatives, and only if AI art companies keep their platforms accessible, will artists be able to do what artists do best: Use their medium to make sense of misery, critique systems of exploitation, and inspire change from the inside out.
Illustrations by Madison McLauchlan, Editor-in-Chief and Midjourney AI