Peeling away the layers of machinery inside the wooden cabinet of the Mechanical Turk reveals ‘merely’ a man, a pearl ensconced within an artificial shell. The Mechanical Turk was an elaborate cabinet with a mannequin attached that posed as a chess-playing machine. Behind a screen of gears and gizmos, a human chess master huddled beneath the board, manipulating the pieces with a magnet. The illusion was convincing enough that it duped luminaries like Napoleon and Benjamin Franklin into believing they were matching wits with a thinking machine. However, the ‘artificial’ intelligence within was only artifice. Almost two centuries later, authentic chess-playing computers exist that are capable of beating all human grandmasters, not just as a fluke, but every time. Chess is a game of logic, though, and computers are nothing if not avatars of logic. Not so man. Man is a thing of passions, driven by forces within to create. Surely, the Arts stand as a bastion for humanity. Enter OpenAI’s revolutionary DALLE-2 to challenge man’s dominance, at least in the realm of creating unique, high-quality artwork.
As the digit in its moniker implies, DALLE-2 is not the first of its kind, but it is the first to make the skeptic search wildly for the man inside the Mechanical Turk. DALLE-2 is an artificial-intelligence-driven Application Programming Interface (API) that allows a business or end-user access to OpenAI’s mechanical magician; enter a few plain-English words in a text prompt, akin to typing a query into a search engine, and ten unique images return. These images are generated using an artificial intelligence technique that connects hundreds of millions of images to their associated text descriptions and compares the images mathematically, isolating the salient features of each by contrast with the others (see a detailed description of the process here). What is so magical about DALLE-2 in comparison to previous AI image-generation ventures is that the images actually look good, to the point where the inattentive could easily mistake them for works by a real artist.
While OpenAI’s own site has plenty of images created by DALLE-2 and the powerful computers to which it connects, the skeptic would again assume that these are only cherry-picked examples of the best it has to offer. For real-world examples of DALLE-2 in action, the place to go is the DALLE-2 subreddit, where users blessed by OpenAI with a coveted invite post the best responses to their text prompts (user-created content, potentially NSFW). At the time of this writing, I see Darth Vader inserted into Grant Wood’s American Gothic, “Singin’ in the Rain” starring velociraptors, and the Statue of Liberty wearing a VR headset, among thousands of other prompts. As a user scrolls through these images, one after another, it begins to dawn that this is a transformative technology just beginning to surface, one that will devour entire fields of human endeavor, quickly and ruthlessly.
Potential early uses are both mundane and profound. Royalty-free stock photographs of just about anything will be generated with ease. Authors will be able to illustrate their works painlessly. Game designers will be able to use this tool to make their vision come to life. Musicians will be able to create album covers. Human artists will use the digitally generated artwork as a base, cleaning up the images and adding detail, speeding the pace of their own work—for a while, at least. What happens when the student exceeds the teacher, as the chess AIs eventually did? Chess is more popular than ever, even with no hope of ever dethroning the machines.
In the not-so-distant future, DALLE-2 and similar tech will likely be able to generate longer, more coherent content. AI will generate entire movies, entire games, entire novels and comic books and television shows.
AI will also be used to create nightmares. OpenAI is wary of the potential for misuse already. In the terms of service for the API are prohibitions against using it for:
- (i) Illegal activities, such as child pornography, gambling, cybercrime, piracy, violating copyright, trademark or other intellectual property laws;
- (ii) Accessing or authorizing anyone to access the APIs from an embargoed country, region, or territory as prohibited by the U.S. government;
- (iii) Threatening, stalking, defaming, defrauding, degrading, victimizing or intimidating anyone for any reason.
Each of these bullet points is the gestating embryo of at least one potential dystopia. Creating photo-realistic images at a whim will lead to a world where you cannot trust what you see, in the same way that you can no longer trust what you read. The implications for the legal system, at the very least, will be catastrophic. When photo-realistic content of any person, engaging in any act imaginable, can be created by anyone, how will this be used by our enemies? Will infinite blackmail material lead to a world of social Mutually-Assured-Destruction, where civility wins out because anyone’s reputation can be annihilated by anyone else? Or will we simply tune it out, ignoring every scandal as fake because we can no longer tell what is real?
OpenAI takes steps to guard against some of these nightmares. Up until today, DALLE-2 would not generate photo-realistic images of people’s faces, instead typically generating a blurry mess. It is not trained on explicitly violent or pornographic content. But many users consider these measures akin to sticking a finger in the hole in the dam. A few months after DALLE-2’s creation, the restrictions are largely holding. However, as more and more users gain access to the system and more and more competitors arise, including Google’s rival model Imagen, it seems like only a matter of time until this new creation spins out of control. During the writing of this article, Google posted new images that rival or even surpass those created by DALLE-2.
Under near-infinite layers of digital accretion still lies that pearl of humanity; human politics and human passions will decide what becomes of DALLE-2 and its kindred. Will OpenAI, co-founded by Elon Musk, dominate the field, deciding which images are appropriate to create, or will it be Google, perhaps bringing with it the same political baggage that it promotes with its “Google Doodles”? These companies are already expressing concerns that the AI is ‘reinforcing bias’, for example by making images of flight attendants female and construction workers male.
You can experiment with AI image generation yourself, immediately, using DALLE Mini. The images you create will not be as jaw-dropping as those from DALLE-2 (DALLE Mini is not made by OpenAI), but they can give you a hint of the power and potential of this fairly new technology. You can also sign up for the waiting list to try DALLE-2 out for yourself.
Jason Winesburg is a Montessori Teacher for students ages 9-12, and wears many hats. Jason moved to Amelia Island last year to enjoy the freedom and the beauty.