Machine-learning programs that can produce sometimes jaw-dropping images from brief text prompts have advanced in a matter of months from a “that’s quite a trick” stage to a genuine cultural disruption.
Why it matters: These new AI capabilities confront the world with a mountain of questions over the rights to the images the programs learned from, the likelihood they will be used to spread falsehoods and hate, the ownership of their output and the nature of creativity itself.
- Their rapid evolution has also raised concerns over the future of jobs for graphic designers, illustrators and anyone else whose income is tied to the production of images.
Driving the news: OpenAI’s Dall-E 2 kicked this revolution off earlier this year, as programmers, journalists and artists granted early access to the program flooded social media with examples of its work.
- Last month, another program, Stability AI’s Stable Diffusion, arrived, offering a similar level of image-making prowess with far fewer restrictions.
Between the lines: Where Dall-E 2 is owned and controlled by OpenAI, Stable Diffusion is open source, meaning anyone with a little skill can download and run it on their own systems.
- That means that, while Dall-E 2 is now beginning to charge users, Stable Diffusion is essentially free.
- Where OpenAI made some effort to limit questionable uses of its program (it says it has tried to reduce racial and gender biases in its training data, and it blocks users from creating violent images, porn and celebrity pix), Stability AI is taking a mostly “anything goes” approach.
- Stability AI CEO Emad Mostaque told TechCrunch the program’s training data was “based on general web crawls and are designed to represent the collective imagery of humanity… Aside from illegal content, there is minimal filtering, and it is on the user to use it as they will.”
- “This is amazing technology that can transform humanity for the better and should be open infrastructure for all,” he added.
Our thought bubble: Multiple generations of digital tech have been rushed into large-scale use only to cause a wide range of social woes. AI was supposed to be different, but Stable Diffusion’s “damn the torpedoes” approach suggests an “if it can be done, it will be done” mindset will prevail again.
- That means we will have to cope with the reality-eroding impact of increasingly realistic fabrications sooner rather than later.
Catch up quick: The new image-producing AIs don’t cut and paste from existing images. Instead they use a technique called “diffusion,” which starts from a random dot pattern and refines it step by step until the result matches the text prompt.
- A related approach, though not diffusion itself, lies behind the strikingly realistic texts and conversations generated by OpenAI’s GPT-3 program, which builds text by repeatedly predicting the next word.
- All these pattern-matching machine learning programs need to be “trained” on vast pools of data.
- What’s in all that data, where it came from, what problems and biases it might contain, and whether anyone should be compensated for its use are all questions that both devotees and critics of these programs are now asking.
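For readers who want to see the “refine random dots step by step” idea in miniature, here is a toy sketch in Python. It is not how Stable Diffusion or Dall-E 2 actually work: a real diffusion model uses a trained neural network, guided by the text prompt, to decide what noise to remove at each step. In this sketch a fixed, hypothetical eight-pixel `target` stands in for that learned knowledge, and the names `toy_denoise` and `max_error` are made up for illustration.

```python
import random


def toy_denoise(start, target, steps=50, rate=0.2):
    """Iteratively nudge a noisy list of pixel values toward a target pattern.

    Toy stand-in only: the fixed `target` replaces the trained network
    that a real diffusion model would consult at every step.
    """
    pixels = list(start)
    for _ in range(steps):
        # Each pass closes a fraction (`rate`) of the remaining gap.
        pixels = [p + rate * (t - p) for p, t in zip(pixels, target)]
    return pixels


def max_error(pixels, target):
    """Largest per-pixel distance between the current image and the target."""
    return max(abs(p - t) for p, t in zip(pixels, target))


rng = random.Random(0)  # fixed seed so runs are repeatable
target = [0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 1.0, 0.0]  # hypothetical "pattern"
noise = [rng.random() for _ in target]  # the random dot pattern to start from
result = toy_denoise(noise, target)

print("error before:", max_error(noise, target))
print("error after: ", max_error(result, target))
```

The point of the sketch is only the loop: many small corrections turn noise into a recognizable pattern, which is why no existing image is copied wholesale.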
It’s not always easy to figure out what images a particular AI “learned” from, but blogger and XOXO Festival co-creator Andy Baio worked with programmer Simon Willison to create a tool for exploring one small portion of the data Stability AI used to train Stable Diffusion.
- The set seems to contain a considerable volume of copyrighted material and original artworks by artists who had no idea their work was being used in this way.
Meanwhile, art competitions and sites devoted to showcasing artists’ work are grappling with how to cope with a potential flood of AI-generated images, with some online communities beginning to ban them.
Here are three key ethics questions around this new software explosion, courtesy of Baio:
- “Is it ethical to train an AI on a huge corpus of copyrighted creative work, without permission or attribution?”
- “Is it ethical to allow people to generate new work in the styles of the photographers, illustrators, and designers without compensating them?”
- “Is it ethical to charge money for that service, built on the work of others?”
Flashback: Technology has been reshaping the arts forever because art-making itself is a kind of technology.
- The introduction of photography in the 19th century didn’t end the work of painters, but it pushed them away from naturalism. Cinema did the same to theater. Recording changed music. Digital special effects continue to transform moviemaking.
- Art, by definition, involves an intentional act of human expression. AI can’t do that by itself. But it can certainly change the economics of the low-end image-production business — stock photo and illustration agencies should buckle up.
What’s next: These programs are likely to keep getting more efficient and less distinguishable from magic. Their collisions with our existing systems of intellectual property law are likely to be epic.