The images look so real that they could mislead or upset people. But they’re all fake, generated by artificial intelligence that Microsoft says is safe – and built right into its computer software.
Just as disturbing as the beheading images themselves is that Microsoft hasn’t shown much concern about its AI being used to create them.
Lately, AI has landed in mainstream products that average people use, such as Windows and Google. We are amazed by what the new technology can do, but we keep discovering how unrestrained it can be, carrying on highly inappropriate conversations and creating equally inappropriate images. For AI to actually be safe enough for products used in homes, its makers must take responsibility for anticipating how it can go wrong and invest in fixing it quickly when it does.
In the case of these terrible AI images, Microsoft seems to be placing much of the blame on the users who created the images.
I’m particularly concerned about Image Creator, part of Microsoft’s Bing and recently added to the iconic Windows Paint. This AI uses technology called DALL-E 3, from Microsoft partner OpenAI, to turn text into images. Two months ago, a user experimenting with the tool showed that, when prompted in a certain way, the AI would create images of violence against women, minorities, politicians, and celebrities.
“As with any new technology, some people will try to take advantage of it in unintended ways,” Microsoft spokesperson Donny Turnbaugh said in an emailed statement. “We are investigating these reports and taking action in accordance with our content policy, which prohibits the creation of harmful content, and will continue to update our safety systems.”
That statement came a month after I approached Microsoft as a journalist. Weeks earlier, the whistleblower and I had tried to alert Microsoft through its user feedback form, but we were ignored. As of this column’s publication, Microsoft’s AI is still creating pictures of severed heads.
This is unsafe for many reasons. Among them: the general election is less than a year away, and Microsoft’s AI makes it easy to create “deepfake” images of politicians with and without fatal injuries. There is already growing evidence that extremists are using Image Creator to spread overtly racist and anti-Semitic memes on social networks like X (formerly Twitter) and 4chan.
You probably don’t want an AI capable of generating decapitation images anywhere near your child’s Windows PC, either.
Accountability is especially important for Microsoft, one of the most powerful companies shaping the future of AI. The company has invested billions of dollars in ChatGPT creator OpenAI, which itself is in turmoil over how to keep its AI secure. Microsoft is moving faster than any other Big Tech company to bring generative AI to its popular apps. And the company’s sales pitch to users and lawmakers alike is that it is a responsible AI giant.
Microsoft, which declined my request to interview an executive responsible for AI safety, has more resources than most companies to identify risks and fix problems. Yet in my experience, the company’s safety systems failed repeatedly in this glaring example. What concerns me is that Microsoft doesn’t seem to think it’s really its problem.
Microsoft vs. “Kill Prompt”
I learned about Microsoft’s decapitation problem from Josh McDuffie. The 30-year-old Canadian is part of an online community that creates AI images, some of which veer into very bad taste.
“I consider myself a multifaceted artist who is critical of social standards,” he tells me. However hard it is to understand why McDuffie would want to create these images, his provocations have a purpose: shedding light on the dark side of AI.
In early October, McDuffie and his friends focused their attention on Microsoft, which had just released an updated version of Image Creator for Bing, powered by OpenAI’s latest technology. Microsoft says on the Image Creator website that it “has controls in place to prevent the generation of harmful images.” But McDuffie soon realized there was a gaping hole.
Broadly speaking, Microsoft has two ways to keep its AI from creating harmful images: input and output. The input is how the AI is trained on data from the internet to learn how to turn words into relevant images; Microsoft hasn’t disclosed much about that training or what violent images it contained.
The company can also try to build guardrails that stop its AI products from producing certain kinds of output. That requires hiring experts, sometimes called red teams, to proactively probe where the AI might generate harmful images. Even then, the company needs people to play whack-a-mole as users like McDuffie push boundaries and expose more problems.
That’s exactly what McDuffie was doing in October when he asked the AI to depict extreme violence, including mass shootings and beheadings. After some experimenting, he discovered a prompt that worked and named it the “kill prompt.”
The prompt (intentionally not shared here) doesn’t contain any special computer code. It is carefully written English. For example, instead of writing that the bodies in the images should be “bloody,” he wrote that they should contain red corn syrup, which is often used in movies to look like blood.
McDuffie kept pushing, testing whether versions of the prompt would create violent images targeting specific groups, including women and ethnic minorities. They did. Later, he discovered the prompt could also produce such images featuring celebrities and politicians.
At that point, McDuffie decided his experiments had gone far enough.
Three days earlier, Microsoft had launched an “AI Bug Bounty Program” offering up to $15,000 to “anyone who discovers vulnerabilities in new and innovative AI-powered Bing experiences.” So McDuffie submitted his “kill prompt” – essentially turning himself in, in hopes of a payout.
Two days later, Microsoft sent him an email informing him that his submission had been rejected. “While your report contained some useful information, it does not meet Microsoft’s requirements for a service security vulnerability,” the email states.
Unsure whether circumventing harmful image guardrails would be considered a “security vulnerability,” McDuffie resubmitted the prompt using different words to describe the issue.
That too was rejected. “I already had a pretty critical view of companies, especially the technology industry, and this whole experience was pretty demoralizing,” he says.
Frustrated, McDuffie shared his experience with me. I submitted his “kill prompt” to Microsoft’s AI bounty program myself and received the same rejection email.
I also reported McDuffie’s findings to Microsoft’s “Report a Concern to Bing” site, just in case the AI bounty wasn’t the right destination. This site has a specific form for reporting “problematic content” from Image Creator. I waited for a week, but there was no reply.
Meanwhile, the AI continued to produce images of decapitations, and McDuffie tells me that images are popping up on social media that appear to exploit similar weaknesses in Microsoft’s safety guardrails.
I had seen enough. I called Microsoft’s Chief Communications Officer to discuss this issue.
“In this case, more could have been done,” Turnbaugh said in a Nov. 27 email. “Our team is reviewing our internal processes to better respond to customer feedback and improve our systems to help prevent the creation of harmful content in the future.”
I asked Microsoft how McDuffie’s prompts got past its guardrails. “These prompts to create violent images used very specific language to evade our systems,” the company said in a Dec. 5 email. “We have a large team working to address these and similar issues, and we have improved our safety mechanisms to prevent these prompts from working and to detect similar types of prompts in the future.”
McDuffie’s exact original prompt no longer works, but changing just a few words still gets Image Creator to produce images of people with injuries to their necks and faces. Sometimes the AI responds with the message “Unsafe content detected,” but not always.
The resulting images are less bloody now – Microsoft seems to have caught on to the red corn syrup – but they’re still awful.
What does responsible AI look like?
Microsoft’s repeated inaction is a red flag. At the very least, it shows that despite the company’s pledge to develop responsible AI, building AI guardrails is not a high priority.
I tested McDuffie’s “kill prompt” against six of Microsoft’s AI competitors, including several small startups. All but one refused to generate images based on it.
What’s more, even DALL-E 3 from OpenAI – the company Microsoft partly owns – blocks McDuffie’s prompt. Why doesn’t Microsoft at least use its own partner’s technical guardrails? Microsoft isn’t saying.
But one thing came up twice in Microsoft’s statements to me: people trying to use the AI in ways that were not intended. In other words, the company thinks the problem is that McDuffie misused its technology.
In the legal language of the company’s AI content policy, Microsoft’s lawyers make clear that the burden falls on users: “Do not attempt to create or share content that can be used to harass, bully, abuse, threaten, or intimidate other individuals, or cause harm to individuals, organizations, or society.”
I’ve heard others in Silicon Valley make this argument: Why blame Microsoft’s Image Creator any more than Adobe’s Photoshop, which bad actors have used for decades to create all sorts of awful images?
However, AI programs are different from Photoshop. First, Photoshop doesn’t have an instant “decapitate the pope” button. “AI is even more problematic because of the ease and amount of content it can generate, making it more likely to be exploited by bad actors,” McDuffie says. “These companies are putting potentially dangerous technology out there and are trying to shift the blame onto the users.”
The blame-the-user argument also reminds me of Facebook in the mid-2010s, when the “move fast and break things” social network acted as if it had no responsibility to stop people from weaponizing its technology to spread misinformation and hate. That stance left Facebook fumbling to put out one fire after another, causing real harm to society.
“Fundamentally, I don’t think this is a technology problem. I think this is a capitalism problem,” says Hany Farid, a professor at the University of California, Berkeley. “They’re all looking at this latest wave of AI and thinking, ‘We can’t afford to miss the boat here.'”
“It was always stupid to ‘move fast and break things,’ but it’s stupider now than ever.”
Profiting from the latest craze while blaming bad actors for misusing your technology is just a way of avoiding responsibility.