AI-Generated Video of Mona Lisa Singing Along to a Rap Song Sparks Strong Online Reactions

The internet has strongly reacted to an artificial intelligence-generated video of the famous subject of the Mona Lisa singing along to a rap that actress Anne Hathaway wrote and performed.

The polarizing clip, which has elicited reactions online ranging from humor to horror, is a demonstration of Microsoft’s new AI technology, VASA-1. The technology can generate lifelike talking faces of virtual characters from a single image and a speech audio clip. The AI can make cartoon characters, photographs, and paintings sing or talk, as shown in examples Microsoft released as part of a blog post published on April 16.

In the most viral clip, the woman in the Mona Lisa painting appears to sing “Paparazzi,” a song Hathaway wrote and performed on The Late Show with David Letterman in 2011, her mouth, eyes and face moving in time with the audio. In another Microsoft clip, an avatar sings, and in others, generated from real photos, people speak on everyday topics.

The videos quickly gained traction online: One post featuring the Mona Lisa, published on April 18 on X, formerly known as Twitter, had garnered seven million views as of Sunday.

“Microsoft just dropped VASA-1. This AI can make single image sing and talk from audio reference expressively. Similar to EMO from Alibaba. 10 wild examples: 1. Mona Lisa rapping Paparazzi” — Min Choi (@minchoi)

The online reactions were swift, strong and wide-ranging. Some enjoyed the clips, with one user saying the Mona Lisa video had them “rolling on (the) floor laughing.” Others were more wary, or even disturbed. “This is wild, freaky, and creepy all at once,” one said. “Another day, another terrifying AI video,” another wrote. “Why does this need to exist? I can’t think of any positives,” one tweet read.

Microsoft’s researchers addressed the risks of the new technology and said they have no plans to release an online demo or product “until we are certain that the technology will be used responsibly and in accordance with proper regulations.”

“It is not intended to create content that is used to mislead or deceive,” the researchers wrote. “However, like other related content generation techniques, it could still potentially be misused for impersonating humans. We are opposed to any behavior to create misleading or harmful contents of real persons, and are interested in applying our technique for advancing forgery detection.”

“While acknowledging the possibility of misuse, it’s imperative to recognize the substantial positive potential of our technique,” they said. “The benefits—such as enhancing educational equity, improving accessibility for individuals with communication challenges, offering companionship or therapeutic support to those in need, among many others—underscore the importance of our research and other related explorations. We are dedicated to developing AI responsibly, with the goal of advancing human well-being.”

The latest development in AI comes as governments around the world work to regulate the technology and legislate against its criminal misuse, such as deepfake pornography, in which the face of an individual is superimposed onto an explicit picture or video without their consent. The issue made headlines earlier this year when explicit AI-generated images of Taylor Swift circulated widely online. In the U.S., while 10 states criminalize deepfakes, federal law does not, and multiple bills aim to close that gap.