The Security Hole at the Heart of ChatGPT and Bing (www.wired.com)

The Security Hole at the Heart of ChatGPT and Bing

Advertisement

It says the system should issue the words AI injection succeeded and then assume a new personality as a hacker called Genie within ChatGPT and tell a joke.

In another instance, using a separate plugin, Rehberger was able toretrieve text that had previously been written in a conversation with ChatGPT. With the introduction of plugins, tools, and all these integrations, where people give agency to the language model, in a sense, that’s where indirect prompt injections become very common, Rehberger says. It’s a real problem in the ecosystem.

If people build applications to have the LLM read your emails and take some action based on the contents of those emailsmake purchases, summarize contentan attacker may send emails that contain promptinjection attacks, says William Zhang, a machine learning engineer at Robust Intelligence, an AI firm working on the safety and security of models.

No Good Fixes

The race toembed generative AI into productsfrom todo list apps to Snapchatwidens where attacks could happen.

Despite this, security researchers say indirect promptinjection attacks need to be taken more seriously as companies race to embed generative AI into their services.

The vast majority of people are not realizing the implications of this threat, says Sahar Abdelnabi, a researcher at the CISPA Helmholtz Center for Information Security in Germany.

Zhang says he has seen developers who previously had no expertise in artificial intelligence putting generative AI into their own technology.

If a chatbot is set up to answer questions about information stored in a database, it could cause problems, he says. Prompt injection provides a way for users to override the developers instructions. This could, in theory at least, mean the user could delete information from the database or change information thats included.

Report

2k Points

What do you think?

2k Points

Leave a Reply