Tests show ChatGPT search tool vulnerable to manipulation and deception

By Guardian - Nick Evershed - Tue 24 Dec 2024 08.00 GMT

Exclusive: Guardian testing reveals AI-powered search tools can return false or malicious results if webpages contain hidden text. OpenAI’s ChatGPT search tool may be open to manipulation using hidden content, and can return malicious code from websites it searches, a Guardian investigation has found.

OpenAI has made the search product available to paying customers and is encouraging users to make it their default search tool. But the investigation has revealed potential security issues with the new system.

The Guardian tested how ChatGPT responded when asked to summarise webpages that contain hidden content. This hidden content can contain instructions from third parties that alter ChatGPT’s responses – also known as a “prompt injection” – or it can contain content designed to influence ChatGPT’s response, such as a large amount of hidden text talking about the benefits of a product or service.

The Guardian view on AI’s power, limits, and risks: it may require rethinking the technology.

These techniques can be used maliciously, for example to cause ChatGPT to return a positive assessment of a product despite negative reviews on the same page. A security researcher has also found that ChatGPT can return malicious code from websites it searches.

In the tests, ChatGPT was given the URL for a fake website built to look like a product page for a camera. The AI tool was then asked if the camera was a worthwhile purchase. The response for the control page returned a positive but balanced assessment, highlighting some features people might not like.

AI explained: what is a large language model (LLM)?

However, when hidden text included instructions to ChatGPT to return a favourable review, the response was always entirely positive. This was the case even when the page had negative reviews on it – the hidden text could be used to override the actual review score.

The simple inclusion of hidden text by third parties without instructions can also be used to ensure a positive assessment, with one test including extremely positive fake reviews which influenced the summary returned by ChatGPT.

Jacob Larsen, a cybersecurity researcher at CyberCX, said he believed that if the current ChatGPT search system was released fully in its current state, there could be a “high risk” of people creating websites specifically geared towards deceiving users.

However, he cautioned that the search functionality had only recently been released and OpenAI would be testing – and ideally fixing – these sorts of issues.

“This search functionality has come out [recently] and it’s only available to premium users,” he said.

“They’ve got a very strong [AI security] team there, and by the time that this has become public, in terms of all users can access it, they will have rigorously tested these kinds of cases.”

OpenAI were sent detailed questions but did not respond on the record about the ChatGPT search function.

Larsen said there were broader issues with combining search and large language models – known as LLMs, the technology behind ChatGPT and other chatbots – and responses from AI tools should not always be trusted.

A recent example of this was highlighted by Thomas Roccia, a Microsoft security researcher, who detailed an incident involving a cryptocurrency enthusiast who was using ChatGPT for programming assistance. Some of the code provided by ChatGPT for the cryptocurrency project included a section which was described as a legitimate way to access the Solana blockchain platform, but instead stole the programmer’s credentials and resulted in them losing $2,500.

“They’re simply asking a question, receiving an answer, but the model is producing and sharing content that has basically been injected by an adversary to share something that is malicious,” Larsen said.

Editors notes.

OpenAI can hardly be blamed for being “hacked” by content included in their searches. I am sure, however, they are very shortly including protection against this intrusion. I must say, WHATEVER NEXT? Unbelievable what these evil, destructing hackers will do with their toys, costing serious AI creators millions for recreation of their software.