Package hallucinations represent a common issue within code-generating Large Language Models (LLMs) that opens the door for a new type of supply chain attack, researchers from three US universities warn.
Package hallucination occurs when the code generated by an LLM recommends or references a fictitious package; the attack that exploits this behavior has been dubbed ‘slopsquatting’.
Researchers from the University of Texas at San Antonio, University of Oklahoma, and Virginia Tech warn that threat actors can exploit this by publishing malicious packages with the hallucinated names.
“As other unsuspecting and trusting LLM users are subsequently recommended the same fictitious package in their generated code, they end up downloading the adversary-created malicious package, resulting in a successful compromise,” the academics explain in a recently published research paper (PDF).
Considered a variation of the classical package confusion attack, slopsquatting could lead to the compromise of an entire codebase or software dependency chain, as any code relying on the malicious package could end up being infected.
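As a rough illustration of the exposure (not taken from the paper), a minimal pre-install check can screen package names pulled from LLM-generated code against the public registry. The sketch below assumes Python and PyPI’s JSON API at https://pypi.org/pypi/&lt;name&gt;/json, which returns HTTP 404 for unregistered names.

```python
# Hypothetical sketch, not from the paper: screen package names suggested by an
# LLM against the public PyPI index before installing anything. Uses PyPI's
# JSON API (https://pypi.org/pypi/<name>/json), which returns HTTP 404 for
# names that are not registered.
import urllib.error
import urllib.request


def package_exists_on_pypi(name: str) -> bool:
    """Return True if `name` is a registered PyPI project."""
    url = f"https://pypi.org/pypi/{name}/json"
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status == 200
    except urllib.error.HTTPError:
        return False  # 404: not registered, possibly a hallucinated name


# Hypothetical names standing in for packages extracted from LLM-generated code.
for pkg in ["requests", "totally-made-up-helper-lib"]:
    verdict = "exists" if package_exists_on_pypi(pkg) else "NOT on PyPI"
    print(f"{pkg}: {verdict}")
```

Note that an existence check only catches hallucinated names before an attacker registers them; once a malicious package has been published under that name, the lookup succeeds, so dependency review and provenance checks remain necessary.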
The academics show that, of the 16 popular code-generating LLMs they tested, none was free of package hallucination. Overall, the models generated 205,474 unique fictitious package names, and most of the hallucinated packages (81%) were unique to the model that produced them.
For commercial models, hallucinated packages occurred in at least 5.2% of cases, and the percentage jumped to 21.7% for open source models. These hallucinations also tend to persist within the same model: 58% of the hallucinated packages were repeated within 10 iterations of the same prompt.
The academics also warn that, while the risk of LLMs recommending malicious or typosquatted packages has been documented before, it was deemed low, and the threat posed by package hallucinations had not been examined. With the rapid evolution of Gen-AI, the risks linked to its use have also escalated.
The researchers conducted 30 tests (16 for Python and 14 for JavaScript) to produce a total of 576,000 code samples. During evaluation, the models were prompted twice for package names, resulting in a total of 1,152,000 package prompts.
“These 30 tests generated a total of 2.23 million packages in response to our prompts, of which 440,445 (19.7%) were determined to be hallucinations, including 205,474 unique non-existent packages,” the academics note.
The academics also discovered that the models were able to detect most of their own hallucinations, which would imply that “each model’s specific error patterns are detectable by the same mechanisms that generate them, suggesting an inherent self-regulatory capability.”
To mitigate package hallucination, the researchers propose prompt engineering methods such as Retrieval Augmented Generation (RAG), self-refinement, and prompt tuning, along with model development techniques such as decoding strategies or supervised fine-tuning of LLMs.
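In the same spirit, a minimal self-refinement loop (an assumption-laden illustration, not the paper’s implementation) would validate every package referenced in generated code against a known registry index and re-prompt the model when it invents one. In the sketch below, generate_code is a hypothetical stand-in for whatever LLM client is in use, and KNOWN_PACKAGES for a locally cached copy of the PyPI index.

```python
# Rough sketch of a self-refinement loop in the spirit of the proposed
# mitigations: packages referenced in generated code are checked against a
# known registry index, and the model is re-prompted to remove anything it
# invented. `generate_code` and KNOWN_PACKAGES are hypothetical stand-ins,
# not part of any real API or of the paper's implementation.
import re
from typing import Callable

# e.g. loaded from a cached dump of the PyPI index
KNOWN_PACKAGES = {"requests", "numpy", "flask"}


def extract_imports(code: str) -> set[str]:
    """Crude extraction of top-level module names from generated Python code."""
    return set(re.findall(r"^\s*(?:import|from)\s+([A-Za-z0-9_]+)", code, re.MULTILINE))


def generate_with_validation(prompt: str,
                             generate_code: Callable[[str], str],
                             max_rounds: int = 3) -> str:
    """Query the model and re-prompt it until it references only known packages."""
    refinement = prompt
    for _ in range(max_rounds):
        code = generate_code(refinement)
        unknown = extract_imports(code) - KNOWN_PACKAGES
        if not unknown:
            return code  # every referenced package is present in the index
        # Self-refinement step: feed the unrecognized names back to the model.
        refinement = (
            f"{prompt}\n\nThe packages {sorted(unknown)} do not exist on PyPI. "
            "Rewrite the code using only real, widely used packages."
        )
    raise RuntimeError("model kept referencing packages missing from the index")
```

A crude import scan like this also conflates module names with PyPI distribution names (for example, the yaml module is shipped by the PyYAML package), so a production version would need a proper mapping between the two.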
Related: AI Now Outsmarts Humans in Spear Phishing, Analysis Shows
Related: Vulnerabilities Expose Jan AI Systems to Remote Manipulation
Related: Google Pushing ‘Sec-Gemini’ AI Model for Threat-Intel Workflows
Related: Microsoft Bets $10,000 on Prompt Injection Protections of LLM Email Client