Large Language Models (LLMs) like GPT-4 are trained on massive data corpora to generate human-like responses. They have rapidly been adopted in critical sectors—from assisting doctors in healthcare to automating financial analytics in finance, and optimizing workflows in automation. Since the launch of ChatGPT, there has been an explosion of both commercial and open-source LLMs. The latter have taken the world by “surprise” due to the rise of DeepSeek models, which have shocked the AI space (Global, 2025). As organizations integrate LLMs into real-world systems, the development and deployment pipeline of these models has become increasingly complex, involving many components and stakeholders. This end-to-end pipeline is often referred to as the LLM supply chain (Wang et al., 2024 , encompassing everything from data collection and model training to application integration and infrastructure. Given LLMs’ growing importance and reach, it’s crucial to address security risks throughout their supply chains, not just within the model itself. A vulnerability in any segment (be it tampered training data, a backdoored model, or an insecure API integration) can jeopardize the entire system.
For these reasons, let’s break down some of the security concerns that may arise in each stage of the LLM supply chain.

Fig 1: Overview of LLM Supply Chain (Huang et al., 2024)
The stages and interactions of different stakeholders can be seen in Fig. 1; all the components that interact with each other become quite confusing to follow. However, it is imperative to ultimately understand questions like: “How secure is my LLM once it’s integrated into my application?” and “What threats arise in each stage of the LLM supply chain that may impact my deployment and integration?” Thus, the following sections explain potential threats to aid developers, users, and policymakers in considering security risks beyond just the LLM model itself. Bear in mind that the LLM supply chain is complex, and here we provide a simplified view of its main components. Readers are encouraged to consult additional sources for more specialized interests.
Data Threats

LLMs require massive amounts of training data, but this same scale creates unique security exposures:
- Data Poisoning: Attackers alter or insert malicious samples to warp an LLM’s behavior. Even a small fraction of tainted data can introduce hidden backdoors. In practice, tampering with widely used public datasets (e.g. Github, Kaggle, etc) can stealthily embed triggers.
- Data labelling attacks: This could be in the form of label flipping attacks (Li et al., 2023) or clean-label attacks (Liang et al., 2023).
- Filter Bypass: Dataset filters remove profanity or private information, but clever obfuscation of banned content can slip past. Consequently, LLMs may inadvertently learn toxic language or sensitive data.
Mitigation Tips:
- Version control and strict provenance checks of the data
- Routine scanning for irregular patterns or injected triggers
- Layered data cleaning, including automated tools and manual spot checks
Model Threats

After data is collected, foundation models are particularly susceptible to stealthy compromises such as (but not limited to):
- Backdoor Insertion: Attackers with pipeline access can inject code or tweak the loss function so the model behaves normally except under a secret trigger (Qiang et al., 2024).
- Pre-training: During this stage threats like embedding malicious behavior in the LLM during pretraining, or extracting training data by analyzing gradients (gradient leakage attacks)during distributed model trainig .
- Quantization Attacks: Model compression (e.g., 8-bit quantization) can activate hidden malicious behaviors. In one demonstration, a model functioned innocuously at full precision but became hostile when quantized (Egashira et al., 2024).
Mitigation Tips:
- Strictly limit and log access to training pipelines and pretraining artifacts
- Verify model checkpoints (e.g., checksums) before and after training
- Use specialized “backdoor detection” methods to spot anomalous triggers
Application Threats

At deployment, the LLM is integrated into real-world applications and user-facing interfaces, creating multiple new attack surfaces, examples include:
- Prompt Injection & Jailbreaking: Clever inputs can override system prompts or policies, causing models to reveal secrets or produce disallowed outputs (Lakera, 2024). Attackers hide malicious commands in user prompts or external content, effectively “hijacking” the model’s instructions.
- API Security & Authentication Gaps: Many LLMs operate via cloud APIs. If API keys or credentials are compromised, attackers can query or reconfigure the model at will. Poor session handling can also leak private user data.
- Orchestration & Plugin Exploits: Advanced setups let LLMs call external tools (e.g., code executors or web browsers). As orchestrators such as LlamaIndex (or LangChain) are commonly used to build RAG applications, the ability to retrieve external data and use various plugins opens new avenues of attack for adversaries. Even a hijacked prompt can instruct the LLM to perform dangerous actions if these tools are not sandboxed or monitored.
Mitigation Tips:
- Sanitize or mediate all user input to reduce injection risk
- Never rely solely on model-based access checks, enforce robust authentication outside the LLM
- Carefully sandbox or restrict LLM access to critical external resources
Conclusions
LLMs offer substantial benefits across industries, but they bring novel security challenges. Attackers can exploit anything from data ingestion to model compression or plugin orchestration. Preventative measures like backdoor detection, robust authentication, and secure DevOps are key to mitigating these emerging threats. By embracing a comprehensive, multi-layered security strategy, we can foster trust in the powerful technology behind LLMs and ensure these models continue to drive innovation safely.
“When even a single link in the chain of innovation lies unguarded, we open the gates to the forces of mischief.”
— Jose Luna —
References:
- Global, C. (2025, February 14). How DeepSeek is redefining the future of AI. ThinkChina – Big Reads, Opinion & Columns on China. https://www.thinkchina.sg/technology/how-deepseek-redefining-future-ai
- Wang, S., Zhao, Y., Hou, X., & Wang, H. (2024). Large Language Model Supply Chain: a research agenda. ACM Transactions on Software Engineering and Methodology. https://doi.org/10.1145/3708531
- Huang, K., Chen, B., Lu, Y., Wu, S., Wang, D., Huang, Y., Jiang, H., Zhou, Z., Cao, J., & Peng, X. (2024, October 28). Lifting the veil on the large language model Supply chain: composition, risks, and mitigations. arXiv.org. https://arxiv.org/abs/2410.21218
- Li, Q., Wang, X., Wang, F., & Wang, C. (2023). A label flipping attack on machine learning model and its defense mechanism. In Lecture notes in computer science (pp. 490–506). https://doi.org/10.1007/978-3-031-22677-9_26
- Liang, J., Zhang, X., Shang, Y., Guo, S., & Li, C. (2023). Clean-label Poisoning Attack against Fake News Detection Models. 2021 IEEE International Conference on Big Data (Big Data), 3614–3623. https://doi.org/10.1109/bigdata59044.2023.10386777
- Qiang, Y., Zhou, X., Zade, S. Z., Roshani, M. A., Khanduri, P., Zytko, D., & Zhu, D. (2024, February 21). Learning to poison large language models during instruction tuning. arXiv.org. https://arxiv.org/abs/2402.13459
- Egashira, K., Vero, M., Staab, R., He, J., & Vechev, M. (2024, May 28). Exploiting LLM quantization. arXiv.org. https://arxiv.org/abs/2405.18137
- Prompt Injection & the Rise of Prompt Attacks: All You Need to Know | Lakera – Protecting AI teams that disrupt the world. (n.d.). https://www.lakera.ai/blog/guide-to-prompt-injection