RAG & Vector Databases (d76e1508-e655-50dc-8ec6-20bdcde8153a)
Retrieval-Augmented Generation (RAG) systems combine LLMs with vector databases to enrich answers with external knowledge. However, if the retrieval layer is compromised or poorly validated, it can feed the model misleading, biased, or adversarial content. Untrusted documents in vector stores can serve as indirect prompt injections, while insecure embeddings can allow unauthorized inference or leakage. Additionally, RAG systems may unintentionally disclose proprietary documents retrieved through similarity search.
Threat-modeling question: Are we protected from vulnerabilities in vector databases and RAG pipelines?