Tag: LLMs
-
Red-Teaming LLM Applications: A Practitioner’s Framework
LLM red-teaming spans three distinct surfaces: model layer (jailbreaking), application layer (injection), and supply chain. Different attacks, different defenses, different responsible parties. Here is the methodology that covers…
-
LLM Supply Chain Attacks: PoisonGPT to Poisoned Skills
PoisonGPT used $1 of compute to pass benchmarks with modified facts. The April 2026 PoisonedSkills paper tested the same supply chain logic against Claude Code and Gemini CLI.…
-
Jailbreaking vs Prompt Injection: Two Different LLM Problems
Jailbreaking targets model content policy. Prompt injection targets application architecture. The defenses don’t overlap, the responsible parties differ, and the same RLHF training that resists jailbreaks amplifies injection…
-
MCP Server Security: Prompt Injection and Tool Poisoning
MCPoison and CurXecute (CVE-2025-54136 and CVE-2025-54135) exploited the same MCP architectural gap: tool description fields loaded at agent boot with no sanitization. Here is the tools/list mechanism, the…
-
LLM Excessive Agency: Why Every Tool Your Agent Has Is a Risk
Every tool an LLM agent has is an attack surface. OWASP’s LLM06 and the b3 benchmark across 31 models show why: capability scope determines blast radius. Here is…
-
OWASP LLM Top 10 for 2025: The Mechanism Behind Each Vulnerability
The OWASP LLM Top 10 for 2025 added System Prompt Leakage and Vector Weaknesses, reworked Excessive Agency, and moved Sensitive Disclosure to second place. Here is the architectural…
-
Indirect Prompt Injection: The Attack That Hides in Your Data
Indirect prompt injection lets attackers hijack LLMs by hiding instructions in documents, web pages, and tool results the model processes. Here is why the architecture makes this unavoidable…
-
Julia Bazinska and the Science of Measurable AI Security
Julia Bazinska built the empirical tools that make LLM security measurable. From DeepMind RL to first-authoring b3, here is what her research at Lakera actually produced.
-
Gandalf the Red: What 279K Real Attacks Reveal About LLM Defense
Lakera’s ICML 2025 paper ran 279K crowdsourced attacks to show what synthetic red-teaming misses. The D-SEC finding: system prompts degrade user experience without blocking attackers. Here is the…
-
Chinchilla Scaling Laws: Three Methods and Why Labs Ignore Them
Chinchilla proved GPT-3 was undertrained. The 20:1 rule is a training-compute floor. Three methods, their disagreements, and why frontier labs now exceed it.
-
LoRA and QLoRA: Fine-Tuning Large Models on One GPU
LoRA fine-tunes 70B models on one GPU using low-rank weight updates. The intrinsic dimension proof, rsLoRA scaling fix, and where LoRA falls short.
-
Speculative Decoding: How LLMs Generate 3x Faster
Speculative decoding achieves 3-4x LLM speedup with zero output quality loss. The math proof, EAGLE-2’s 4.26x result, and when it does not help.
-
LLMs in Veterinary Clinical Practice: What the Evidence Actually Shows
ChatGPT-4.5 scored 90% on feline eye disease cases vs 96.7% for experienced veterinary ophthalmologists and significantly outperformed novices (56-67%). Where LLMs add clinical value in veterinary practice, where…
-
Poisoning the Medical Brain: RAG Attacks and Security in Clinical AI Systems
Clinical LLMs failed prompt injection tests 94% of the time in JAMA testing. RAG systems face a harder attack: poisoned retrieved documents that the LLM cannot distinguish from legitimate sources. How…
-
Radiology Foundation Models: What Merlin, the 22% Hallucination Rate, and ED Fracture Data Tell Us
Stanford published Merlin in Nature: a CT foundation model tested on 44,098 scans across 3 institutions. Meanwhile 22% of AI radiology reports contain factual errors and LLMs miss…
-
Poisoning the Medical Brain: How RAG Attacks Corrupt Biomedical AI
When the knowledge base is the attack surface. RAG poisoning allows adversaries to redirect medical AI outputs without touching model weights. Five arXiv papers explain the mechanism and…
-
Prompt Injection Succeeds 94% of the Time Against Clinical LLMs
A JAMA Network Open study found prompt injection attacks succeed 94.4% of the time against clinical LLMs, including 91.7% in high-harm pregnancy drug scenarios. Based on PubMed-indexed research,…
-
LLMs Give Novice Biologists 4x Uplift on Dangerous Tasks
A 2026 study found that LLM access gave novice biologists a 4.16x accuracy boost on biosecurity-relevant tasks, including beating expert baselines. Here is the mechanism and what it means…
-
MiniMax M2.7 Optimized Its Own Training Harness 100 Times. Here Is the Loop.
MiniMax M2.7 ran an internal agent that modified its own training scaffold 100 times in a row without human input and gained 30% on internal evaluations. Here is…
-
M-Trends 2026: Exploits Now Arrive Before Patches. The Mean Time-to-Exploit Is Negative 7 Days.
Mandiant M-Trends 2026 documents a mean time-to-exploit of negative 7 days. 28.3% of CVEs are being exploited within 24 hours of disclosure. Here is the AI attack chain…