
Gandalf the Red: What 279K Real Attacks Reveal About LLM Defense
AI Security Cluster

MITRE ATLAS: The ATT&CK Framework for AI Systems

LLMail-Inject: What 208K Attacks Against an Email Agent Found

How RLHF and Constitutional AI Build Safety Into Language Models

Adversarial Machine Learning: From Szegedy to LLM Attacks

Multiagent LLM Security: When Your Agent Talks to a Malicious Agent

Differential Privacy for LLMs: The Training Privacy Guarantee

LLM Watermarking: How Models Embed Detection Signals in Their Outputs

Neural Backdoor Attacks: From BadNets to LLM Trojans

LLM Training Data Memorization: When Models Leak Their Training Sets

MCP Server Security: Prompt Injection and Tool Poisoning

LLM Supply Chain Attacks: PoisonGPT to Poisoned Skills

Red-Teaming LLM Applications: A Practitioner’s Framework

Jailbreaking vs Prompt Injection: Two Different LLM Problems

OWASP LLM Top 10 for 2025: The Mechanism Behind Each Vulnerability

Indirect Prompt Injection: The Attack That Hides in Your Data

LLM Excessive Agency: Why Every Tool Your Agent Has Is a Risk

Julia Bazinska and the Science of Measurable AI Security
Editor’s Selection
Hand-picked deep reads
Four pieces that defined this cycle. Mechanism-first analysis, primary sources, the limitations everyone else skipped.
Latest Coverage
Recent coverage on agent infrastructure, governance, benchmarks, and security.

North Korea’s Contagious Interview Operation Expanded to Five Package Ecosystems

When Your AI Agent Loses Your Money, Who Pays?

Full Context Sets the Accuracy Ceiling for AI Agent Memory. It Costs 26,000 Tokens.

A Federal Judge Just Ruled Your Claude Chats Are Evidence.
From the Archive
Foundational coverage
Pillar pieces from the early publishing era. Mechanism-first, primary sources, no hype.

OpenAI Killed Sora. The Unit Economics Were Never Going to Work.

Mistral Gave Away a Voice AI That Matches the $11 Billion Incumbent.

Perplexity AI’s Hidden Trackers: How an Incognito Search Engine Shared Conversations

Anthropic Sent Every Subscriber a Credit. For Some, It Covers One Day of the Price Increase.
Topic Clusters
Seven areas of deep coverage. Mechanism-first, primary sources, honest limitations.
AI Security
Attacks, defenses, and frameworks for securing language model deployments at scale.
Agent Infrastructure
How production AI agents are built, sandboxed, and run at scale.
MCP Security & Governance
The 97M-download protocol, the 23 attack vectors, the regulated standards.
Models & Benchmarks
The benchmarks that actually differentiate. The models that lead them.
Developer Tools & Coding Agents
How coding agents are architected, what they can do, and where they break.
AI Safety & Biorisk
Where capability uplift meets dual-use risk. Studies, frameworks, ASL designations.
Markets & Capital
Funding, IPOs, supply chains, the unit economics behind the AI buildout.





