When you discover that a tool you trusted isn’t doing what you thought it was, you feel a certain kind of unease. That’s roughly where a lot of security teams find themselves today, staring at dashboards built around static network analysis while AI-driven phishing campaigns slice right through them at a pace nobody really saw coming.
Recent threat reporting cites a startling 1,380% rise in AI-enabled phishing attacks. That figure comes from research tied to the EvilTokens campaign, a device code phishing operation that researchers say deliberately hides its attack flow inside the browser to avoid triggering the static detection tools most enterprise environments still rely on. It’s a clever move. And the uncomfortable part is that it’s working.
For years, static analysis has been the backbone of how security teams monitor network behavior. You capture traffic, you compare it against known patterns, you flag anomalies. That reasoning holds when threats are slow-moving and reasonably predictable. The problem is that phishing campaigns powered by large language models are neither. They sidestep the keyword-based and signature-based detection that most static tools rely on, they adapt almost instantly, and they produce content that is hard to tell apart from authentic communication. Watching this unfold in the wild, it’s hard to shake the impression that defenders have been preparing for last year’s battle.
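To make that limitation concrete, here is a minimal Python sketch of the keyword-and-signature matching described above. The signature list, the `static_flag` helper, and both sample messages are invented for illustration; none of them come from a real detection product or from the EvilTokens campaign.

```python
import re

# Hypothetical signature list: illustrative phrases only,
# not drawn from any real product's rule set.
SIGNATURES = [
    r"verify your account",
    r"password (has )?expired",
    r"click here to log ?in",
]
PATTERNS = [re.compile(p, re.IGNORECASE) for p in SIGNATURES]


def static_flag(message: str) -> bool:
    """Return True if the message matches any known signature."""
    return any(p.search(message) for p in PATTERNS)


# A templated lure that reuses well-known phrasing.
templated = "Your password has expired. Click here to login and verify your account."

# A reworded lure with the same intent but none of the listed phrases.
paraphrased = (
    "Hi Dana, IT is rotating credentials this week. "
    "Enter the code from your authenticator prompt on the sign-in page to keep access."
)

print(static_flag(templated))    # True
print(static_flag(paraphrased))  # False
```

The templated message trips a signature, while the reworded one passes untouched. Real static tools are far more elaborate than three regular expressions, so treat this only as an illustration of why matching on previously seen wording struggles against content that is rewritten for every target.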

A paper by Monash University researchers, published on arXiv, explains this asymmetry fairly precisely. The researchers describe how attackers use LLMs to create phishing content in a nine-stage taxonomy, where each stage marks a point at which safety precautions can be circumvented. The study shows that LLM-based offensive tools are intrinsically dynamic. They adapt, learn, and change course. Defensive tactics, on the other hand, are still frequently static and reactive, designed to identify past events rather than anticipate future ones.
The EvilTokens campaign shows how that gap is exploited in practice. By routing its attack flow through browser-based channels, the campaign avoids leaving the kind of network-layer fingerprints that static analysis tools are designed to detect. Most tools are simply not looking in the right place. Many organizations might not become aware of their exposure until much later, if at all.
This isn’t just a technological issue. It is partly structural. Security budgets have long prioritized monitoring tools that generate clear reports and quantifiable coverage metrics. Static analysis checks those boxes. It can be audited. It’s explainable to a board. Dynamic behavioral analysis, by contrast, is better suited to the real threat but messier and harder to justify on a spreadsheet.
Meanwhile, the broader breach landscape from recent weeks suggests the exposure is real and wide. Major incidents, including a 14.2 million account data breach at KDDI, a ransomware attack that leaked Apple documents from Tata Electronics, and confirmed vendor breaches at LastPass, have reinforced the same basic point: attackers are finding ways in that defenders weren’t watching closely enough.
There’s a sense among researchers and practitioners that the field is at an inflection point, and not a comfortable one. Against threats that move more quickly, adapt more intelligently, and specifically target the gaps left by static systems, the tools that performed fairly well over the past ten years are beginning to show their limitations. Closing those gaps will likely require a harder conversation about whether the detection infrastructure most organizations use was ever designed for what phishing has quietly become.
