AI tools are rapidly becoming part of the smart contract security workflow. Some teams experiment with them internally, while others are beginning to rely on them as a first line of defense before manual audits.
But a fundamental question remains: how well do these tools actually perform when analyzing real codebases?
To explore that question, we ran a comparison between two systems:
- AuditAgent — a static AI auditing tool focused on vulnerability detection
- Azimuth — a behavioral analysis engine designed to simulate exploit paths and protocol interactions
Rather than evaluating a single repository, we tested both systems across four different codebases, each representing a different category of smart contract architecture.
Repositories Analyzed
- PrimeVault — DeFi protocol (view report)
- LendMachine — Lending protocol (view report)
- Murky — Utility library (view report)
- BaseTap — Payment protocol (view report)
Each repository was analyzed independently by both systems and compared across several dimensions:
- vulnerability detection
- exploit modeling
- protocol reasoning
- workflow analysis
- code-quality observations
The results were instructive.
Methodology
For each repository we:
- Ran AuditAgent to generate a full audit report
- Ran Azimuth to generate behavioral exploit hypotheses
- Compared the outputs across several dimensions:
  - number of findings
  - exploit depth
  - cross-contract reasoning
  - economic attack modeling
  - operational failure modes
Importantly, both systems analyzed each repository in its original, unmodified state.
Repository 1 — PrimeVault
Full analysis: view Azimuth report
PrimeVault is a DeFi protocol with vault mechanics and capital flows between multiple contracts.
This kind of architecture introduces several attack surfaces:
- asset accounting
- permission controls
- economic manipulation
- cross-contract interactions
AuditAgent correctly identified several isolated contract risks and best-practice violations.
Azimuth expanded those issues into realistic exploit paths, including scenarios where attackers could manipulate protocol flows across multiple contracts.
This is a recurring theme: static scanners can identify risky code patterns, but often stop short of modeling how those risks translate into real attacks.
Repository 2 — LendMachine
Full analysis: view Azimuth report
LendMachine is a simplified lending protocol with collateral, borrowing, liquidation, and reward mechanics.
These systems are especially sensitive to economic exploits, where small logic flaws can create large financial consequences.
Both tools detected a configuration risk around interest rate control.
AuditAgent noted that the interest rate setter lacked access control. Azimuth went further and modeled several exploit scenarios:
- artificially inflating interest rates to force liquidations
- temporarily setting rates to zero to disable accrual
- manipulating borrower health factors during liquidation windows
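The core of the first scenario can be sketched as a toy model. The names here (LendingPool, set_rate, health_factor) are illustrative stand-ins, not LendMachine's actual interface; the point is only how an unguarded rate setter lets anyone push a healthy position below the liquidation threshold:

```python
# Toy model of an unprotected interest-rate setter forcing a liquidation.
# All names are illustrative, not the protocol's real API.

class LendingPool:
    LIQ_THRESHOLD = 1.0  # positions below this health factor are liquidatable

    def __init__(self, annual_rate: float):
        self.annual_rate = annual_rate
        self.positions = {}  # user -> {"collateral": .., "debt": ..}

    def set_rate(self, caller: str, new_rate: float) -> None:
        # BUG being modeled: no check that caller is an authorized admin.
        self.annual_rate = new_rate

    def open(self, user: str, collateral: float, debt: float) -> None:
        self.positions[user] = {"collateral": collateral, "debt": debt}

    def accrue(self, periods: int = 1) -> None:
        # Compound the debt of every position by the current rate.
        for pos in self.positions.values():
            pos["debt"] *= (1 + self.annual_rate) ** periods

    def health_factor(self, user: str) -> float:
        pos = self.positions[user]
        return pos["collateral"] / pos["debt"]

pool = LendingPool(annual_rate=0.05)
pool.open("alice", collateral=150.0, debt=100.0)   # healthy: HF = 1.5

pool.set_rate("attacker", new_rate=10.0)            # anyone can call the setter
pool.accrue()                                       # one accrual period later

print(pool.health_factor("alice") < pool.LIQ_THRESHOLD)  # alice is now liquidatable
```

The second scenario is the same bug run in reverse: set_rate("attacker", 0.0) silently disables accrual for every lender.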
Additionally, Azimuth identified issues in reward accounting synchronization, which could lead to phantom reward accumulation under certain conditions.
These types of vulnerabilities are difficult for static scanners to detect because they require reasoning about state transitions across multiple transactions.
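The reward-desync pattern is worth making concrete. In this sketch (names invented for illustration), a global reward index advances, but a buggy deposit path forgets to snapshot the user's index before changing their balance, so the new balance retroactively "earns" rewards for time it was never staked:

```python
# Sketch of phantom reward accumulation via a missed snapshot sync.
# All names are illustrative.

class RewardPool:
    def __init__(self):
        self.index = 0.0     # cumulative reward per staked token
        self.stake = {}      # user -> staked amount
        self.snapshot = {}   # user -> index value at last sync

    def accrue(self, reward_per_token: float) -> None:
        self.index += reward_per_token

    def _sync(self, user: str) -> float:
        owed = self.stake.get(user, 0.0) * (self.index - self.snapshot.get(user, 0.0))
        self.snapshot[user] = self.index
        return owed

    def deposit_correct(self, user: str, amount: float) -> float:
        owed = self._sync(user)  # settle the old balance before changing it
        self.stake[user] = self.stake.get(user, 0.0) + amount
        return owed

    def deposit_buggy(self, user: str, amount: float) -> None:
        # BUG being modeled: balance changes without syncing the snapshot.
        self.stake[user] = self.stake.get(user, 0.0) + amount

    def claim(self, user: str) -> float:
        return self._sync(user)

pool = RewardPool()
pool.accrue(1.0)                   # index advances before the user ever stakes
pool.deposit_buggy("mallory", 100.0)
phantom = pool.claim("mallory")
print(phantom)                     # 100.0 reward for zero staked time
```

A static scanner sees each function as locally reasonable; the bug only exists in the ordering of accrue, deposit, and claim across transactions.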
Repository 3 — Murky
Full analysis: view Azimuth report
Murky is not a protocol at all.
It is a Merkle tree utility library used primarily for testing and proof generation.
That makes it an interesting control case.
Because Murky has:
- no capital flows
- no incentives
- no multi-contract architecture
...the number of meaningful attack surfaces is naturally limited.
In this case, both tools performed similarly.
AuditAgent produced a larger set of code hygiene observations, including style issues and gas optimizations.
Azimuth focused more on edge cases in Merkle proof verification, such as malformed trees and integration misuse.
But the differences were much smaller than in protocol repositories. This is expected. When the codebase is a simple utility library, there are simply fewer opportunities for exploit modeling to add value.
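One representative edge case of this kind can be shown with a minimal sorted-pair Merkle verifier (SHA-256 stands in for keccak256, and the code is a sketch, not Murky's implementation): with an empty proof, any "leaf" equal to the root verifies, so integrations must never let users influence the root or pass unvalidated proofs.

```python
# Minimal Merkle proof verifier in the sorted-pair style, illustrating a
# degenerate-proof edge case. SHA-256 stands in for keccak256.
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def verify(root: bytes, leaf: bytes, proof: list) -> bool:
    node = leaf
    for sibling in proof:
        # Sorted-pair hashing: order the pair before hashing.
        lo, hi = sorted([node, sibling])
        node = h(lo + hi)
    return node == root

leaves = [h(b"a"), h(b"b")]
lo, hi = sorted(leaves)
root = h(lo + hi)

print(verify(root, leaves[0], [leaves[1]]))  # True: valid proof
# Integration misuse: with an empty proof, the "leaf" only has to equal
# the root, so a caller that accepts user-supplied roots can be tricked.
print(verify(root, root, []))                # True: degenerate proof
```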
Repository 4 — BaseTap
Full analysis: view Azimuth report
BaseTap is a modular payment protocol designed around taps, which allow controlled token flows between accounts.
The system includes:
- tap registries
- execution contracts
- payment sessions
- batching logic
- split payments
This architecture introduces several workflow risks.
AuditAgent identified several important issues, including:
- missing authorization checks
- inconsistencies between canExecute() and executeTap()
- architectural design weaknesses
Azimuth expanded these into attack scenarios affecting real users:
- a payment session could be griefed by malicious actors calling markPaid() before legitimate settlement
- ETH transfers could become permanently locked when interacting with ERC20 tap paths
- tap owners could inflate payment amounts after users grant approvals
These are not simply coding errors. They are product trust failures, where legitimate users could be harmed even though the contract technically behaves as written.
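The griefing path can be sketched in a few lines. The PaymentSession model below is invented for illustration (it is not BaseTap's actual interface); what it shows is how one unrestricted state-changing call lets an attacker block legitimate settlement without ever moving funds:

```python
# Toy reconstruction of a markPaid()-style griefing path. Names are
# illustrative, not the real contract interface.

class PaymentSession:
    def __init__(self, payer: str, amount: float):
        self.payer = payer
        self.amount = amount
        self.paid = False
        self.settled = False

    def mark_paid(self, caller: str) -> None:
        # BUG being modeled: no check that caller is the payer or the tap.
        self.paid = True

    def settle(self, caller: str) -> bool:
        if caller != self.payer or self.paid:
            return False             # settlement rejected
        self.paid = True
        self.settled = True
        return True

session = PaymentSession(payer="alice", amount=10.0)
session.mark_paid("griefer")        # attacker front-runs settlement
print(session.settle("alice"))      # False: alice can no longer settle
print(session.settled)              # False: funds never actually moved
```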
Cross-Repository Comparison
Looking across the four repositories reveals a consistent pattern.
Each tool excels in different areas.
AuditAgent strengths
- strong static analysis
- best-practice detection
- architectural hygiene
Azimuth strengths
- exploit path modeling
- economic attack analysis
- multi-contract reasoning
- workflow failure detection
What This Means for AI Auditing
The results suggest an important distinction between two categories of AI security tools.
Static AI auditors
These tools behave similarly to traditional vulnerability scanners.
They excel at identifying:
- reentrancy risks
- missing access control
- unsafe patterns
- implementation issues
But they often struggle to reason about:
- multi-step attacks
- economic incentives
- protocol workflows
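To make the distinction concrete, here is a deliberately crude sketch of the pattern-matching layer such tools build on: flag external setters that carry no recognizable access-control modifier. Real static auditors are far more sophisticated; the point is that this style of analysis inspects source text, not multi-step behavior.

```python
# Deliberately simple sketch of static pattern matching over Solidity
# source: flag setters with no access-control modifier. Guard names and
# the sample contract are invented for illustration.
import re

GUARDS = ("onlyOwner", "onlyRole", "onlyAdmin", "auth")

def find_unguarded_setters(solidity_source: str) -> list:
    findings = []
    # Match function signatures up to the opening brace.
    pattern = r"function\s+(set\w+)\s*\([^)]*\)\s*([^{]*)\{"
    for m in re.finditer(pattern, solidity_source):
        name, modifiers = m.group(1), m.group(2)
        if not any(g in modifiers for g in GUARDS):
            findings.append(f"{name}: externally callable setter without access control")
    return findings

sample = """
contract RateModel {
    function setRate(uint256 r) external { rate = r; }
    function setOwner(address o) external onlyOwner { owner = o; }
}
"""
findings = find_unguarded_setters(sample)
for finding in findings:
    print(finding)   # flags setRate, not setOwner
```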
Behavioral security engines
Systems like Azimuth focus less on pattern matching and more on simulating how contracts behave under adversarial conditions.
This enables them to surface vulnerabilities that appear only when:
- multiple transactions interact
- cross-contract calls occur
- incentives are manipulated
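The behavioral approach has a different loop shape: drive a model of the contract through multi-step transaction sequences and check an invariant after each one. The Vault below and its solvency invariant are invented purely to show that shape; a real engine would randomize and prioritize sequences rather than enumerate them.

```python
# Sketch of invariant checking over multi-transaction sequences.
# The Vault model and its bug are invented for illustration.
import itertools

class Vault:
    def __init__(self):
        self.balances = {}
        self.total = 0

    def deposit(self, user, amt):
        self.balances[user] = self.balances.get(user, 0) + amt
        self.total += amt

    def withdraw(self, user, amt):
        bal = self.balances.get(user, 0)
        if amt <= bal:
            self.balances[user] = bal - amt
            # BUG being modeled: total is not decremented on withdraw.

def invariant_holds(v: Vault) -> bool:
    # Solvency: the tracked total must equal the sum of user balances.
    return v.total == sum(v.balances.values())

# Exhaustively try short two-step sequences; a fuzzer would randomize this.
actions = [("deposit", 5), ("withdraw", 5)]
violations = []
for seq in itertools.product(actions, repeat=2):
    v = Vault()
    for op, amt in seq:
        getattr(v, op)("alice", amt)
    if not invariant_holds(v):
        violations.append(seq)

print(violations)  # only deposit-then-withdraw breaks the invariant
```

No single function here looks wrong in isolation; the violation only appears when two calls are composed, which is exactly the class of finding static pattern matching misses.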
The Bigger Picture
Smart contract security is evolving.
Early tools focused on code correctness.
Modern protocols require analysis of economic behavior and system interactions.
Both layers matter.
Static scanners are valuable for quickly catching implementation mistakes.
But as protocols grow more complex, security tools must also understand:
- how users interact with systems
- how attackers manipulate incentives
- how state transitions create unexpected behavior
Conclusion
Across the four repositories we analyzed — PrimeVault, LendMachine, Murky, and BaseTap — a consistent pattern emerged.
Static AI auditors were effective at identifying common vulnerability patterns and implementation risks.
But the most meaningful issues surfaced when the analysis moved beyond individual functions and began modeling how contracts behave as a system.
Many of the highest-impact findings depended on:
- multi-step interactions
- cross-contract workflows
- economic incentives
- real user behavior
Static analysis is an important first layer of defense, but modern smart contract exploits rarely arise from a single unsafe line of code.
They emerge from how components interact over time.
Static analysis tells you where the code looks risky.
Dynamic analysis tells you how the system actually breaks.
As smart contracts continue to grow in complexity, behavioral analysis will increasingly become a necessary complement to static scanning in serious security workflows.