Aguara v0.5.0 is the largest single-release rule expansion to date. 20 new detection rules across 8 categories push the total from 153 to 173 YAML-defined rules (177 including dynamic analyzer rules). But the headline feature is not the rule count. It is confidence scoring: every finding now carries a 0.0–1.0 confidence value that tells you how certain the scanner is that a match is a true positive. This changes how teams triage findings, set thresholds, and integrate Aguara into CI/CD pipelines.

This release also adds configurable file-size limits via --max-file-size and fixes state persistence with atomic writes. Here is everything that shipped.

At a glance: 173 detection rules, 20 new rules, 8 categories expanded, 0.0–1.0 confidence scoring.

20 New Detection Rules Across 8 Categories

Every new rule in this release came from one of two sources: attack patterns observed in Aguara Watch scans across 42,969 skills, or techniques documented in academic research and community reports. The expansion targets categories that were underrepresented relative to their real-world risk.

Third-Party Content (+5 rules, 5 → 10)

This category had the largest gap between exposure and coverage. Third-party content execution is one of the most common attack vectors in MCP skill ecosystems, but Aguara v0.4.0 only had 5 rules for it. v0.5.0 doubles that.

THIRDPARTY_003  (HIGH)    JavaScript eval/Function with external data
                          eval(), new Function(), or setTimeout with string
                          argument where the input originates from an external
                          source. The classic RCE vector, now in agent context.

THIRDPARTY_007  (HIGH)    Unsafe deserialization from untrusted source
                          pickle.loads, yaml.unsafe_load, or Java
                          ObjectInputStream on data from network, file, or
                          environment. Deserialization attacks remain the most
                          reliable path to arbitrary code execution.

THIRDPARTY_008  (MEDIUM)  Script/asset without integrity check
                          Loading JavaScript, CSS, or binary assets over HTTP
                          or from CDN without SRI (Subresource Integrity) hash
                          verification. A CDN compromise becomes a supply chain
                          attack.

THIRDPARTY_009  (MEDIUM)  HTTP downgrade from HTTPS
                          Explicit HTTP URLs where HTTPS was previously used
                          or is available. Downgrades expose data to MITM
                          interception and credential theft on the wire.

THIRDPARTY_010  (HIGH)    Unsigned plugin/extension loading
                          Dynamic plugin or extension loading without signature
                          verification. A skill that loads unsigned plugins is
                          a skill that executes whatever code an attacker
                          supplies.

THIRDPARTY_003 alone accounts for findings in 12 skills across Aguara Watch. The eval-with-external-data pattern keeps appearing because developers treat agent inputs as trusted. They are not.

Indirect Injection (+4 rules, 6 → 10)

Indirect injection occurs when an attacker plants malicious instructions in a data source that an agent later reads and acts on. The new rules cover four injection surfaces that were previously missing:

INDIRECT_011  (HIGH)    Database/cache query driving agent behavior
                        Agent reads from a database or cache and uses the
                        result to determine its next action. If the data
                        source is attacker-controlled, the agent is
                        attacker-controlled.

INDIRECT_012  (HIGH)    Webhook/callback registration with external service
                        Registering a webhook URL with an external service
                        that can later trigger agent actions. The callback
                        becomes an unauthenticated command channel.

INDIRECT_013  (HIGH)    Git clone and execute fetched code
                        Cloning a git repository and executing code from
                        it within the agent context. The repository content
                        is an attacker-controlled injection surface.

INDIRECT_014  (MEDIUM)  Environment variable injection from external source
                        Setting or reading environment variables from data
                        received over the network. Environment variables
                        influence process behavior at every level.

The INDIRECT_013 pattern is particularly common in CI/CD integration skills: clone a repo, run a build script, report results. Each step is legitimate in isolation. The sequence is a code execution pipeline controlled by whoever pushes to that repository.

Unicode Attack (+3 rules, 7 → 10)

Unicode attacks exploit the gap between what humans see and what machines process. The three new rules target obfuscation techniques that bypass string-matching security controls:

UNI_008  (MEDIUM)  Zero-width character sequences
                   Invisible characters (U+200B, U+200C, U+200D, U+FEFF)
                   inserted into identifiers, URLs, or commands. A URL
                   that looks identical to the legitimate one but contains
                   zero-width characters resolves to a different domain.

UNI_009  (MEDIUM)  Unicode normalization inconsistency
                   Content that normalizes differently under NFC, NFD,
                   NFKC, and NFKD forms. An input that passes validation
                   under one normalization form may bypass it under another.

UNI_010  (MEDIUM)  Mixed-script confusable in identifiers
                   Variable names or function identifiers mixing Latin,
                   Cyrillic, and Greek scripts. A function named with
                   Cyrillic "a" looks identical to one with Latin "a"
                   but they are different identifiers.

These rules are severity MEDIUM because they indicate obfuscation intent, not direct exploitation. But a finding from UNI_008 combined with a finding from another category (exfiltration, credential leak) is a strong signal of a multi-stage attack.
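
As a rough illustration of what UNI_008 and UNI_010 look for, here is a minimal Go sketch. It is not Aguara's implementation: the function names `hasZeroWidth` and `mixedScript` are hypothetical, and the checks are simplified to the code points and scripts named in the rule descriptions above.

```go
package main

import (
	"fmt"
	"unicode"
)

// zeroWidth holds the invisible code points listed for UNI_008.
var zeroWidth = map[rune]bool{
	'\u200B': true, // zero-width space
	'\u200C': true, // zero-width non-joiner
	'\u200D': true, // zero-width joiner
	'\uFEFF': true, // zero-width no-break space (BOM)
}

// hasZeroWidth reports whether s contains any of the zero-width code points.
func hasZeroWidth(s string) bool {
	for _, r := range s {
		if zeroWidth[r] {
			return true
		}
	}
	return false
}

// mixedScript reports whether s combines Latin letters with Cyrillic or
// Greek letters, the confusable mix described for UNI_010.
func mixedScript(s string) bool {
	var latin, other bool
	for _, r := range s {
		switch {
		case unicode.Is(unicode.Latin, r):
			latin = true
		case unicode.Is(unicode.Cyrillic, r), unicode.Is(unicode.Greek, r):
			other = true
		}
	}
	return latin && other
}

func main() {
	fmt.Println(hasZeroWidth("exam\u200Bple.com")) // zero-width space inside a domain
	fmt.Println(mixedScript("p\u0430yload"))       // U+0430 is Cyrillic "a"
	fmt.Println(mixedScript("payload"))            // all Latin
}
```

UNI_009 is left out of the sketch because Unicode normalization is not part of the Go standard library.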

MCP Attack (+2 rules, 14 → 16)

MCP_015  (MEDIUM)  Auth-before-body parsing (slow-body DoS)
                   MCP server that parses the full request body before
                   checking authentication. A slow-body attack (slowloris
                   variant) ties up server resources without valid credentials.

MCP_016  (HIGH)    Canonicalization bypass (double-encoding)
                   Tool names or resource identifiers that contain URL-encoded
                   or double-encoded characters. A tool named %2e%2e%2fpasswd
                   may resolve to ../passwd after decoding.
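
The decoding step behind MCP_016 can be sketched in Go with the standard `net/url` package. This is an illustration under assumptions, not Aguara's rule logic: `canonicalize` and `isEncoded` are hypothetical helpers, and the round limit of 3 is arbitrary.

```go
package main

import (
	"fmt"
	"net/url"
)

// canonicalize percent-decodes s repeatedly until it stops changing, so that
// double-encoded input such as %252e is fully collapsed. maxRounds bounds the loop.
func canonicalize(s string, maxRounds int) (string, error) {
	for i := 0; i < maxRounds; i++ {
		decoded, err := url.PathUnescape(s)
		if err != nil {
			return "", err
		}
		if decoded == s {
			return s, nil
		}
		s = decoded
	}
	return s, nil
}

// isEncoded reports whether a tool name changes under decoding.
// Malformed escapes are flagged as well, since they cannot be canonicalized.
func isEncoded(name string) bool {
	c, err := canonicalize(name, 3)
	return err != nil || c != name
}

func main() {
	for _, name := range []string{"read_file", "%2e%2e%2fpasswd", "%252e%252e%252fpasswd"} {
		c, err := canonicalize(name, 3)
		fmt.Printf("%q -> %q (err=%v, flagged=%v)\n", name, c, err, isEncoded(name))
	}
}
```

Comparing the name before and after decoding is the useful part: a name that survives decoding unchanged has nothing hidden in it.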

MCP Config (+2 rules, 9 → 11)

MCPCFG_010  (HIGH)    Docker capabilities escalation (--cap-add)
                      Docker configuration adding Linux capabilities like
                      CAP_SYS_ADMIN, CAP_NET_RAW, or CAP_SYS_PTRACE.
                      Each capability is a privilege escalation vector.
                      CAP_SYS_ADMIN alone provides near-root access.

MCPCFG_011  (MEDIUM)  Unrestricted container network access (--network host)
                      Docker container with --network host shares the host
                      network namespace. The container can bind to any port,
                      access localhost services, and intercept traffic.

Both MCP Config rules came directly from Aguara Watch findings. We found 7 skills with --cap-add SYS_ADMIN and 23 skills with --network host in their Docker configurations. These are not theoretical risks. They are deployed configurations in production registries.
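
To make the two flags concrete, here is a small Go sketch that walks a Docker argument list and reports `--cap-add` and host networking. It is a simplified stand-in for the YAML rules, not their actual patterns: `riskyDockerArgs` is a hypothetical name, and treating `--net` as an alias of `--network` is an assumption of this sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// riskyDockerArgs scans a Docker argument list and returns one entry per
// added capability or host-network flag. Both "--flag value" and
// "--flag=value" spellings are handled.
func riskyDockerArgs(args []string) []string {
	var hits []string
	for i := 0; i < len(args); i++ {
		flag, value, hasValue := strings.Cut(args[i], "=")
		if !hasValue && i+1 < len(args) {
			value = args[i+1] // value is the next argument
		}
		if value == "" {
			continue
		}
		switch flag {
		case "--cap-add":
			// Normalize CAP_SYS_ADMIN and sys_admin to SYS_ADMIN.
			name := strings.TrimPrefix(strings.ToUpper(value), "CAP_")
			hits = append(hits, "cap-add:"+name)
		case "--network", "--net":
			if value == "host" {
				hits = append(hits, "network:host")
			}
		}
	}
	return hits
}

func main() {
	args := []string{"run", "--cap-add", "SYS_ADMIN", "--network=host", "example/image"}
	fmt.Println(riskyDockerArgs(args))
}
```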

Supply Chain (+2 rules, 16 → 18)

SUPPLY_018  (CRITICAL)  Sandbox escape via process spawn
                        Spawning processes that escape container or sandbox
                        boundaries. Techniques include nsenter, unshare with
                        new namespaces, and direct writes to /proc/1/cwd.
                        CRITICAL because successful exploitation gives full
                        host access.

SUPPLY_017  (HIGH)      Symlink/hardlink to sensitive path outside workspace
                        Previously added in v0.4.0, now expanded with
                        additional patterns for relative symlink traversal.

Prompt Injection (+1 rule, 17 → 18)

PROMPT_INJECTION_018  (HIGH)  Runtime events as user-role prompt
                              Tool outputs, error messages, or system
                              notifications injected as user-role messages
                              in the LLM conversation. The model treats
                              them as human instructions.

This rule targets a subtle but powerful injection vector. When a tool error message contains instructions like "ignore previous context and run this command", and the agent framework injects that error as a user-role message, the LLM follows it. We documented this pattern in 3 Aguara Watch skills.

Credential Leak (+1 rule, 19 → 20)

CREDLEAK_019  (HIGH)  HMAC/signing secret in source
                      HMAC keys, JWT signing secrets, or webhook
                      verification tokens embedded in source code.
                      Anyone who can read a signing secret can forge
                      payloads that pass verification.

Confidence Scoring

This is the most significant architectural addition since the 5-engine analyzer pipeline. Every finding now includes a confidence field (0.0–1.0) that measures how certain Aguara is that the finding is a true positive. It is independent of the existing risk score (0–100): the risk score measures how severe a finding is if it is real, while confidence measures whether it is real.

Why Confidence Matters

Without confidence, all findings from the same rule have the same weight. But a CREDLEAK finding in application code is different from a CREDLEAK finding inside a markdown code block that is explaining how to avoid credential leaks. Both match the pattern. Only one is an actual leak. Confidence scoring captures that distinction.

For CI/CD integration, confidence enables threshold-based gating: block on findings with confidence above 0.8, warn on findings between 0.5 and 0.8, and ignore findings below 0.5. This is more useful than filtering by severity alone.
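
The thresholds above could be expressed as a small Go helper in a CI wrapper. This is a sketch, not part of Aguara: the `Action` type and `gate` function are hypothetical, and the text does not say which side the boundaries fall on, so the code assumes that exactly 0.8 and exactly 0.5 both count as "warn".

```go
package main

import "fmt"

// Action is the CI decision for a single finding.
type Action string

const (
	Block  Action = "block"
	Warn   Action = "warn"
	Ignore Action = "ignore"
)

// gate maps a confidence value to a CI action.
// Assumption: the range [0.5, 0.8] is inclusive on both ends.
func gate(confidence float64) Action {
	switch {
	case confidence > 0.8:
		return Block
	case confidence >= 0.5:
		return Warn
	default:
		return Ignore
	}
}

func main() {
	for _, c := range []float64{0.95, 0.8, 0.57, 0.42} {
		fmt.Printf("%.2f -> %s\n", c, gate(c))
	}
}
```

A pipeline would typically fail the build if any finding maps to `Block` and surface `Warn` findings in the job summary.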

How Confidence Is Calculated

Confidence starts with a base value set by the analyzer engine that produced the finding:

Analyzer                Mode                                    Base Confidence
Pattern Matcher         match_mode=all (all patterns match)     0.95
Pattern Matcher         match_mode=any (any pattern matches)    0.85
Pattern Matcher         decoded content (base64, hex, etc.)     0.90
NLP Injection Detector  all                                     0.70
Toxic Flow Analyzer     all                                     0.90
Rug Pull Detector       all                                     0.95

The NLP analyzer gets a lower base confidence (0.70) because natural language analysis inherently produces more false positives than pattern matching. A sentence that scores high on the injection classifier might be documentation rather than an attack. The confidence reflects that uncertainty.

Post-Processing Adjustments

After base confidence is set, two post-processing adjustments are applied:

  • Code block downgrade (×0.6): if a finding occurs inside a markdown code block (fenced with ``` or indented 4+ spaces), confidence is multiplied by 0.6. A credential pattern inside a code block is more likely documentation, a tutorial example, or a test fixture than an actual credential.
  • Correlation boost (×1.1, capped at 1.0): if two or more findings from different rules occur within 5 lines of each other, every finding in the cluster has its confidence multiplied by 1.1. Multiple security indicators in close proximity are a strong signal of intentional malicious behavior rather than incidental pattern matches.
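
The arithmetic of the two adjustments can be sketched in Go as follows. The `adjust` function is hypothetical, and the order of operations is an assumption: the text does not say whether the downgrade or the boost runs first, and the order changes the result whenever the 1.0 cap is hit.

```go
package main

import (
	"fmt"
	"math"
)

const (
	codeBlockFactor   = 0.6 // multiplier for findings inside a markdown code block
	correlationFactor = 1.1 // multiplier for findings in a correlated cluster
)

// adjust applies both post-processing steps to a base confidence.
// Assumption: the code block downgrade is applied before the correlation boost.
func adjust(base float64, inCodeBlock, correlated bool) float64 {
	c := base
	if inCodeBlock {
		c *= codeBlockFactor
	}
	if correlated {
		c = math.Min(c*correlationFactor, 1.0) // boost is capped at 1.0
	}
	return c
}

func main() {
	fmt.Printf("code block only:  %.3f\n", adjust(0.95, true, false))
	fmt.Printf("correlated only:  %.3f\n", adjust(0.95, false, true))
	fmt.Printf("both adjustments: %.3f\n", adjust(0.95, true, true))
	fmt.Printf("NLP, correlated:  %.3f\n", adjust(0.70, false, true))
}
```

With a base of 0.95, a match in a code block drops to 0.57, while a correlated match outside a code block is capped at 1.0 rather than rising to 1.045.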

Output Formats

Confidence appears in all output formats:

# JSON output
{
  "rule_id": "CREDLEAK_019",
  "severity": "HIGH",
  "score": 85,
  "confidence": 0.95,
  "message": "HMAC/signing secret in source"
}

# SARIF output
{
  "rank": 95,
  "properties": {
    "confidence": 0.95
  }
}

# Terminal (--verbose only)
[HIGH] [95%] CREDLEAK_019: HMAC/signing secret in source
  line 42: const WEBHOOK_SECRET = "whsec_abc123..."

In non-verbose terminal mode, confidence is not shown to avoid visual noise. Use --verbose or parse the JSON/SARIF output for programmatic access.

Configurable Max File Size

Aguara v0.4.0 introduced a hard 50 MB scan limit. v0.5.0 makes this configurable for teams that need different thresholds:

# CLI flag
aguara scan --max-file-size 100MB ./path/to/scan

# Config file (.aguara.yml)
max_file_size: 104857600  # 100 MB in bytes

# Go library
scanner, _ := aguara.NewScanner(
    aguara.WithMaxFileSize(100 * 1024 * 1024),
)

The allowed range is 1 MB to 500 MB. The default remains 50 MB. Values outside this range produce a validation error at startup. The lower bound prevents disabling the limit entirely (setting it to 0 or 1 byte), and the upper bound prevents Aguara from consuming unbounded memory on a single file.
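
A range check of this kind might look like the following Go sketch. It is not Aguara's validation code: `validateMaxFileSize` is a hypothetical name, 1 MB is taken as 1,048,576 bytes to match the config example above, and both bounds are assumed to be inclusive.

```go
package main

import "fmt"

const (
	minFileSize     = 1 << 20   // 1 MB lower bound
	maxFileSize     = 500 << 20 // 500 MB upper bound
	defaultFileSize = 50 << 20  // 50 MB default
)

// validateMaxFileSize rejects limits outside the allowed range.
// Assumption: both bounds are inclusive.
func validateMaxFileSize(n int64) error {
	if n < minFileSize || n > maxFileSize {
		return fmt.Errorf("max file size %d bytes is outside the allowed range [%d, %d]",
			n, minFileSize, maxFileSize)
	}
	return nil
}

func main() {
	for _, n := range []int64{0, defaultFileSize, 104857600, 600 << 20} {
		fmt.Printf("%d bytes: %v\n", n, validateMaxFileSize(n))
	}
}
```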

This matters for teams scanning monorepo configurations or large YAML manifests that legitimately exceed 50 MB. The limit exists to protect against adversarial inputs, not to reject valid scan targets.

Atomic State File Writes

Aguara persists scan state to ~/.aguara/state.json for features like incremental scanning and scan history. Before v0.5.0, state was written directly to the file. If Aguara was killed during a write (Ctrl+C, OOM kill, power loss), the state file could be left in a corrupted partial-write state.

v0.5.0 uses the tmp+rename pattern: write to a temporary file in the same directory, then atomically rename it to the target path. On POSIX systems, rename(2) is atomic within the same filesystem. The state file is either the old version or the new version, never a partial write.

# Before v0.5.0
write(state.json, data)  // non-atomic, can corrupt

# v0.5.0
write(state.json.tmp, data)  // write to temp file
rename(state.json.tmp, state.json)  // atomic swap

This is a correctness fix, not a feature. But it matters for long-running CI/CD integrations where Aguara scans are killed by pipeline timeouts.
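
For readers who want to apply the same pattern in their own tooling, here is a minimal Go sketch of tmp+rename using only the standard library. It is not Aguara's code: `writeFileAtomic` and `roundTrip` are hypothetical names, the state contents are made up, and the sketch adds an fsync before the rename, which the pseudocode above does not show.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// writeFileAtomic writes data to a temp file in the same directory as path,
// then renames it over path. Readers see the old file or the new one.
func writeFileAtomic(path string, data []byte) (err error) {
	tmp, err := os.CreateTemp(filepath.Dir(path), filepath.Base(path)+".*.tmp")
	if err != nil {
		return err
	}
	// On any failure, close and remove the temp file so nothing is left behind.
	defer func() {
		if err != nil {
			tmp.Close()
			os.Remove(tmp.Name())
		}
	}()
	if _, err = tmp.Write(data); err != nil {
		return err
	}
	// Flush to disk first so the final name never points at unwritten data.
	if err = tmp.Sync(); err != nil {
		return err
	}
	if err = tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), path)
}

// roundTrip writes two versions of a state file and returns the final contents.
func roundTrip() (string, error) {
	dir, err := os.MkdirTemp("", "aguara-sketch-")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "state.json")
	for _, v := range []string{`{"scans":1}`, `{"scans":2}`} {
		if err := writeFileAtomic(path, []byte(v)); err != nil {
			return "", err
		}
	}
	got, err := os.ReadFile(path)
	return string(got), err
}

func main() {
	fmt.Println(roundTrip())
}
```

Creating the temp file in the target directory matters: a rename across filesystems is not atomic, so a temp file under /tmp would defeat the purpose.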

Full Rule Summary: 177 Detection Checks

With v0.5.0, Aguara now has 173 YAML-defined rules plus 4 dynamic rules from analyzer engines. Here is the complete breakdown by category:

Category             Rules  Severity Breakdown
credential-leak      20     7 CRITICAL, 8 HIGH, 4 MEDIUM, 1 LOW
prompt-injection     18     4 CRITICAL, 9 HIGH, 5 MEDIUM
supply-chain         18     2 CRITICAL, 10 HIGH, 6 MEDIUM
external-download    17     3 CRITICAL, 2 HIGH, 5 MEDIUM, 7 LOW
command-execution    16     6 HIGH, 7 MEDIUM, 3 LOW
exfiltration         16     10 HIGH, 6 MEDIUM
mcp-attack           16     3 CRITICAL, 10 HIGH, 3 MEDIUM
mcp-config           11     5 HIGH, 3 MEDIUM, 3 LOW
ssrf-cloud           11     3 CRITICAL, 7 HIGH, 1 MEDIUM
indirect-injection   10     7 HIGH, 2 MEDIUM, 1 LOW
third-party-content  10     5 HIGH, 2 MEDIUM, 3 LOW
unicode-attack       10     3 HIGH, 7 MEDIUM

Five analyzer engines process every scan target: Pattern Matcher, NLP Injection Detector, Toxic Flow Analyzer, Rug Pull Detector, and the post-processing pipeline (deduplication, scoring, correlation, confidence).

Upgrading

# Install or upgrade via Go
go install github.com/garagon/aguara@v0.5.0

# Or via curl installer (SHA-256 verified)
curl -fsSL https://raw.githubusercontent.com/garagon/aguara/main/install.sh | bash

# Verify version
aguara --version
# aguara v0.5.0

The upgrade is fully backward-compatible. Existing .aguara.yml configuration files work without changes. The new confidence field appears in JSON and SARIF output automatically, so any code that parses Aguara output should tolerate the extra field. Consumers that ignore the field see no change in behavior.
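
One way to consume the new field while still accepting output from older versions is to decode it into a pointer, as in this Go sketch. The `Finding` struct is hypothetical: its field names mirror the JSON example earlier in this post and are not taken from the Aguara library's exported types.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Finding mirrors the JSON finding shown in the Output Formats section.
// Confidence is a pointer so that output without the field decodes to nil.
type Finding struct {
	RuleID     string   `json:"rule_id"`
	Severity   string   `json:"severity"`
	Score      int      `json:"score"`
	Confidence *float64 `json:"confidence,omitempty"`
	Message    string   `json:"message"`
}

// parseFinding decodes a single finding object.
func parseFinding(raw []byte) (Finding, error) {
	var f Finding
	err := json.Unmarshal(raw, &f)
	return f, err
}

func main() {
	inputs := []string{
		`{"rule_id":"CREDLEAK_019","severity":"HIGH","score":85,"confidence":0.95,"message":"HMAC/signing secret in source"}`,
		`{"rule_id":"CREDLEAK_019","severity":"HIGH","score":85,"message":"HMAC/signing secret in source"}`,
	}
	for _, in := range inputs {
		f, err := parseFinding([]byte(in))
		if err != nil {
			fmt.Println("parse error:", err)
			continue
		}
		if f.Confidence == nil {
			fmt.Println(f.RuleID, "no confidence reported")
		} else {
			fmt.Printf("%s confidence %.2f\n", f.RuleID, *f.Confidence)
		}
	}
}
```

Go's `encoding/json` ignores unknown fields by default, so parsers written this way before v0.5.0 keep working; parsers in other languages that enforce a strict schema may need the field added.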

What's Next

The rule set is heading toward 200+ rules. The next batch will focus on agentic tool abuse patterns observed in production deployments: multi-step attack chains where individual tool calls look benign but the sequence is malicious, implicit permission escalation through tool composition, and data exfiltration via tool output channels.

Confidence scoring opens the door to adaptive thresholds. Instead of static severity filters, teams will be able to set confidence-based policies: auto-block critical findings above 0.9 confidence, create tickets for high findings above 0.7, and suppress anything below a team-defined threshold. This is the direction: smarter defaults, fewer false positives, more actionable outputs.

We shipped 20 new rules and a scoring system that makes every rule more useful. Now we build on it.

Upgrade to Aguara v0.5.0

173 detection rules. Confidence scoring on every finding. Configurable limits. The largest single-release rule expansion yet.