Aguara v0.5.0 delivers the largest single-release rule expansion to date: 20 new detection rules across 8 categories push the total from 153 to 173 YAML-defined rules (177 including dynamic analyzer rules). But the headline feature is not the rule count. It is confidence scoring: every finding now carries a 0.0–1.0 confidence value that tells you how certain the scanner is that a match is a true positive. This changes how teams triage findings, set thresholds, and integrate Aguara into CI/CD pipelines.
This release also adds configurable file-size limits via --max-file-size and fixes state persistence with atomic writes. Here is everything that shipped.
## 20 New Detection Rules Across 8 Categories
Every new rule in this release came from one of two sources: attack patterns observed in Aguara Watch scans across 42,969 skills, or techniques documented in academic research and community reports. The expansion targets categories that were underrepresented relative to their real-world risk.
### Third-Party Content (+5 rules, 5 → 10)

This category had the largest gap between exposure and coverage. Third-party content execution is one of the most common attack vectors in MCP skill ecosystems, but Aguara v0.4.0 had only 5 rules for it. v0.5.0 doubles that.
- **THIRDPARTY_003** (HIGH): JavaScript eval/Function with external data. `eval()`, `new Function()`, or `setTimeout` with a string argument where the input originates from an external source. The classic RCE vector, now in agent context.
- **THIRDPARTY_007** (HIGH): Unsafe deserialization from untrusted source. `pickle.loads`, `yaml.unsafe_load`, or Java `ObjectInputStream` on data from network, file, or environment. Deserialization attacks remain the most reliable path to arbitrary code execution.
- **THIRDPARTY_008** (MEDIUM): Script/asset without integrity check. Loading JavaScript, CSS, or binary assets over HTTP or from a CDN without SRI (Subresource Integrity) hash verification. A CDN compromise becomes a supply chain attack.
- **THIRDPARTY_009** (MEDIUM): HTTP downgrade from HTTPS. Explicit HTTP URLs where HTTPS was previously used or is available. Downgrades expose data to MITM interception and credential theft on the wire.
- **THIRDPARTY_010** (HIGH): Unsigned plugin/extension loading. Dynamic plugin or extension loading without signature verification. A skill that loads unsigned plugins is a skill that executes whatever code an attacker supplies.
THIRDPARTY_003 alone accounts for findings in 12 skills across Aguara Watch. The eval-with-external-data pattern keeps appearing because developers treat agent inputs as trusted. They are not.
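As a rough illustration of what a pattern-matcher rule in this family looks for, here is a minimal Go regexp that flags the three sinks named above. This is a deliberately simplified sketch, not Aguara's actual THIRDPARTY_003 pattern, which also tracks whether the input is external:

```go
package main

import (
	"fmt"
	"regexp"
)

// evalSink is an illustrative pattern: flag eval, new Function, and
// setTimeout used as call sinks. The real rule is richer than this.
var evalSink = regexp.MustCompile(`\b(eval|new\s+Function|setTimeout)\s*\(`)

func main() {
	// A fetched response body fed straight into new Function: the
	// eval-with-external-data pattern described above.
	snippet := `const f = new Function(await res.text()); f();`
	fmt.Println(evalSink.MatchString(snippet))

	// A benign identifier that merely starts with "eval" does not match.
	fmt.Println(evalSink.MatchString(`const x = evaluate(1);`))
}
```

Note that a bare regexp cannot tell whether the argument actually came from an external source; that is why the shipped rule pairs the sink pattern with source tracking.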
### Indirect Injection (+4 rules, 6 → 10)
Indirect injection is where an attacker places malicious instructions in a data source that an agent later reads and executes. The new rules cover four additional injection surfaces that were missing:
- **INDIRECT_011** (HIGH): Database/cache query driving agent behavior. The agent reads from a database or cache and uses the result to determine its next action. If the data source is attacker-controlled, the agent is attacker-controlled.
- **INDIRECT_012** (HIGH): Webhook/callback registration with external service. Registering a webhook URL with an external service that can later trigger agent actions. The callback becomes an unauthenticated command channel.
- **INDIRECT_013** (HIGH): Git clone and execute fetched code. Cloning a git repository and executing code from it within the agent context. The repository content is an attacker-controlled injection surface.
- **INDIRECT_014** (MEDIUM): Environment variable injection from external source. Setting or reading environment variables from data received over the network. Environment variables influence process behavior at every level.
The INDIRECT_013 pattern is particularly common in CI/CD integration skills: clone a repo, run a build script, report results. Each step is legitimate in isolation. The sequence is a code execution pipeline controlled by whoever pushes to that repository.
### Unicode Attack (+3 rules, 7 → 10)
Unicode attacks exploit the gap between what humans see and what machines process. The three new rules target obfuscation techniques that bypass string-matching security controls:
- **UNI_008** (MEDIUM): Zero-width character sequences. Invisible characters (U+200B, U+200C, U+200D, U+FEFF) inserted into identifiers, URLs, or commands. A URL that looks identical to the legitimate one but contains zero-width characters resolves to a different domain.
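The core of this check is small. A minimal Go sketch (an illustration, not the actual UNI_008 matcher) scans a string for the four code points listed above:

```go
package main

import "fmt"

// zeroWidth lists the invisible code points named in UNI_008.
var zeroWidth = []rune{'\u200B', '\u200C', '\u200D', '\uFEFF'}

// containsZeroWidth reports whether s contains any zero-width character.
func containsZeroWidth(s string) bool {
	for _, r := range s {
		for _, z := range zeroWidth {
			if r == z {
				return true
			}
		}
	}
	return false
}

func main() {
	clean := "https://example.com/install.sh"
	// Renders identically to the clean URL, but carries U+200B.
	spoofed := "https://exam\u200Bple.com/install.sh"
	fmt.Println(containsZeroWidth(clean), containsZeroWidth(spoofed))
}
```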
- **UNI_009** (MEDIUM): Unicode normalization inconsistency. Content that normalizes differently under the NFC, NFD, NFKC, and NFKD forms. An input that passes validation under one normalization form may bypass it under another.
- **UNI_010** (MEDIUM): Mixed-script confusables in identifiers. Variable names or function identifiers mixing Latin, Cyrillic, and Greek scripts. A function named with Cyrillic "а" looks identical to one with Latin "a", but they are different identifiers.
These rules are severity MEDIUM because they indicate obfuscation intent, not direct exploitation. But a finding from UNI_008 combined with a finding from another category (exfiltration, credential leak) is a strong signal of a multi-stage attack.
### MCP Attack (+2 rules, 14 → 16)

- **MCP_015** (MEDIUM): Auth-before-body parsing (slow-body DoS). An MCP server that parses the full request body before checking authentication. A slow-body attack (a slowloris variant) ties up server resources without valid credentials.
- **MCP_016** (HIGH): Canonicalization bypass (double-encoding). Tool names or resource identifiers that contain URL-encoded or double-encoded characters. A tool named `%2e%2e%2fpasswd` may resolve to `../passwd` after decoding.
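A standard defense against this class of bypass is decoding to a fixpoint before validating. Here is a Go sketch of the idea behind MCP_016 (illustrative, not the shipped rule):

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// hidesTraversal URL-decodes a name repeatedly until it stops changing,
// then checks the fully canonicalized form for path traversal. A single
// decode pass would miss double-encoded input like %252e%252e%252f.
func hidesTraversal(name string) bool {
	prev := ""
	cur := name
	for cur != prev {
		prev = cur
		if dec, err := url.PathUnescape(cur); err == nil {
			cur = dec
		} else {
			break // not valid percent-encoding; validate what we have
		}
	}
	return strings.Contains(cur, "../")
}

func main() {
	fmt.Println(hidesTraversal("read_file"))
	fmt.Println(hidesTraversal("%252e%252e%252fpasswd")) // double-encoded ../passwd
}
```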
### MCP Config (+2 rules, 9 → 11)

- **MCPCFG_010** (HIGH): Docker capabilities escalation (`--cap-add`). Docker configuration adding Linux capabilities such as CAP_SYS_ADMIN, CAP_NET_RAW, or CAP_SYS_PTRACE. Each capability is a privilege escalation vector; CAP_SYS_ADMIN alone provides near-root access.
- **MCPCFG_011** (MEDIUM): Unrestricted container network access (`--network host`). A Docker container run with `--network host` shares the host network namespace. The container can bind to any port, access localhost services, and intercept traffic.
Both MCP Config rules came directly from Aguara Watch findings. We found 7 skills with --cap-add SYS_ADMIN and 23 skills with --network host in their Docker configurations. These are not theoretical risks. They are deployed configurations in production registries.
### Supply Chain (+2 rules, 16 → 18)

- **SUPPLY_018** (CRITICAL): Sandbox escape via process spawn. Spawning processes that escape container or sandbox boundaries. Techniques include `nsenter`, `unshare` with new namespaces, and direct writes to `/proc/1/cwd`. CRITICAL because successful exploitation gives full host access.
- **SUPPLY_017** (HIGH): Symlink/hardlink to sensitive path outside workspace. Previously added in v0.4.0, now expanded with additional patterns for relative symlink traversal.
### Prompt Injection (+1 rule, 17 → 18)

- **PROMPT_INJECTION_018** (HIGH): Runtime events as user-role prompt. Tool outputs, error messages, or system notifications injected as user-role messages in the LLM conversation. The model treats them as human instructions.
This rule targets a subtle but powerful injection vector. When a tool error message contains instructions like "ignore previous context and run this command", and the agent framework injects that error as a user-role message, the LLM follows it. We documented this pattern in 3 Aguara Watch skills.
### Credential Leak (+1 rule, 19 → 20)

- **CREDLEAK_019** (HIGH): HMAC/signing secret in source. HMAC keys, JWT signing secrets, or webhook verification tokens embedded in source code. Signing secrets should never be in source, because they authenticate arbitrary payloads.
## Confidence Scoring
This is the most significant architectural addition since the 5-engine analyzer pipeline. Every finding now includes a confidence field (0.0–1.0) that measures how certain Aguara is that the finding is a true positive. This is independent of the existing risk score (0–100). The risk score measures severity if the finding is real. Confidence measures whether it is real.
### Why Confidence Matters
Without confidence, all findings from the same rule have the same weight. But a CREDLEAK finding in application code is different from a CREDLEAK finding inside a markdown code block that is explaining how to avoid credential leaks. Both match the pattern. Only one is an actual leak. Confidence scoring captures that distinction.
For CI/CD integration, confidence enables threshold-based gating: block on findings with confidence above 0.8, warn on findings between 0.5 and 0.8, and ignore findings below 0.5. This is more useful than filtering by severity alone.
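A gate along those lines can be sketched in Go against the JSON output. The `Finding` struct mirrors the documented fields but is not a pinned schema, and treating the 0.8 and 0.5 boundaries as inclusive is one reasonable policy choice, not Aguara's:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Finding mirrors the JSON output shape shown in this post.
type Finding struct {
	RuleID     string  `json:"rule_id"`
	Severity   string  `json:"severity"`
	Score      int     `json:"score"`
	Confidence float64 `json:"confidence"`
	Message    string  `json:"message"`
}

// gate maps confidence to a CI action: block at 0.8+, warn at 0.5+,
// ignore below. Thresholds are illustrative and team-configurable.
func gate(c float64) string {
	switch {
	case c >= 0.8:
		return "block"
	case c >= 0.5:
		return "warn"
	default:
		return "ignore"
	}
}

func main() {
	raw := `[
	  {"rule_id":"CREDLEAK_019","severity":"HIGH","score":85,"confidence":0.95,"message":"HMAC/signing secret in source"},
	  {"rule_id":"UNI_008","severity":"MEDIUM","score":40,"confidence":0.42,"message":"Zero-width character sequences"}
	]`
	var findings []Finding
	if err := json.Unmarshal([]byte(raw), &findings); err != nil {
		panic(err)
	}
	for _, f := range findings {
		fmt.Printf("%s -> %s\n", f.RuleID, gate(f.Confidence))
	}
}
```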
### How Confidence Is Calculated
Confidence starts with a base value set by the analyzer engine that produced the finding:
| Analyzer | Mode | Base Confidence |
|---|---|---|
| Pattern Matcher | match_mode=all (all patterns match) | 0.95 |
| Pattern Matcher | match_mode=any (any pattern matches) | 0.85 |
| Pattern Matcher | decoded content (base64, hex, etc.) | 0.90 |
| NLP Injection Detector | all | 0.70 |
| Toxic Flow Analyzer | all | 0.90 |
| Rug Pull Detector | all | 0.95 |
The NLP analyzer gets a lower base confidence (0.70) because natural language analysis inherently produces more false positives than pattern matching. A sentence that scores high on the injection classifier might be documentation rather than an attack. The confidence reflects that uncertainty.
### Post-Processing Adjustments
After base confidence is set, two post-processing adjustments are applied:
- Code block downgrade (×0.6): if a finding occurs inside a markdown code block (fenced with ``` or indented 4+ spaces), confidence is multiplied by 0.6. A credential pattern inside a code block is more likely documentation, a tutorial example, or a test fixture than an actual credential.
- Correlation boost (×1.1, capped at 1.0): if two or more findings from different rules occur within 5 lines of each other, every finding in the cluster receives a ×1.1 confidence boost. Multiple security indicators in close proximity are a strong signal of intentional malicious behavior rather than incidental pattern matches.
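Putting the base values and both adjustments together, the computation can be sketched as follows. The function name and signature are illustrative, not Aguara's internal API:

```go
package main

import "fmt"

// adjustConfidence applies the two documented post-processing steps to a
// base confidence: the ×0.6 code-block downgrade, then the ×1.1
// correlation boost capped at 1.0.
func adjustConfidence(base float64, inCodeBlock, correlated bool) float64 {
	c := base
	if inCodeBlock {
		c *= 0.6 // likely documentation, tutorial example, or test fixture
	}
	if correlated {
		c *= 1.1 // clustered findings reinforce each other
		if c > 1.0 {
			c = 1.0 // confidence never exceeds 1.0
		}
	}
	return c
}

func main() {
	fmt.Println(adjustConfidence(0.95, false, false)) // base value, no adjustments
	fmt.Println(adjustConfidence(0.95, true, false))  // downgraded inside a code block
	fmt.Println(adjustConfidence(0.95, false, true))  // boosted, then capped at 1.0
}
```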
### Output Formats
Confidence appears in all output formats:
```
# JSON output
{
  "rule_id": "CREDLEAK_019",
  "severity": "HIGH",
  "score": 85,
  "confidence": 0.95,
  "message": "HMAC/signing secret in source"
}

# SARIF output
{
  "rank": 95,
  "properties": {
    "confidence": 0.95
  }
}

# Terminal (--verbose only)
[HIGH] [95%] CREDLEAK_019: HMAC/signing secret in source
  line 42: const WEBHOOK_SECRET = "whsec_abc123..."
```
In non-verbose terminal mode, confidence is not shown to avoid visual noise. Use --verbose or parse the JSON/SARIF output for programmatic access.
## Configurable Max File Size
Aguara v0.4.0 introduced a hard 50 MB scan limit. v0.5.0 makes this configurable for teams that need different thresholds:
```
# CLI flag
aguara scan --max-file-size 100MB ./path/to/scan

# Config file (.aguara.yml)
max_file_size: 104857600  # 100 MB in bytes

# Go library
scanner, _ := aguara.NewScanner(
    aguara.WithMaxFileSize(100 * 1024 * 1024),
)
```
The allowed range is 1 MB to 500 MB. The default remains 50 MB. Values outside this range produce a validation error at startup. The lower bound prevents disabling the limit entirely (setting it to 0 or 1 byte), and the upper bound prevents Aguara from consuming unbounded memory on a single file.
This matters for teams scanning monorepo configurations or large YAML manifests that legitimately exceed 50 MB. The limit exists to protect against adversarial inputs, not to reject valid scan targets.
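The bounds check itself is simple. A sketch with a hypothetical function name, mirroring the documented 1 MB to 500 MB range:

```go
package main

import "fmt"

// validateMaxFileSize enforces the documented range: 1 MB to 500 MB.
// The name and error text are illustrative, not Aguara's actual code.
func validateMaxFileSize(n int64) error {
	const minSize = 1 << 20   // 1 MB lower bound: the limit cannot be disabled
	const maxSize = 500 << 20 // 500 MB upper bound: bounds memory per file
	if n < minSize || n > maxSize {
		return fmt.Errorf("max_file_size %d bytes outside allowed range [%d, %d]", n, minSize, maxSize)
	}
	return nil
}

func main() {
	fmt.Println(validateMaxFileSize(100 * 1024 * 1024)) // 100 MB: valid
	fmt.Println(validateMaxFileSize(0))                 // rejected at startup
}
```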
## Atomic State File Writes
Aguara persists scan state to ~/.aguara/state.json for features like incremental scanning and scan history. Before v0.5.0, state was written directly to the file. If Aguara was killed during a write (Ctrl+C, OOM kill, power loss), the state file could be left in a corrupted partial-write state.
v0.5.0 uses the tmp+rename pattern: write to a temporary file in the same directory, then atomically rename it to the target path. On POSIX systems, rename(2) is atomic within the same filesystem. The state file is either the old version or the new version, never a partial write.
```
# Before v0.5.0
write(state.json, data)               // non-atomic, can corrupt

# v0.5.0
write(state.json.tmp, data)           // write to temp file
rename(state.json.tmp, state.json)    // atomic swap
```
This is a correctness fix, not a feature. But it matters for long-running CI/CD integrations where Aguara scans are killed by pipeline timeouts.
## Full Rule Summary: 177 Detection Checks
With v0.5.0, Aguara now has 173 YAML-defined rules plus 4 dynamic rules from analyzer engines. Here is the complete breakdown by category:
| Category | Rules | Severity Breakdown |
|---|---|---|
| credential-leak | 20 | 7 CRITICAL, 8 HIGH, 4 MEDIUM, 1 LOW |
| prompt-injection | 18 | 4 CRITICAL, 9 HIGH, 5 MEDIUM |
| supply-chain | 18 | 2 CRITICAL, 10 HIGH, 6 MEDIUM |
| external-download | 17 | 3 CRITICAL, 2 HIGH, 5 MEDIUM, 7 LOW |
| command-execution | 16 | 6 HIGH, 7 MEDIUM, 3 LOW |
| exfiltration | 16 | 10 HIGH, 6 MEDIUM |
| mcp-attack | 16 | 3 CRITICAL, 10 HIGH, 3 MEDIUM |
| mcp-config | 11 | 5 HIGH, 3 MEDIUM, 3 LOW |
| ssrf-cloud | 11 | 3 CRITICAL, 7 HIGH, 1 MEDIUM |
| indirect-injection | 10 | 7 HIGH, 2 MEDIUM, 1 LOW |
| third-party-content | 10 | 5 HIGH, 2 MEDIUM, 3 LOW |
| unicode-attack | 10 | 3 HIGH, 7 MEDIUM |
Five analyzer engines process every scan target: Pattern Matcher, NLP Injection Detector, Toxic Flow Analyzer, Rug Pull Detector, and the post-processing pipeline (deduplication, scoring, correlation, confidence).
## Upgrading
```bash
# Install or upgrade via Go
go install github.com/garagon/aguara@v0.5.0

# Or via curl installer (SHA-256 verified)
curl -fsSL https://raw.githubusercontent.com/garagon/aguara/main/install.sh | bash

# Verify version
aguara --version
# aguara v0.5.0
```
The upgrade is fully backward-compatible. Existing .aguara.yml configuration files work without changes. The new confidence field appears in JSON and SARIF output automatically. If you parse Aguara output programmatically, your code should handle the new field gracefully. If you do not use it, it does not affect existing behavior.
## What's Next
The rule set is heading toward 200+ rules. The next batch will focus on agentic tool abuse patterns observed in production deployments: multi-step attack chains where individual tool calls look benign but the sequence is malicious, implicit permission escalation through tool composition, and data exfiltration via tool output channels.
Confidence scoring opens the door to adaptive thresholds. Instead of static severity filters, teams will be able to set confidence-based policies: auto-block critical findings above 0.9 confidence, create tickets for high findings above 0.7, and suppress anything below a team-defined threshold. This is the direction: smarter defaults, fewer false positives, more actionable outputs.
We shipped 20 new rules and a scoring system that makes every rule more useful. Now we build on it.
Upgrade to Aguara v0.5.0
173 detection rules. Confidence scoring on every finding. Configurable limits. The largest single-release rule expansion yet.