Security Audit Report — GuardDog Nexus
Date: 2026-05-10
Auditor: Automated security audit
Last updated: 2026-05-11 (consolidated with improvements/final-plan; statuses verified against current codebase)
Scope: Full codebase review — security vulnerabilities, logic errors, missing controls
Summary
| Severity | Count | Fixed | Rejected/Accepted | Mitigated | Open |
|---|---|---|---|---|---|
| CRITICAL | 5 | 2 | 2 | 1 | 0 |
| HIGH | 7 | 2 | 5 | 0 | 0 |
| MEDIUM | 8 | 6 | 0 | 0 | 2 |
| LOW | 6 | 4 | 0 | 0 | 2 |
| Total | 26 | 14 | 7 | 1 | 4 |
14 fixed, 7 closed as rejected/accepted-risk, 1 partially mitigated, 4 remaining open.
CRITICAL (5)
C1. SSRF via webhook downloadUrl ✅ FIXED
Severity: CRITICAL
Problem: downloadUrl from the webhook payload was passed directly to httpx.AsyncClient.get() without validation.
Fix: NEXUS_ALLOWED_HOSTS env var plus _validate_download_url() in core/nexus.py. The function validates the URL scheme (http/https only) and checks the hostname against the allowed-hosts list, which defaults to the Nexus hostname if NEXUS_ALLOWED_HOSTS is not set.
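The scheme and host checks described for C1 can be sketched as follows. This is a minimal illustration and not the code in core/nexus.py: the function name, signature, and error messages are assumptions.

```python
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}


def validate_download_url(url, allowed_hosts):
    """Return url unchanged if scheme and host are acceptable, else raise ValueError.

    Hypothetical helper: allowed_hosts is a set of lowercase hostnames.
    """
    parts = urlsplit(url)
    if parts.scheme not in ALLOWED_SCHEMES:
        raise ValueError(f"unsupported URL scheme: {parts.scheme!r}")
    # .hostname drops userinfo and port, so "http://nexus@evil.example/" yields "evil.example".
    host = (parts.hostname or "").lower()
    if host not in allowed_hosts:
        raise ValueError(f"host not allowed: {host!r}")
    return url
```

A check like this only covers the first request. If the HTTP client follows redirects, each redirect target would need the same validation.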
C2. Webhook secret not enforced by default ❌ ACCEPTED RISK
Severity: CRITICAL
Decision: Internal service; secret is optional. Documented as such.
C3. Default admin credentials ✅ FIXED
Severity: CRITICAL
Fix: Removed BasicAuth from all Nexus API calls (anonymous access).
C4. XSS via LLM report verdict ❌ NOT DANGEROUS
Severity: CRITICAL — downgraded to INFO
Decision: Jinja2 autoescape blocks injection in attributes.
C5. LLM Prompt Injection ⚠️ PARTIALLY MITIGATED
Severity: CRITICAL
Mitigation: System prompt gives priority to system instructions. _validate_report() applies defaults for missing/invalid fields. Raw finding data still in user message.
HIGH (7)
H1. No rate limiting ❌ REJECTED
Severity: HIGH
Decision: Internal service; not exposed to public internet.
H2. Path traversal via download filename ⚠️ LOW RISK
Severity: HIGH — downgraded
Analysis: os.path.basename("../../../etc/passwd") → "passwd", traversal impossible. Risk accepted.
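basename strips directory components, but a few inputs pass through it as an empty or dot-only name. A minimal sketch of a stricter guard, assuming the caller joins the result onto a download directory; safe_filename is a hypothetical helper, not a function in the codebase.

```python
import os.path


def safe_filename(download_name):
    """Reduce a path-like name to its final component and reject unusable results."""
    name = os.path.basename(download_name)
    # basename("pkg/") is "" and basename("..") is "..", neither is a usable file name.
    if name in ("", ".", ".."):
        raise ValueError(f"unusable file name: {download_name!r}")
    return name
```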
H3. Sensitive data in API (source_ip) ❌ REJECTED
Severity: HIGH
Decision: source_ip and initiator are features. Internal service — acceptable.
H4. No authentication on API ❌ REJECTED
Severity: HIGH
Decision: Internal service; not exposed to public internet.
H5. Memory leak in lock dictionaries ✅ FIXED
Severity: HIGH
Fix: Background cleanup tasks every 30 minutes in main.py for both _url_locks (harvester) and _llm_locks (web). Tasks spawned via asyncio.create_task() in lifespan, gracefully cancelled on shutdown.
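The periodic cleanup described for H5 might look roughly like this. It is a sketch under assumptions: the lock tables are taken to be plain dicts of asyncio.Lock, and prune_unlocked and cleanup_loop are hypothetical names.

```python
import asyncio


def prune_unlocked(locks):
    """Delete entries whose lock is not currently held; return the number removed."""
    idle = [key for key, lock in locks.items() if not lock.locked()]
    for key in idle:
        del locks[key]
    return len(idle)


async def cleanup_loop(locks, interval_seconds=30 * 60):
    """Prune idle locks forever; meant to be cancelled on shutdown."""
    while True:
        await asyncio.sleep(interval_seconds)
        prune_unlocked(locks)


async def run_with_cleanup(locks, serve):
    """Spawn the cleanup task, run serve(), then cancel the task cleanly."""
    task = asyncio.create_task(cleanup_loop(locks))
    try:
        await serve()
    finally:
        task.cancel()
        try:
            await task
        except asyncio.CancelledError:
            pass
```

One caveat: an entry can be pruned in the short window after a release and before a queued waiter resumes, so a production version may need to track waiters or idle time as well.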
H6. Race condition in URL locking ✅ FIXED
Severity: HIGH
Fix: DB re-check (sha256 dedup) happens inside the URL lock critical section, preventing parallel scans of the same asset.
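The re-check inside the critical section follows the usual check, lock, check-again pattern. In the sketch below a set stands in for the database lookup, and scan_once and its parameters are hypothetical.

```python
import asyncio


async def scan_once(url, sha256, seen, lock, scan):
    """Run scan(url) at most once per sha256; return True if this call did the scan."""
    if sha256 in seen:  # cheap check before waiting on the lock
        return False
    async with lock:
        # Re-check after acquiring: another task may have finished while we waited.
        if sha256 in seen:
            return False
        await scan(url)
        seen.add(sha256)
        return True
```

Without the second check, every task that passed the first check before the lock was taken would scan the same asset once its turn came.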
H7. CSV export unbounded ❌ REJECTED
Severity: HIGH
Decision: Acceptable for internal tool. Not exposed to public.
MEDIUM (8)
M1. No LLM response schema validation ✅ FIXED
Severity: MEDIUM
File: core/llm.py:25-28
Fix: _validate_report() applies defaults for missing fields:
- verdict → "unknown"
- summary → "No summary provided"
- analysis → "No analysis provided"
- severity_rating → "unknown"

Also unwraps JSON wrapped in markdown code fences (a json-tagged fence around the payload).
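The default-filling and fence-unwrapping behaviour can be illustrated with a short sketch. It is not the code in core/llm.py: parse_report is a hypothetical name, and treating non-string or blank values as invalid is an assumption.

```python
import json
import re

REPORT_DEFAULTS = {
    "verdict": "unknown",
    "summary": "No summary provided",
    "analysis": "No analysis provided",
    "severity_rating": "unknown",
}

_FENCE = "`" * 3  # built at runtime so the example can itself sit inside a fence
_FENCED_JSON = re.compile(
    r"^" + _FENCE + r"(?:json)?\s*\n(.*?)\n?" + _FENCE + r"\s*$", re.DOTALL
)


def parse_report(raw):
    """Parse an LLM reply into a dict, filling defaults for missing or invalid fields."""
    text = raw.strip()
    match = _FENCED_JSON.match(text)
    if match:
        text = match.group(1)  # keep only the payload between the fences
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        data = {}
    if not isinstance(data, dict):
        data = {}
    report = dict(data)
    for field, default in REPORT_DEFAULTS.items():
        value = report.get(field)
        if not isinstance(value, str) or not value.strip():
            report[field] = default
    return report
```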
M2. No CSRF protection ⬜ OPEN
Severity: MEDIUM
File: routes/web.py:205-274
Problem: POST /api/v1/findings/{id}/analyze has no CSRF token. Although the service is internal, a malicious page opened in the browser of a user who can reach the service could submit this request and trigger unwanted LLM analysis.
Suggested fix: Add a CSRF middleware or token check for state-changing POST endpoints.
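Because the API has no sessions to bind a token to, one stateless option is to compare the Origin (or Referer) header against the service's own host. The sketch below assumes header names have already been lowercased; is_same_origin and expected_host are hypothetical.

```python
from urllib.parse import urlsplit


def is_same_origin(headers, expected_host):
    """Return True only if Origin (or, failing that, Referer) names expected_host.

    headers: dict with lowercase header names.
    expected_host: "host" or "host:port" exactly as browsers would send it.
    """
    source = headers.get("origin") or headers.get("referer")
    if not source:
        return False  # strict: requests without either header are refused
    return urlsplit(source).netloc.lower() == expected_host.lower()
```

The strict branch also refuses non-browser clients that send neither header, so a real deployment would have to decide whether such clients need an exemption.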
M3. No security headers ✅ FIXED
Severity: MEDIUM
File: main.py:95-113
Fix: SecurityHeadersMiddleware sets on all responses:
- X-Content-Type-Options: nosniff
- X-Frame-Options: DENY
- X-XSS-Protection: 1; mode=block
- Referrer-Policy: strict-origin-when-cross-origin
- Permissions-Policy: geolocation=(), microphone=()
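For illustration, a middleware of this kind can be written as a plain ASGI wrapper with no framework dependency. The class below is a sketch and not the SecurityHeadersMiddleware in main.py, whose implementation is not shown in this report.

```python
SECURITY_HEADERS = [
    (b"x-content-type-options", b"nosniff"),
    (b"x-frame-options", b"DENY"),
    (b"x-xss-protection", b"1; mode=block"),
    (b"referrer-policy", b"strict-origin-when-cross-origin"),
    (b"permissions-policy", b"geolocation=(), microphone=()"),
]


class SecurityHeadersASGI:
    """Hypothetical pure-ASGI wrapper that appends the headers to every HTTP response."""

    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return

        async def send_with_headers(message):
            if message["type"] == "http.response.start":
                existing = list(message.get("headers", []))
                present = {name.lower() for name, _ in existing}
                # Do not duplicate a header the application already set.
                extra = [(n, v) for n, v in SECURITY_HEADERS if n not in present]
                message = {**message, "headers": existing + extra}
            await send(message)

        await self.app(scope, receive, send_with_headers)
```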
M4. SQLite without WAL mode ⬜ OPEN
Severity: MEDIUM
File: db/engine.py:12
Problem: No PRAGMA journal_mode=WAL — concurrent readers block writers, causing degraded performance under load.
Suggested fix: Add WAL mode in connection setup:
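    from sqlalchemy import text

    async with _engine.connect() as conn:
        # Switch the journal to write-ahead logging; the mode is stored in the database file.
        await conn.execute(text("PRAGMA journal_mode=WAL"))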
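Whether the pragma took effect can be read from its result row. The sketch below uses the standard-library sqlite3 module instead of the project's async engine, and enable_wal is a hypothetical helper.

```python
import sqlite3


def enable_wal(db_path):
    """Request WAL mode and return the journal mode SQLite reports back."""
    conn = sqlite3.connect(db_path)
    try:
        # The pragma returns one row holding the mode now in effect.
        return conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    finally:
        conn.close()
```

For a file-backed database the call returns "wal". An in-memory database cannot use WAL and reports "memory", which is worth remembering when tests run against :memory:.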
M5. Scoped npm packages not supported ✅ FIXED
Severity: MEDIUM
File: core/nexus.py:75-80
Fix: extract_npm_info handles @scope/name scoped packages:
    if parts[1].startswith("@"):
        name = f"{parts[1]}/{parts[2]}"
        short_name = parts[2]
M6. Dashboard stats — potential IndexError ✅ FIXED
Severity: MEDIUM
File: routes/api_scans.py:146-148
Fix: Guard checks latest is non-empty and has started_at:
latest[0].started_at.isoformat() if latest and latest[0].started_at else None
M7. Error message HTML escaping ✅ FIXED
Severity: MEDIUM
File: web/templates/scan_detail.html:30
Fix: Jinja2 autoescape handles HTML in scan.error_message. No additional escaping required.
M8. Unknown ecosystem defaults to pypi ✅ FIXED
Severity: MEDIUM
File: routes/webhooks.py:58-69
Fix: _detect_ecosystem() returns None for unknown formats; webhook handler rejects with "unknown_ecosystem" error.
LOW (6)
L1. Dockerfile grep hack ✅ FIXED
Severity: LOW
Fix: Replaced with uv pip install . --system.
L2. Health check without DB ✅ FIXED
Severity: LOW
File: main.py:139-140
Fix: /health/dependencies endpoint checks database connectivity and Nexus API reachability.
L3. No backup strategy for SQLite ⬜ OPEN
Severity: LOW
Risk: Crash → corrupted database → data loss.
Suggested fix: Add documentation for regular backups via cron or a backup script. Consider PostgreSQL for production deployments.
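A backup script for L3 could rely on SQLite's online backup API, which copies a consistent snapshot even while the application holds the database open. A minimal sketch with hypothetical paths and function name:

```python
import sqlite3


def backup_sqlite(source_path, dest_path):
    """Copy the database at source_path into dest_path using sqlite3's backup API."""
    src = sqlite3.connect(source_path)
    dst = sqlite3.connect(dest_path)
    try:
        src.backup(dst)  # copies all pages of the source into the destination
    finally:
        dst.close()
        src.close()
```

A cron job could call this with a dated destination file name. Copying the raw database file instead risks capturing a half-written state.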
L4. Dead code — parse_package_path ✅ FIXED
Severity: LOW
File: core/nexus.py:113
Resolution: Function is actively used in routes/web.py and routes/api_packages.py. Not dead code.
L5. Hardcoded LLM API base URL ⬜ OPEN
Severity: LOW
File: constants.py:140
Problem: LLM_DEFAULT_API_BASE = "https://api.openai.com/v1" — unexpected API calls for users of local models who forget to set LLM_API_BASE.
Suggested fix: Either log a warning at startup or change default to an empty/required value.
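The warning variant of the L5 fix could be as small as the following sketch. The logger name and resolve_llm_api_base are assumptions; only LLM_API_BASE and LLM_DEFAULT_API_BASE come from the finding.

```python
import logging
import os

logger = logging.getLogger("guarddog_nexus.config")  # hypothetical logger name

LLM_DEFAULT_API_BASE = "https://api.openai.com/v1"


def resolve_llm_api_base(env=None):
    """Return LLM_API_BASE if set, otherwise warn and fall back to the default."""
    env = os.environ if env is None else env
    value = env.get("LLM_API_BASE", "").strip()
    if value:
        return value
    logger.warning(
        "LLM_API_BASE is not set; LLM requests will go to %s", LLM_DEFAULT_API_BASE
    )
    return LLM_DEFAULT_API_BASE
```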
L6. Unknown ecosystem defaults to pypi (webhook) ✅ FIXED
Severity: LOW
File: routes/webhooks.py:62
Fix: Same as M8. _detect_ecosystem() returns None for unknown formats; webhook rejects.
Implementation Plan
Phase 1 — P0 (Critical) — COMPLETED
| # | Task | Status |
|---|---|---|
| 1 | SSRF protection | ✅ FIXED |
| 2 | Mandatory WEBHOOK_SECRET | ❌ ACCEPTED |
| 3 | Remove default Nexus credentials | ✅ FIXED |
| 4 | LLM verdict whitelist + prompt injection | ⚠️ PARTIAL |
| 5 | Path traversal fix | ⚠️ LOW RISK |
Phase 2 — P1 (High) — COMPLETED
| # | Task | Status |
|---|---|---|
| 6 | Rate limiting | ❌ REJECTED |
| 7 | API authentication | ❌ REJECTED |
| 8 | Memory leak fix for locks | ✅ FIXED |
| 9 | Race condition fix | ✅ FIXED |
| 10 | Remove source_ip from public API | ❌ REJECTED |
| 11 | CSV export auth + limit | ❌ REJECTED |
Phase 3 — P2 (Medium)
| # | Task | Status |
|---|---|---|
| 12 | LLM response validation (Pydantic/defaults) | ✅ FIXED |
| 13 | CSRF protection | ⬜ OPEN |
| 14 | Security headers middleware | ✅ FIXED |
| 15 | SQLite WAL mode | ⬜ OPEN |
| 16 | Scoped npm support | ✅ FIXED |
| 17 | Dashboard None guard | ✅ FIXED |
| 18 | Reject unknown ecosystem | ✅ FIXED |
Phase 4 — P3 (Low)
| # | Task | Status |
|---|---|---|
| 19 | Dockerfile deps | ✅ FIXED |
| 20 | Health check DB ping | ✅ FIXED |
| 21 | Backup strategy docs | ⬜ OPEN |
| 22 | Hardcoded LLM API base URL | ⬜ OPEN |
Remaining Open Items (4)
| # | Severity | Finding | Recommendation |
|---|---|---|---|
| M2 | MEDIUM | No CSRF protection on POST endpoints | Add CSRF middleware or token validation |
| M4 | MEDIUM | SQLite without WAL mode | Add PRAGMA journal_mode=WAL in engine setup |
| L3 | LOW | No backup strategy for SQLite | Document backup procedures or switch to PostgreSQL |
| L5 | LOW | Hardcoded LLM default API base URL | Log warning on startup or require explicit configuration |
Test Coverage Gaps
The existing 137 tests (101 unit + 36 e2e) do NOT cover:
- SSRF prevention (malicious downloadUrl)
- Webhook signature validation with empty secret
- Path traversal in download URLs
- Rate limiting on webhook endpoint
- Authentication on API endpoints
- LLM prompt injection
- CSRF protection (M2 — open)
- Security headers presence
- SQLite WAL mode behavior
Recommendations
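- Immediate: No critical items remain open. C1 and C3 are fixed, C2 is accepted risk, C4 was downgraded to INFO, and C5 is partially mitigated (raw finding data still reaches the LLM in the user message).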
- Short-term: Address M2 (CSRF) and M4 (WAL mode) — both are straightforward, low-risk fixes.
- Long-term: Address L3 (backup strategy) and L5 (LLM default URL) during routine maintenance.
- Ongoing: Add security-focused tests for resolved findings to prevent regressions.
Notes
- Consolidation: This document supersedes improvements.md and final-plan.md (deleted). All verified fixes from those plans are incorporated.
- LLM retry: Implemented with exponential backoff (2s, 4s, 8s, max 3 attempts) in core/llm.py:126-152.
- Lock cleanup: Background tasks in main.py:59-75 clean up _url_locks and _llm_locks every 30 minutes.
- Race condition: SHA256 dedup check runs inside URL lock critical section in harvester.
- Scoped npm: extract_npm_info in core/nexus.py:75-80 handles @scope/name packages.
- Dashboard guard: routes/api_scans.py:147 checks latest and latest[0].started_at before access.
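The LLM retry note can be illustrated with a small backoff helper. The note does not say whether the 8s delay precedes a further call, so this sketch assumes one initial call plus one retry after each listed delay. retry_with_backoff is a hypothetical name and is not the code in core/llm.py.

```python
import asyncio


async def retry_with_backoff(call, delays=(2.0, 4.0, 8.0), sleep=asyncio.sleep):
    """Await call(); on failure wait for the next delay and try again.

    Assumption: one initial attempt plus one retry per entry in delays.
    """
    for delay in delays:
        try:
            return await call()
        except Exception:  # a real client would catch only transient errors
            await sleep(delay)
    return await call()  # last attempt: any error propagates to the caller
```

The sleep parameter is injectable so tests can record the delays instead of waiting for them.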