# Security Audit Report — GuardDog Nexus
**Date:** 2026-05-10
**Auditor:** Automated security audit
**Last updated:** 2026-05-11
**Scope:** Full codebase review — security vulnerabilities, logic errors, missing controls
---
## Summary
| Severity | Count | Fixed | Rejected | Remaining |
|----------|-------|-------|----------|-----------|
| CRITICAL | 5 | 2 | 2 | 1 |
| HIGH | 7 | 2 | 3 | 2 |
| MEDIUM | 8 | 3 | 0 | 5 |
| LOW | 6 | 2 | 0 | 4 |
| **Total**| **26**| **9** | **5** | **12** |
---
## CRITICAL (5)
### C1. SSRF via webhook downloadUrl ✅ FIXED
**Severity:** CRITICAL
**Fix:** `NEXUS_ALLOWED_HOSTS` env var + `_validate_download_url()` in `core/nexus.py`.
**Problem:** `downloadUrl` from the webhook payload is passed directly to `httpx.AsyncClient.get()` without validation.
```python
download_url = asset.get("downloadUrl") or _build_download_url(repository, asset_path)
# ...
response = await client.get(download_url) # no validation
```
**Real-world impact:** An attacker sends a webhook with `downloadUrl: "http://169.254.169.254/latest/meta-data/iam/security-credentials/"` → the server fetches the cloud IAM credentials.
**Fix:** Validate the URL scheme (http/https only), block private, loopback, and link-local ranges (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16, 127.0.0.0/8, 169.254.0.0/16, ::1), and optionally restrict the host to the one in `config.nexus_url`.
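**Sketch:** The checks listed in the fix can be expressed with the standard library alone. This is a minimal illustration, not the project's `_validate_download_url()`: the names `is_safe_download_url` and `allowed_hosts` are hypothetical, and the sketch does not cover redirects or DNS rebinding, since the host is resolved at check time rather than at request time.

```python
import ipaddress
import socket
from typing import Optional, Set
from urllib.parse import urlsplit


def _is_internal(addr) -> bool:
    """True for addresses that should never be fetched on behalf of a webhook."""
    return (
        addr.is_private or addr.is_loopback or addr.is_link_local
        or addr.is_reserved or addr.is_multicast or addr.is_unspecified
    )


def is_safe_download_url(url: str, allowed_hosts: Optional[Set[str]] = None) -> bool:
    """Accept only http(s) URLs whose host maps exclusively to public addresses."""
    try:
        parts = urlsplit(url)
        host = parts.hostname
    except ValueError:
        return False
    if parts.scheme not in ("http", "https") or not host:
        return False
    # Hypothetical allow-list, e.g. parsed from an env var such as NEXUS_ALLOWED_HOSTS.
    if allowed_hosts is not None and host not in allowed_hosts:
        return False
    try:
        # Literal IP in the URL: check it directly, no DNS lookup needed.
        addresses = [ipaddress.ip_address(host)]
    except ValueError:
        try:
            # Hostname: every address it resolves to must be public.
            infos = socket.getaddrinfo(host, None)
        except (OSError, UnicodeError):
            return False
        addresses = [ipaddress.ip_address(info[4][0]) for info in infos]
    return not any(_is_internal(addr) for addr in addresses)
```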
---
### C2. Webhook secret not enforced by default ❌ ACCEPTED RISK
**Severity:** CRITICAL
**Decision:** Internal service; the secret is optional.
---
### C3. Default admin credentials ✅ FIXED
**Severity:** CRITICAL
**Fix:** Removed BasicAuth from all requests to Nexus (anonymous access).
---
### C4. XSS via LLM report verdict ❌ NOT DANGEROUS
**Severity:** CRITICAL — downgraded to INFO
**Decision:** Jinja2 autoescape blocks injection in attributes.
---
### C5. LLM Prompt Injection ⚠️ PARTIALLY MITIGATED
**Severity:** CRITICAL
**Mitigation:** The system prompt tells the model to prioritize system instructions. Raw finding data is still passed in the user message, so injected text can still reach the model.
---
## HIGH (7)
### H1. No rate limiting ❌ REJECTED
### H2. Path traversal ⚠️ LOW RISK
**Severity:** HIGH — downgraded
**Analysis:** `os.path.basename("../../../etc/passwd")` → `"passwd"`, so traversal is not possible.
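**Sketch:** The `basename` behaviour the analysis relies on, wrapped in a small helper. The name `safe_filename` is hypothetical. The sketch uses `posixpath` instead of `os.path` so the result does not depend on the host OS, and it additionally rejects results such as `..` or an empty name, which `basename` alone would let through.

```python
import posixpath


def safe_filename(asset_path: str) -> str:
    """Reduce an untrusted asset path to a bare file name."""
    # Treat backslashes as separators too, so Windows-style paths cannot slip through.
    name = posixpath.basename(asset_path.replace("\\", "/"))
    if name in ("", ".", ".."):
        raise ValueError(f"unusable file name in path: {asset_path!r}")
    return name
```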
---
### H3. Sensitive data in API ❌ REJECTED (source_ip is a feature)
### H4. No authentication ❌ REJECTED (internal service)
### H5. Memory leak in locks ✅ FIXED (bg cleanup every 30min)
### H6. Race condition in URL locking ✅ FIXED (DB re-check inside lock)
### H7. CSV export bounded ❌ REJECTED (acceptable for internal tool)
---
## MEDIUM (8)
### M1. No LLM response schema validation
**Severity:** MEDIUM
**File:** `core/llm.py:80-82`
**Problem:** The LLM response is parsed as JSON but not validated against a schema. If `report.verdict` is missing, Jinja2 renders an empty string and the verdict styling breaks.
**Fix:** Pydantic model to validate the LLM response.
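**Sketch:** A stdlib-only stand-in for the proposed Pydantic model, meant only to show what the validation has to reject. Only `verdict` is named in this report; the `summary` field, the names `LLMReport` and `parse_llm_report`, and the verdict vocabulary are assumptions.

```python
import json
from dataclasses import dataclass

# Assumed verdict vocabulary; the real set of values is not given in this report.
ALLOWED_VERDICTS = frozenset({"malicious", "suspicious", "benign"})


@dataclass(frozen=True)
class LLMReport:
    verdict: str
    summary: str  # hypothetical second field, for illustration only


def parse_llm_report(raw: str) -> LLMReport:
    """Parse the model's JSON reply and reject anything outside the expected shape."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"LLM reply is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("LLM reply must be a JSON object")
    verdict = data.get("verdict")
    # A missing or unexpected verdict fails here instead of reaching the template.
    if not isinstance(verdict, str) or verdict not in ALLOWED_VERDICTS:
        raise ValueError(f"unexpected verdict: {verdict!r}")
    summary = data.get("summary", "")
    if not isinstance(summary, str):
        raise ValueError("summary must be a string")
    return LLMReport(verdict=verdict, summary=summary)
```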
---
### M2. No CSRF protection
**Severity:** MEDIUM
**File:** `routes/web.py:205-274`
**Problem:** POST `/api/v1/findings/{id}/analyze` has no CSRF token.
**Fix:** Add a CSRF token to all POST endpoints.
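**Sketch:** One common way to build such a token is an HMAC over a random nonce and a per-browser identifier. The sketch assumes the service has a server-side secret and some session identifier (for example a cookie value), neither of which is described in this report; both function names are hypothetical.

```python
import hashlib
import hmac
import secrets


def _sign(secret_key: bytes, session_id: str, nonce: str) -> str:
    """HMAC-SHA256 over the session id and nonce, hex encoded."""
    return hmac.new(secret_key, f"{session_id}:{nonce}".encode(), hashlib.sha256).hexdigest()


def issue_csrf_token(secret_key: bytes, session_id: str) -> str:
    """Create a token of the form '<nonce>.<mac>' bound to one session."""
    nonce = secrets.token_urlsafe(16)
    return f"{nonce}.{_sign(secret_key, session_id, nonce)}"


def verify_csrf_token(secret_key: bytes, session_id: str, token: str) -> bool:
    """Check that the token was issued for this session and was not altered."""
    nonce, sep, mac = token.partition(".")
    if not sep or not nonce or not mac:
        return False
    expected = _sign(secret_key, session_id, nonce)
    # Compare as bytes in constant time; str inputs would have to be ASCII-only.
    return hmac.compare_digest(mac.encode(), expected.encode())
```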
---
### M3. No security headers
**Severity:** MEDIUM
**File:** `main.py`
**Problem:** Missing CSP, X-Content-Type-Options, X-Frame-Options, and X-XSS-Protection headers.
**Fix:** Middleware that sets security headers.
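**Sketch:** A framework-independent ASGI middleware that appends the headers. It assumes `main.py` exposes an ASGI application, which this report does not state; the header values are illustrative defaults, and `X-XSS-Protection` is left out of the sketch because it is deprecated.

```python
# Illustrative values; a real policy has to match what the templates load.
SECURITY_HEADERS = [
    (b"content-security-policy", b"default-src 'self'"),
    (b"x-content-type-options", b"nosniff"),
    (b"x-frame-options", b"DENY"),
]


class SecurityHeadersMiddleware:
    """ASGI middleware that adds fixed security headers to every HTTP response."""

    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            # WebSocket and lifespan events pass through untouched.
            await self.app(scope, receive, send)
            return

        async def send_with_headers(message):
            if message["type"] == "http.response.start":
                headers = list(message.get("headers", []))
                present = {name.lower() for name, _ in headers}
                # Do not override a header the application set itself.
                headers += [(k, v) for k, v in SECURITY_HEADERS if k not in present]
                message = {**message, "headers": headers}
            await send(message)

        await self.app(scope, receive, send_with_headers)
```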
---
### M4. SQLite without WAL mode
**Severity:** MEDIUM
**File:** `db/engine.py:12`
**Problem:** Concurrent readers block writers → poor performance under load.
**Fix:** `PRAGMA journal_mode=WAL` in connection setup.
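**Sketch:** Enabling WAL on a plain `sqlite3` connection. How `db/engine.py` actually opens its connections is not shown in this report, so the helper name and the `busy_timeout` value are assumptions. The PRAGMA reports the mode actually in effect, so the sketch checks the result instead of assuming success.

```python
import sqlite3


def connect_with_wal(db_path: str) -> sqlite3.Connection:
    """Open a SQLite database file and switch it to write-ahead logging."""
    conn = sqlite3.connect(db_path)
    # The PRAGMA returns the journal mode that is now active.
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    if mode.lower() != "wal":
        conn.close()
        raise RuntimeError(f"could not enable WAL, journal mode is {mode!r}")
    # Wait up to 5 s for a competing writer instead of failing immediately.
    conn.execute("PRAGMA busy_timeout=5000")
    return conn
```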
---
### M5. Scoped npm packages not supported
**Severity:** MEDIUM
**File:** `core/nexus.py:54-70`
**Problem:** `extract_npm_info` returns `None` for `@scope/package`, so those packages are never scanned.
**Fix:** Update the extractor to handle scoped packages.
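**Sketch:** A parser that accepts both scoped and unscoped tarball paths. It assumes the npm registry layout `<name>/-/<file>-<version>.tgz`, where the file name uses the unscoped part of the package name; the function name `parse_npm_asset_path` is hypothetical and is not the project's `extract_npm_info`.

```python
import re
from typing import Optional, Tuple

# Package name: optional "@scope/" prefix followed by a single path segment.
_NAME = re.compile(r"(?:@[^/@]+/)?[^/@]+")


def parse_npm_asset_path(path: str) -> Optional[Tuple[str, str]]:
    """Return (package_name, version) for an npm tarball path, else None."""
    name, sep, filename = path.lstrip("/").partition("/-/")
    if not sep or "/" in filename or not filename.endswith(".tgz"):
        return None
    if _NAME.fullmatch(name) is None:
        return None
    # "@babel/core" is stored as "core-<version>.tgz".
    prefix = name.rsplit("/", 1)[-1] + "-"
    if not filename.startswith(prefix):
        return None
    version = filename[len(prefix):-len(".tgz")]
    return (name, version) if version else None
```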
---
### M6. Dashboard stats — potential IndexError
**Severity:** MEDIUM
**File:** `routes/api_scans.py:145-147`
**Problem:** `dashboard["latest_flagged"][0]` raises `IndexError` if `latest_flagged` is empty.
```python
"latest_scan_at": dashboard["latest_flagged"][0].started_at.isoformat()
```
**Fix:** Guard with `if dashboard.get("latest_flagged")`.
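**Sketch:** The guard as a small helper, assuming `dashboard` is a mapping and each entry has a `started_at` datetime, as the snippet above suggests. The helper name `latest_scan_at` is hypothetical, and returning `None` for the empty case is one possible choice.

```python
from typing import Any, Mapping, Optional


def latest_scan_at(dashboard: Mapping[str, Any]) -> Optional[str]:
    """ISO timestamp of the newest flagged scan, or None when there is none."""
    # Covers both a missing key and an empty list.
    flagged = dashboard.get("latest_flagged") or []
    return flagged[0].started_at.isoformat() if flagged else None
```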
---
### M7. Error message HTML escaping
**Severity:** MEDIUM
**File:** `web/templates/scan_detail.html:30`
**Problem:** `scan.error_message` is rendered in the template; if it contains HTML/JS, it can break the UI.
**Fix:** Jinja2 autoescape handles this, but explicit escaping for `code` fields is worth adding.
---
### M8. Unknown ecosystem defaults to pypi ✅ FIXED
**Severity:** MEDIUM
**Fix:** `_detect_ecosystem()` returns `None` → the webhook is rejected with `"unknown_ecosystem"`.
**Duplicate:** L6.
---
## LOW (6)
### L1. Dockerfile grep hack ✅ FIXED (`uv pip install . --system`)
### L2. Health check without DB ✅ FIXED (`/health/dependencies`)
---
### L3. No backup strategy for SQLite
**Severity:** LOW
**Risk:** Crash → corrupted database → data loss.
**Fix:** Regular backups via cron, or switch to PostgreSQL for production.
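**Sketch:** A backup step that a cron job could call, using SQLite's online backup API through `sqlite3.Connection.backup`, which copies a consistent snapshot even while the database is in use. The function name and the paths are hypothetical.

```python
import sqlite3


def backup_sqlite(src_path: str, dest_path: str) -> None:
    """Copy a live SQLite database to dest_path as a consistent snapshot."""
    src = sqlite3.connect(src_path)
    dest = sqlite3.connect(dest_path)
    try:
        # Copies all pages; other connections to the source can keep working.
        src.backup(dest)
    finally:
        dest.close()
        src.close()
```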
---
### L4. Dead code — `parse_package_path` unused in harvester
**Severity:** LOW
**File:** `core/nexus.py:93-99`
**Problem:** The function is defined but not used in the harvester pipeline.
**Fix:** Remove it or integrate it.
---
### L5. Hardcoded LLM API base URL
**Severity:** LOW
**File:** `constants.py:139`
**Problem:** The default `https://api.openai.com/v1` causes unexpected API calls for users running local models.
**Fix:** Use a better default or emit a warning at startup.
---
### L6. Unknown ecosystem defaults to pypi (webhook)
**Severity:** LOW
**File:** `routes/webhooks.py:62`
**Problem:** An unknown format falls back to pypi, so Maven/NuGet webhooks are scanned as PyPI packages.
**Fix:** Explicitly reject unknown ecosystems.
---
## Implementation Plan
### Phase 1 — P0 (Critical)
| # | Task | Status |
|---|------|--------|
| 1 | SSRF protection | ✅ FIXED |
| 2 | Mandatory WEBHOOK_SECRET | ❌ ACCEPTED |
| 3 | Remove default Nexus credentials | ✅ FIXED |
| 4 | LLM verdict whitelist + prompt injection | ⚠️ PARTIAL |
| 5 | Path traversal fix | ⚠️ LOW RISK |
### Phase 2 — P1 (High)
| # | Task | Status |
|---|------|--------|
| 6 | Rate limiting | ❌ REJECTED |
| 7 | API authentication | ❌ REJECTED |
| 8 | Memory leak fix for locks | ✅ FIXED |
| 9 | Race condition fix | ✅ FIXED |
| 10 | Remove source_ip from public API | ❌ REJECTED |
| 11 | CSV export auth + limit | ❌ REJECTED |
### Phase 3 — P2 (Medium)
| # | Task | Status |
|---|------|--------|
| 12 | LLM response validation (Pydantic) | ⬜ |
| 13 | CSRF protection | ⬜ |
| 14 | Security headers middleware | ⬜ |
| 15 | SQLite WAL mode | ⬜ |
| 16 | Scoped npm support | ⬜ |
| 17 | Dashboard None guard | ⬜ |
| 18 | `serialize_finding` instead of `**f.data` | ✅ FIXED |
| 19 | `_scan_component` try/except | ✅ FIXED |
| 20 | Reject unknown ecosystem | ✅ FIXED |
### Phase 4 — P3 (Low)
| # | Task | Status |
|---|------|--------|
| 21 | Dockerfile deps | ✅ FIXED |
| 22 | Health check DB ping | ✅ FIXED |
| 23 | Backup strategy docs | ⬜ |
| 24 | Reject unknown ecosystems | ✅ FIXED (duplicate) |
---
## Test Coverage Gaps
The existing 85 tests do NOT cover:
- [ ] SSRF prevention (malicious downloadUrl)
- [ ] Webhook signature validation with empty secret
- [ ] Path traversal in download URLs
- [ ] Rate limiting on webhook endpoint
- [ ] Authentication on API endpoints
- [ ] LLM prompt injection
- [ ] LLM response schema validation
- [ ] CSRF protection
- [ ] Security headers presence
- [ ] Memory leak in lock dictionaries
- [ ] Race condition in URL locking
- [ ] Scoped npm package extraction
- [ ] Dashboard IndexError on empty data
---
## Recommendations
1. **Immediate:** Implement C1-C5 before any production deployment
2. **Short-term:** Implement H1-H7 within first sprint
3. **Medium-term:** Implement M1-M8 within first month
4. **Long-term:** Implement L1-L6 during routine maintenance
5. **Ongoing:** Add security-focused tests for all findings above