fix: audit - 19 fixes for security, reliability, and UI, plus 16 new tests

- S4: bump jinja2>=3.1.4, python-multipart>=0.0.18, httpx>=0.28.0
- S5: _detect_ecosystem: fall back to DEFAULT_ECOSYSTEM for unknown formats
- S6: harvester: log.exception() instead of log.error()
- S8: _scan_component: urlencode the query parameters
- P1: scanner: proc.kill() on timeout
- P3: api_packages: selectinload(Scan.findings), removes the N+1 query
- P4+P5: fix _url_locks and _llm_locks leaking on early return
- P6: DB reaper: reset {'status':'analyzing'} at startup
- UI: htmx pagination, filters keep the flagged state, 404 page uses the layout
- UI: overflow-x for tables on mobile, full stats on the dashboard
- UI: i18n for statuses in _status_badge, urlencode package_name
- 16 new tests: analyze endpoint (6), scanner errors (4),
  webhook signature (2), llm client (4)
Marker689
2026-05-10 10:45:44 +03:00
parent d483a8b21d
commit 1341404568
31 changed files with 575 additions and 152 deletions

2
.gitignore vendored

@@ -11,3 +11,5 @@ data/
.env .env
.venv/ .venv/
venv/ venv/
.agents/
skills-lock.json

View File

@@ -63,6 +63,7 @@ After startup:
| `TEMP_DIR` | `/tmp/guarddog-nexus` | Temporary download directory | | `TEMP_DIR` | `/tmp/guarddog-nexus` | Temporary download directory |
| `MAX_CONCURRENT_SCANS` | `4` | Maximum simultaneous GuardDog processes | | `MAX_CONCURRENT_SCANS` | `4` | Maximum simultaneous GuardDog processes |
| `LLM_ENABLED` | `0` | Set to `1` to enable LLM analysis | | `LLM_ENABLED` | `0` | Set to `1` to enable LLM analysis |
| `LLM_AUTO_ANALYZE` | `0` | Set to `1` to auto-analyze after scan; `0` = manual mode via UI button |
| `LLM_API_KEY` | _(empty)_ | API key (OpenAI / Groq / Ollama / etc.) | | `LLM_API_KEY` | _(empty)_ | API key (OpenAI / Groq / Ollama / etc.) |
| `LLM_API_BASE` | `https://api.openai.com/v1` | OpenAI-compatible base URL | | `LLM_API_BASE` | `https://api.openai.com/v1` | OpenAI-compatible base URL |
| `LLM_MODEL` | `gpt-4o-mini` | Model name | | `LLM_MODEL` | `gpt-4o-mini` | Model name |
@@ -128,15 +129,54 @@ GuardDog Nexus accepts `UPDATED` webhook events from Nexus.
## LLM Analysis ## LLM Analysis
GuardDog Nexus can automatically analyze each finding through an LLM. When enabled (`LLM_ENABLED=1`), every flagged scan gets an AI breakdown: threat assessment, code analysis, and recommendations. GuardDog Nexus can analyze findings through an LLM. When enabled (`LLM_ENABLED=1`), flagged findings receive an AI breakdown: threat assessment, code analysis, and recommendations.
**Auto mode:** after a flagged scan completes, each finding is sent to the LLM. Reports are saved to the database and included in JSON log output. ### Operating Modes
**Manual mode:** the web UI has an "Analyze with LLM" button next to each finding — click to get an inline verdict. The `LLM_AUTO_ANALYZE` variable controls the analysis mode:
Supported providers: any OpenAI-compatible API (OpenAI, Groq, Ollama, vLLM, etc.). - **`LLM_AUTO_ANALYZE=1` (automatic):** each finding is automatically sent to the LLM after a scan completes. Reports are saved to the database and included in JSON log output. No "Analyze" button is shown in the UI.
- **`LLM_AUTO_ANALYZE=0` (manual, default):** an "Analyze with LLM" button is shown next to each finding in the web UI. The user clicks to trigger analysis and see the inline verdict.
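
The two modes above could be resolved from the environment roughly as follows. This is a minimal sketch, not the project's code: the truthy values mirror how `LLM_ENABLED` is parsed in `Config`, while `env_flag`, `analysis_mode`, and the assumption that `LLM_AUTO_ANALYZE` is parsed the same way are hypothetical.

```python
import os
from collections.abc import Mapping

# Values treated as "on"; mirrors the LLM_ENABLED parsing shown in Config.
_TRUTHY = ("1", "true", "yes")


def env_flag(name: str, environ: Mapping[str, str] = os.environ) -> bool:
    """Return True when the variable is set to a truthy value."""
    return environ.get(name, "").lower() in _TRUTHY


def analysis_mode(environ: Mapping[str, str] = os.environ) -> str:
    """Return 'disabled', 'auto', or 'manual' (hypothetical helper)."""
    if not env_flag("LLM_ENABLED", environ):
        return "disabled"
    return "auto" if env_flag("LLM_AUTO_ANALYZE", environ) else "manual"
```

Passing the environment as a parameter keeps the helper easy to test; with `LLM_ENABLED` unset, the value of `LLM_AUTO_ANALYZE` has no effect.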
LLM response format (JSON): ### finding.report State Machine
The `finding.report` field transitions through these states:
| Value | UI |
|-------|----|
| `None` | "Analyze with LLM" button (manual mode only) |
| `{"status": "analyzing"}` | Spinner |
| `{verdict:, summary:, ...}` | Report + "Retry" link |
### Supported Providers
Any OpenAI-compatible API. Configuration examples:
```bash
# OpenAI (manual mode)
LLM_ENABLED=1
LLM_AUTO_ANALYZE=0
LLM_API_KEY=sk-...
LLM_API_BASE=https://api.openai.com/v1
LLM_MODEL=gpt-4o-mini
# Groq with auto-analysis (faster, free tier)
LLM_ENABLED=1
LLM_AUTO_ANALYZE=1
LLM_API_KEY=gsk_...
LLM_API_BASE=https://api.groq.com/openai/v1
LLM_MODEL=llama-3.3-70b-versatile
# Local Ollama
LLM_ENABLED=1
LLM_API_KEY=ollama
LLM_API_BASE=http://host.docker.internal:11434/v1
LLM_MODEL=llama3.2
```
### Response Format
LLM returns JSON with fields:
- `verdict`: `safe` / `suspicious` / `malicious` - `verdict`: `safe` / `suspicious` / `malicious`
- `summary`: one-line verdict - `summary`: one-line verdict
- `analysis`: detailed breakdown (2-3 paragraphs) - `analysis`: detailed breakdown (2-3 paragraphs)
@@ -156,7 +196,7 @@ guarddog-nexus/
│ ├── i18n.py # RU/EN translations │ ├── i18n.py # RU/EN translations
│ ├── logging_setup.py # JSON structured logging │ ├── logging_setup.py # JSON structured logging
│ └── main.py # FastAPI app entry point │ └── main.py # FastAPI app entry point
├── tests/ # pytest tests (50+) ├── tests/ # pytest tests (85 tests)
├── scripts/ # Setup scripts ├── scripts/ # Setup scripts
├── docker-compose.yml ├── docker-compose.yml
├── Dockerfile ├── Dockerfile

View File

@@ -87,6 +87,7 @@ python -m guarddog_nexus.main
| `MAX_CONCURRENT_SCANS` | `4` | Maximum number of concurrent GuardDog scans | | `MAX_CONCURRENT_SCANS` | `4` | Maximum number of concurrent GuardDog scans |
| `LOG_SYSLOG_FACILITY` | `local0` | Syslog facility (local0-local7) | | `LOG_SYSLOG_FACILITY` | `local0` | Syslog facility (local0-local7) |
| `LLM_ENABLED` | `0` | Set to `1` to enable LLM analysis of vulnerabilities | | `LLM_ENABLED` | `0` | Set to `1` to enable LLM analysis of vulnerabilities |
| `LLM_AUTO_ANALYZE` | `0` | `1`: auto-analyze after a scan; `0`: manual mode via a button in the UI |
| `LLM_API_KEY` | _(empty)_ | API key (OpenAI / Groq / Ollama / etc.) | | `LLM_API_KEY` | _(empty)_ | API key (OpenAI / Groq / Ollama / etc.) |
| `LLM_API_BASE` | `https://api.openai.com/v1` | Base URL of the OpenAI-compatible API | | `LLM_API_BASE` | `https://api.openai.com/v1` | Base URL of the OpenAI-compatible API |
| `LLM_MODEL` | `gpt-4o-mini` | Model name | | `LLM_MODEL` | `gpt-4o-mini` | Model name |
@@ -149,13 +150,14 @@ GuardDog Nexus принимает вебхуки от Nexus при событи
| Method | Path | Description | | Method | Path | Description |
|-------|------|----------| |-------|------|----------|
| GET | `/api/v1/findings` | List vulnerabilities (filter by rule, severity, scan_id) | | GET | `/api/v1/findings` | List vulnerabilities (filter by rule, severity, scan_id) |
| POST | `/api/v1/findings/{id}/analyze` | Run LLM analysis of a vulnerability | | POST | `/api/v1/findings/{id}/analyze` | Run LLM analysis of a vulnerability (returns an HTMX fragment when called from the web UI) |
### Health ### Health and Metrics
| Method | Path | Description | | Method | Path | Description |
|-------|------|----------| |-------|------|----------|
| GET | `/health` | Health check | | GET | `/health` | Health check |
| GET | `/metrics` | Metrics in Prometheus format |
## Web Interface ## Web Interface
@@ -227,26 +229,40 @@ guarddog-nexus/
## LLM Analysis ## LLM Analysis
GuardDog Nexus can automatically analyze every detected vulnerability through an LLM (language model). When enabled (`LLM_ENABLED=1`), every flagged scan gets an AI breakdown: how real the threat is, what the suspicious code does, and recommendations. GuardDog Nexus can analyze detected vulnerabilities through an LLM (language model). When enabled (`LLM_ENABLED=1`), flagged findings get an AI breakdown: how real the threat is, what the suspicious code does, and recommendations.
### How It Works ### Operating Modes
1. **Automatic mode:** after a scan with vulnerabilities completes, GuardDog Nexus sends each finding to the LLM, saves the report to the database, and includes it in the syslog event The `LLM_AUTO_ANALYZE` variable controls the analysis mode:
2. **Manual mode:** in the web UI, each vulnerability on the scan page has an "Analyze with LLM" button; clicking it sends a request and shows the verdict inline
- **`LLM_AUTO_ANALYZE=1` (automatic):** after a scan completes, each finding is automatically sent to the LLM. The report is saved to the database and included in the syslog event. The analysis button is not shown in the UI.
- **`LLM_AUTO_ANALYZE=0` (manual, default):** the web UI shows an "Analyze with LLM" button next to each vulnerability. The user clicks the button, the analysis starts, and the result is shown inline.
### finding.report States
The `finding.report` field goes through a state machine:
| Value | UI |
|----------|----|
| `None` | "Analyze with LLM" button (manual mode only) |
| `{"status": "analyzing"}` | Spinner |
| `{verdict:, summary:, ...}` | Report + "Retry" link |
### Supported Providers ### Supported Providers
Any OpenAI-compatible API. Configuration examples: Any OpenAI-compatible API. Configuration examples:
```bash ```bash
# OpenAI # OpenAI (manual mode)
LLM_ENABLED=1 LLM_ENABLED=1
LLM_AUTO_ANALYZE=0
LLM_API_KEY=sk-... LLM_API_KEY=sk-...
LLM_API_BASE=https://api.openai.com/v1 LLM_API_BASE=https://api.openai.com/v1
LLM_MODEL=gpt-4o-mini LLM_MODEL=gpt-4o-mini
# Groq (faster, free tier) # Groq with auto-analysis (faster, free tier)
LLM_ENABLED=1 LLM_ENABLED=1
LLM_AUTO_ANALYZE=1
LLM_API_KEY=gsk_... LLM_API_KEY=gsk_...
LLM_API_BASE=https://api.groq.com/openai/v1 LLM_API_BASE=https://api.groq.com/openai/v1
LLM_MODEL=llama-3.3-70b-versatile LLM_MODEL=llama-3.3-70b-versatile

View File

@@ -30,9 +30,7 @@ class Config:
nexus_url: str = os.getenv("NEXUS_URL", "http://localhost:8081") nexus_url: str = os.getenv("NEXUS_URL", "http://localhost:8081")
nexus_username: str = os.getenv("NEXUS_USERNAME", "admin") nexus_username: str = os.getenv("NEXUS_USERNAME", "admin")
nexus_password: str = os.getenv("NEXUS_PASSWORD", "admin123") nexus_password: str = os.getenv("NEXUS_PASSWORD", "admin123")
nexus_download_timeout: int = _env_int( nexus_download_timeout: int = _env_int("NEXUS_DOWNLOAD_TIMEOUT_SECONDS", HTTP_TIMEOUT_DOWNLOAD)
"NEXUS_DOWNLOAD_TIMEOUT_SECONDS", HTTP_TIMEOUT_DOWNLOAD
)
nexus_api_timeout: int = _env_int("NEXUS_API_TIMEOUT_SECONDS", HTTP_TIMEOUT_API) nexus_api_timeout: int = _env_int("NEXUS_API_TIMEOUT_SECONDS", HTTP_TIMEOUT_API)
# Database # Database
@@ -55,9 +53,7 @@ class Config:
scan_timeout_seconds: int = _env_int("SCAN_TIMEOUT_SECONDS", 300) scan_timeout_seconds: int = _env_int("SCAN_TIMEOUT_SECONDS", 300)
temp_dir: str = os.getenv("TEMP_DIR", "/tmp/guarddog-nexus") temp_dir: str = os.getenv("TEMP_DIR", "/tmp/guarddog-nexus")
guarddog_binary: str = os.getenv("GUARDDOG_BINARY", GUARDDOG_BINARY_FALLBACK) guarddog_binary: str = os.getenv("GUARDDOG_BINARY", GUARDDOG_BINARY_FALLBACK)
max_concurrent_scans: int = _env_int( max_concurrent_scans: int = _env_int("MAX_CONCURRENT_SCANS", DEFAULT_MAX_CONCURRENT_SCANS)
"MAX_CONCURRENT_SCANS", DEFAULT_MAX_CONCURRENT_SCANS
)
# LLM analysis # LLM analysis
llm_enabled: bool = os.getenv("LLM_ENABLED", "").lower() in ("1", "true", "yes") llm_enabled: bool = os.getenv("LLM_ENABLED", "").lower() in ("1", "true", "yes")

View File

@@ -60,6 +60,8 @@ async def harvest(
lock = _url_locks[download_url] lock = _url_locks[download_url]
if lock.locked(): if lock.locked():
log.info("URL already being processed, skipping: %s", download_url) log.info("URL already being processed, skipping: %s", download_url)
async with _url_lock:
_url_locks.pop(download_url, None)
return None return None
async with lock: async with lock:
@@ -191,7 +193,7 @@ async def harvest(
return scan return scan
except Exception as e: except Exception as e:
log.error("Scan failed for %s==%s: %s", package_name, package_version, e) log.exception("Scan failed for %s==%s", package_name, package_version)
scan.status = ScanStatus.FAILED.value scan.status = ScanStatus.FAILED.value
scan.error_message = str(e)[:ERROR_MESSAGE_MAX_LENGTH] scan.error_message = str(e)[:ERROR_MESSAGE_MAX_LENGTH]
scan.finished_at = datetime.datetime.now(datetime.timezone.utc) scan.finished_at = datetime.datetime.now(datetime.timezone.utc)

View File

@@ -23,11 +23,7 @@ def _build_user_message(finding: dict) -> str:
location = finding.get("location", "") location = finding.get("location", "")
code = finding.get("code", "") code = finding.get("code", "")
prompt = ( prompt = f"Rule: {rule}\nSeverity: {severity}\nMessage: {message}\n"
f"Rule: {rule}\n"
f"Severity: {severity}\n"
f"Message: {message}\n"
)
if location: if location:
prompt += f"Location: {location}\n" prompt += f"Location: {location}\n"
if code: if code:
@@ -66,9 +62,7 @@ async def analyze_finding(finding_data: dict) -> dict | None:
try: try:
async with _llm_semaphore: async with _llm_semaphore:
async with httpx.AsyncClient( async with httpx.AsyncClient(timeout=config.llm_timeout, headers=headers) as client:
timeout=config.llm_timeout, headers=headers
) as client:
resp = await client.post(url, json=payload) resp = await client.post(url, json=payload)
resp.raise_for_status() resp.raise_for_status()
body = resp.json() body = resp.json()

View File

@@ -116,9 +116,7 @@ def _write_file(path: str, content: bytes) -> None:
async def nexus_get(path: str) -> httpx.Response: async def nexus_get(path: str) -> httpx.Response:
"""Make an authenticated GET request to Nexus REST API.""" """Make an authenticated GET request to Nexus REST API."""
auth = httpx.BasicAuth(config.nexus_username, config.nexus_password) auth = httpx.BasicAuth(config.nexus_username, config.nexus_password)
async with httpx.AsyncClient( async with httpx.AsyncClient(auth=auth, timeout=config.nexus_api_timeout) as client:
auth=auth, timeout=config.nexus_api_timeout
) as client:
return await client.get(f"{config.nexus_url.rstrip('/')}{path}") return await client.get(f"{config.nexus_url.rstrip('/')}{path}")

View File

@@ -34,6 +34,11 @@ async def scan_package(filepath: str, ecosystem: str = DEFAULT_ECOSYSTEM) -> dic
) )
except asyncio.TimeoutError: except asyncio.TimeoutError:
log.error("GuardDog scan timed out for %s", filepath) log.error("GuardDog scan timed out for %s", filepath)
try:
proc.kill()
await proc.wait()
except (ProcessLookupError, Exception):
pass
return {"findings": [], "errors": [SCAN_ERROR_TIMEOUT]} return {"findings": [], "errors": [SCAN_ERROR_TIMEOUT]}
except FileNotFoundError: except FileNotFoundError:
log.error("GuardDog binary not found at %s", guarddog_bin) log.error("GuardDog binary not found at %s", guarddog_bin)

View File

@@ -1,6 +1,5 @@
"""Async SQLite database setup via SQLAlchemy.""" """Async SQLite database setup via SQLAlchemy."""
from sqlalchemy import inspect, text from sqlalchemy import inspect, text
from sqlalchemy.ext.asyncio import AsyncSession, async_sessionmaker, create_async_engine from sqlalchemy.ext.asyncio import AsyncSession, async_sessionmaker, create_async_engine
from sqlalchemy.orm import DeclarativeBase from sqlalchemy.orm import DeclarativeBase
@@ -69,6 +68,7 @@ async def init_db():
await conn.run_sync(Base.metadata.create_all) await conn.run_sync(Base.metadata.create_all)
await _migrate() await _migrate()
await _ensure_indexes() await _ensure_indexes()
await _reap_stale_analysis()
async def get_session() -> AsyncSession: async def get_session() -> AsyncSession:
@@ -90,3 +90,17 @@ async def _ensure_indexes():
async with _engine.begin() as conn: async with _engine.begin() as conn:
for sql in indexes: for sql in indexes:
await conn.execute(text(sql)) await conn.execute(text(sql))
async def _reap_stale_analysis():
"""Reset stuck 'analyzing' statuses left from crashes."""
sql = (
"UPDATE findings SET report = NULL "
"WHERE report IS NOT NULL "
"AND json_extract(report, '$.status') = 'analyzing'"
)
async with _engine.begin() as conn:
result = await conn.execute(text(sql))
count = result.rowcount
if count:
log.warning("Reset %d stale LLM analysis statuses", count)
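
The reaper's SQL can be tried against an in-memory database with the standard `sqlite3` module. This is a sketch under assumptions: the `findings` table is reduced to two columns for illustration, the helper is synchronous unlike the async original, and the SQLite build is assumed to include the JSON functions (`json_extract`).

```python
import sqlite3


def reap_stale_analysis(conn: sqlite3.Connection) -> int:
    """Reset reports stuck in the 'analyzing' state; return the number of rows reset."""
    cur = conn.execute(
        "UPDATE findings SET report = NULL "
        "WHERE report IS NOT NULL "
        "AND json_extract(report, '$.status') = 'analyzing'"
    )
    conn.commit()
    return cur.rowcount


# Illustrative schema only; the real table has more columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE findings (id INTEGER PRIMARY KEY, report TEXT)")
conn.executemany(
    "INSERT INTO findings (report) VALUES (?)",
    [(None,), ('{"status": "analyzing"}',), ('{"verdict": "safe", "summary": "ok"}',)],
)
reset_count = reap_stale_analysis(conn)
```

Finished reports have no `status` key, so `json_extract` yields NULL for them and they are left untouched.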

View File

@@ -23,6 +23,7 @@ from guarddog_nexus.db.models import Finding, Scan
# Scan list query builder # Scan list query builder
# --------------------------------------------------------------------------- # ---------------------------------------------------------------------------
def build_scan_list_query( def build_scan_list_query(
flagged: bool | None = None, flagged: bool | None = None,
status: str | None = None, status: str | None = None,
@@ -51,9 +52,7 @@ def build_scan_list_query(
count_q = count_q.where(Scan.repository == repository) count_q = count_q.where(Scan.repository == repository)
if search: if search:
pattern = f"%{search}%" pattern = f"%{search}%"
condition = Scan.package_name.ilike(pattern) | Scan.package_version.ilike( condition = Scan.package_name.ilike(pattern) | Scan.package_version.ilike(pattern)
pattern
)
q = q.where(condition) q = q.where(condition)
count_q = count_q.where(condition) count_q = count_q.where(condition)
@@ -70,6 +69,7 @@ def build_scan_list_query(
# Package list query builder # Package list query builder
# --------------------------------------------------------------------------- # ---------------------------------------------------------------------------
def build_package_list_query( def build_package_list_query(
flagged: bool | None = None, flagged: bool | None = None,
ecosystem: str | None = None, ecosystem: str | None = None,
@@ -101,9 +101,7 @@ def build_package_list_query(
subq = subq.where(Scan.repository == repository) subq = subq.where(Scan.repository == repository)
if search: if search:
pattern = f"%{search}%" pattern = f"%{search}%"
subq = subq.where( subq = subq.where(Scan.package_name.ilike(pattern) | Scan.package_version.ilike(pattern))
Scan.package_name.ilike(pattern) | Scan.package_version.ilike(pattern)
)
if flagged is not None: if flagged is not None:
subq = subq.having(func.max(Scan.flagged) == flagged) subq = subq.having(func.max(Scan.flagged) == flagged)
@@ -112,9 +110,7 @@ def build_package_list_query(
sort_field_name = PACKAGE_SORT_FIELDS.get(sort_by, "started_at") sort_field_name = PACKAGE_SORT_FIELDS.get(sort_by, "started_at")
sort_col_from = getattr(Scan, sort_field_name, Scan.started_at) sort_col_from = getattr(Scan, sort_field_name, Scan.started_at)
sort_col = func.max(sort_col_from) sort_col = func.max(sort_col_from)
subq = subq.order_by( subq = subq.order_by(sort_col.desc() if sort_dir == "desc" else sort_col.asc())
sort_col.desc() if sort_dir == "desc" else sort_col.asc()
)
sq = subq.subquery() sq = subq.subquery()
total_q = select(func.count()).select_from(sq) total_q = select(func.count()).select_from(sq)
@@ -126,12 +122,11 @@ def build_package_list_query(
# Dashboard stats (shared between API /stats and web dashboard) # Dashboard stats (shared between API /stats and web dashboard)
# --------------------------------------------------------------------------- # ---------------------------------------------------------------------------
async def get_dashboard_stats(session: AsyncSession) -> dict: async def get_dashboard_stats(session: AsyncSession) -> dict:
"""Return all dashboard statistics as a single dict.""" """Return all dashboard statistics as a single dict."""
total_scans = await session.scalar(select(func.count(Scan.id))) total_scans = await session.scalar(select(func.count(Scan.id)))
flagged_scans = await session.scalar( flagged_scans = await session.scalar(select(func.count(Scan.id)).where(Scan.flagged == True))
select(func.count(Scan.id)).where(Scan.flagged == True)
)
recent_flagged = await session.scalar( recent_flagged = await session.scalar(
select(func.count(Scan.id)).where( select(func.count(Scan.id)).where(
Scan.flagged == True, Scan.flagged == True,
@@ -165,9 +160,7 @@ async def get_dashboard_stats(session: AsyncSession) -> dict:
latest_scans = ( latest_scans = (
( (
await session.execute( await session.execute(
select(Scan) select(Scan).order_by(Scan.started_at.desc()).limit(DASHBOARD_LATEST_SCANS_LIMIT)
.order_by(Scan.started_at.desc())
.limit(DASHBOARD_LATEST_SCANS_LIMIT)
) )
) )
.scalars() .scalars()

View File

@@ -84,6 +84,13 @@ _STRINGS = {
"llm_retry": {"en": "Retry", "ru": "Повторить"}, "llm_retry": {"en": "Retry", "ru": "Повторить"},
"llm_analyzed": {"en": "LLM analyzed", "ru": "LLM проанализ."}, "llm_analyzed": {"en": "LLM analyzed", "ru": "LLM проанализ."},
"llm_pending": {"en": "Pending", "ru": "Ожидают"}, "llm_pending": {"en": "Pending", "ru": "Ожидают"},
"total_scans_label": {"en": "Scans", "ru": "Сканов"},
"flagged_scans_label": {"en": "Flagged", "ru": "Помечено"},
"heading_top_rules": {"en": "Top Finding Rules", "ru": "Топ правил"},
"status_scanning": {"en": "scanning", "ru": "сканирование"},
"status_pending": {"en": "pending", "ru": "ожидание"},
"status_completed": {"en": "completed", "ru": "завершено"},
"status_failed": {"en": "failed", "ru": "ошибка"},
"not_found": {"en": "Not found", "ru": "Не найдено"}, "not_found": {"en": "Not found", "ru": "Не найдено"},
"breadcrumb_home": {"en": "Home", "ru": "Главная"}, "breadcrumb_home": {"en": "Home", "ru": "Главная"},
"breadcrumb_dashboard": {"en": "Dashboard", "ru": "Панель"}, "breadcrumb_dashboard": {"en": "Dashboard", "ru": "Панель"},

View File

@@ -52,5 +52,3 @@ async def list_findings(
for f in findings for f in findings
], ],
} }

View File

@@ -7,6 +7,7 @@ from urllib.parse import unquote
from fastapi import APIRouter, Depends, Query, Response from fastapi import APIRouter, Depends, Query, Response
from sqlalchemy import select from sqlalchemy import select
from sqlalchemy.ext.asyncio import AsyncSession from sqlalchemy.ext.asyncio import AsyncSession
from sqlalchemy.orm import selectinload
from ..constants import ( from ..constants import (
CSV_MEDIA_TYPE, CSV_MEDIA_TYPE,
@@ -17,7 +18,7 @@ from ..constants import (
MAX_PAGE_SIZE, MAX_PAGE_SIZE,
) )
from ..db.engine import get_session from ..db.engine import get_session
from ..db.models import Finding, Scan from ..db.models import Scan
from ..db.queries import build_package_list_query from ..db.queries import build_package_list_query
router = APIRouter(prefix="/api/v1/packages", tags=["packages"]) router = APIRouter(prefix="/api/v1/packages", tags=["packages"])
@@ -88,14 +89,22 @@ async def export_packages_csv(
writer = csv.writer(output) writer = csv.writer(output)
writer.writerow( writer.writerow(
[ [
"name", "version", "ecosystem", "repository", "name",
"last_scanned_at", "flagged", "total_findings", "version",
"ecosystem",
"repository",
"last_scanned_at",
"flagged",
"total_findings",
] ]
) )
for r in rows: for r in rows:
writer.writerow( writer.writerow(
[ [
r.pkg_name, r.pkg_ver, r.ecosystem, r.repository, r.pkg_name,
r.pkg_ver,
r.ecosystem,
r.repository,
r.last_scan.isoformat() if r.last_scan else "", r.last_scan.isoformat() if r.last_scan else "",
bool(r.is_flagged), bool(r.is_flagged),
r.findings_sum, r.findings_sum,
@@ -123,6 +132,7 @@ async def get_package(
await session.execute( await session.execute(
select(Scan) select(Scan)
.where(Scan.package_name == pkg_name, Scan.package_version == pkg_version) .where(Scan.package_name == pkg_name, Scan.package_version == pkg_version)
.options(selectinload(Scan.findings))
.order_by(Scan.started_at.desc()) .order_by(Scan.started_at.desc())
) )
) )
@@ -135,12 +145,7 @@ async def get_package(
all_findings: list[dict] = [] all_findings: list[dict] = []
for s in scans: for s in scans:
findings = ( for f in s.findings:
(await session.execute(select(Finding).where(Finding.scan_id == s.id)))
.scalars()
.all()
)
for f in findings:
all_findings.append({"id": f.id, **f.data, "report": f.report}) all_findings.append({"id": f.id, **f.data, "report": f.report})
return { return {

View File

@@ -93,16 +93,31 @@ async def export_scans_csv(
writer = csv.writer(output) writer = csv.writer(output)
writer.writerow( writer.writerow(
[ [
"id", "package_name", "package_version", "ecosystem", "repository", "id",
"status", "total_findings", "flagged", "started_at", "finished_at", "package_name",
"error_message", "sha256", "package_version",
"ecosystem",
"repository",
"status",
"total_findings",
"flagged",
"started_at",
"finished_at",
"error_message",
"sha256",
] ]
) )
for s in scans: for s in scans:
writer.writerow( writer.writerow(
[ [
s.id, s.package_name, s.package_version, s.ecosystem, s.repository, s.id,
s.status, s.total_findings, s.flagged, s.package_name,
s.package_version,
s.ecosystem,
s.repository,
s.status,
s.total_findings,
s.flagged,
s.started_at.isoformat() if s.started_at else "", s.started_at.isoformat() if s.started_at else "",
s.finished_at.isoformat() if s.finished_at else "", s.finished_at.isoformat() if s.finished_at else "",
s.error_message or "", s.error_message or "",

View File

@@ -15,31 +15,23 @@ router = APIRouter(tags=["metrics"])
@router.get("/metrics") @router.get("/metrics")
async def metrics(session: AsyncSession = Depends(get_session)): async def metrics(session: AsyncSession = Depends(get_session)):
total = await session.scalar(select(func.count(Scan.id))) or 0 total = await session.scalar(select(func.count(Scan.id))) or 0
flagged = await session.scalar( flagged = await session.scalar(select(func.count(Scan.id)).where(Scan.flagged == True)) or 0
select(func.count(Scan.id)).where(Scan.flagged == True)
) or 0
findings_total = await session.scalar(select(func.count(Finding.id))) or 0 findings_total = await session.scalar(select(func.count(Finding.id))) or 0
# By status # By status
status_rows = ( status_rows = (
await session.execute( await session.execute(select(Scan.status, func.count(Scan.id)).group_by(Scan.status))
select(Scan.status, func.count(Scan.id)).group_by(Scan.status)
)
).all() ).all()
by_status = {row[0]: row[1] for row in status_rows} by_status = {row[0]: row[1] for row in status_rows}
# By ecosystem # By ecosystem
eco_rows = ( eco_rows = (
await session.execute( await session.execute(select(Scan.ecosystem, func.count(Scan.id)).group_by(Scan.ecosystem))
select(Scan.ecosystem, func.count(Scan.id)).group_by(Scan.ecosystem)
)
).all() ).all()
by_eco = {row[0]: row[1] for row in eco_rows} by_eco = {row[0]: row[1] for row in eco_rows}
# Latest scan timestamp # Latest scan timestamp
latest = await session.scalar( latest = await session.scalar(select(func.max(Scan.started_at)))
select(func.max(Scan.started_at))
)
lines = [ lines = [
"# HELP guarddog_scans_total Total number of package scans.", "# HELP guarddog_scans_total Total number of package scans.",

View File

@@ -41,7 +41,8 @@ _jinja_env.globals["config"] = config
def _render(name: str, **context) -> HTMLResponse: def _render(name: str, **context) -> HTMLResponse:
template = _jinja_env.get_template(name) template = _jinja_env.get_template(name)
return HTMLResponse(template.render(**context)) status_code = context.pop("_status_code", 200)
return HTMLResponse(template.render(**context), status_code=status_code)
@router.get("/", response_class=HTMLResponse) @router.get("/", response_class=HTMLResponse)
@@ -104,18 +105,14 @@ async def scans_list(
@router.get("/scans/{scan_id}", response_class=HTMLResponse) @router.get("/scans/{scan_id}", response_class=HTMLResponse)
async def scan_detail( async def scan_detail(scan_id: int, request: Request, session: AsyncSession = Depends(get_session)):
scan_id: int, request: Request, session: AsyncSession = Depends(get_session)
):
from sqlalchemy.orm import selectinload from sqlalchemy.orm import selectinload
scan = await session.scalar( scan = await session.scalar(
select(Scan) select(Scan).where(Scan.id == scan_id).options(selectinload(Scan.findings))
.where(Scan.id == scan_id)
.options(selectinload(Scan.findings))
) )
if not scan: if not scan:
return HTMLResponse(f"<h1>{_t('not_found', request.state.lang)}</h1>", status_code=404) return _render("404.html", request=request, _status_code=404)
return _render("scan_detail.html", scan=scan, request=request) return _render("scan_detail.html", scan=scan, request=request)
@@ -192,7 +189,7 @@ async def package_detail(
) )
if not scans: if not scans:
return HTMLResponse(f"<h1>{_t('not_found', request.state.lang)}</h1>", status_code=404) return _render("404.html", request=request, _status_code=404)
all_findings = [] all_findings = []
for s in scans: for s in scans:
@@ -223,9 +220,7 @@ async def analyze_finding_htmx(
if not config.llm_enabled: if not config.llm_enabled:
msg = _t("llm_disabled", lang) msg = _t("llm_disabled", lang)
return HTMLResponse( return HTMLResponse(f'<div class="llm-actions"><small class="flagged">{msg}</small></div>')
f'<div class="llm-actions"><small class="flagged">{msg}</small></div>'
)
finding = await session.scalar(select(Finding).where(Finding.id == finding_id)) finding = await session.scalar(select(Finding).where(Finding.id == finding_id))
if not finding: if not finding:
@@ -252,6 +247,8 @@ async def analyze_finding_htmx(
lock = _llm_locks[finding_id] lock = _llm_locks[finding_id]
if lock.locked(): if lock.locked():
async with _llm_lock:
_llm_locks.pop(finding_id, None)
return _render("_llm_spinner.html", request=request) return _render("_llm_spinner.html", request=request)
async with lock: async with lock:
@@ -267,9 +264,7 @@ async def analyze_finding_htmx(
finding.report = None finding.report = None
await session.commit() await session.commit()
msg = _t("llm_failed", lang) msg = _t("llm_failed", lang)
return HTMLResponse( return HTMLResponse(f'<div class="llm-actions"><small class="flagged">{msg}</small></div>')
f'<div class="llm-actions"><small class="flagged">{msg}</small></div>'
)
finding.report = report finding.report = report
await session.commit() await session.commit()

View File

@@ -4,6 +4,7 @@ import hashlib
import hmac import hmac
import json import json
import re import re
from urllib.parse import urlencode
from fastapi import APIRouter, BackgroundTasks, Header, HTTPException, Request, status from fastapi import APIRouter, BackgroundTasks, Header, HTTPException, Request, status
@@ -58,7 +59,7 @@ def _detect_ecosystem(source: dict) -> str:
return "go" return "go"
if fmt in ("npm", "node"): if fmt in ("npm", "node"):
return "npm" return "npm"
return fmt or DEFAULT_ECOSYSTEM return DEFAULT_ECOSYSTEM
@router.post("/nexus") @router.post("/nexus")
@@ -75,22 +76,16 @@ async def nexus_webhook(
raise HTTPException( raise HTTPException(
status_code=status.HTTP_401_UNAUTHORIZED, detail="Missing signature" status_code=status.HTTP_401_UNAUTHORIZED, detail="Missing signature"
) )
expected = hmac.new( expected = hmac.new(config.webhook_secret.encode(), payload, hashlib.sha256).hexdigest()
config.webhook_secret.encode(), payload, hashlib.sha256
).hexdigest()
if not hmac.compare_digest(x_nexus_webhook_signature, expected): if not hmac.compare_digest(x_nexus_webhook_signature, expected):
log.warning("Webhook rejected: invalid signature") log.warning("Webhook rejected: invalid signature")
raise HTTPException( raise HTTPException(status_code=status.HTTP_403_FORBIDDEN, detail="Invalid signature")
status_code=status.HTTP_403_FORBIDDEN, detail="Invalid signature"
)
try: try:
data = json.loads(payload.decode("utf-8")) data = json.loads(payload.decode("utf-8"))
except (json.JSONDecodeError, UnicodeDecodeError): except (json.JSONDecodeError, UnicodeDecodeError):
log.warning("Webhook received invalid body") log.warning("Webhook received invalid body")
raise HTTPException( raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail="Invalid request body")
status_code=status.HTTP_400_BAD_REQUEST, detail="Invalid request body"
)
action = data.get("action", "").upper() action = data.get("action", "").upper()
if action not in RELEVANT_WEBHOOK_ACTIONS: if action not in RELEVANT_WEBHOOK_ACTIONS:
@@ -108,8 +103,7 @@ async def nexus_webhook(
initiator = raw_initiator initiator = raw_initiator
source_ip = request.client.host if request.client else None source_ip = request.client.host if request.client else None
log.info("Webhook: action=%s initiator=%s source_ip=%s", log.info("Webhook: action=%s initiator=%s source_ip=%s", action, initiator, source_ip)
action, initiator, source_ip)
repository = data.get("repositoryName", "") repository = data.get("repositoryName", "")
if not repository: if not repository:
@@ -125,16 +119,19 @@ async def nexus_webhook(
     if not asset_path or not _is_package_asset(asset_path):
         return {"status": WEBHOOK_STATUS_IGNORED, "reason": WEBHOOK_IGNORE_NON_PACKAGE}
-    download_url = asset.get("downloadUrl") or _build_download_url(
-        repository, asset_path
-    )
+    download_url = asset.get("downloadUrl") or _build_download_url(repository, asset_path)
     ecosystem = _detect_ecosystem(asset)
     log.info("Webhook: %s asset %s (%s) in %s", action, asset_path, ecosystem, repository)
     background_tasks.add_task(
-        _scan_in_background, download_url, repository, ecosystem, asset_path,
-        initiator=initiator, source_ip=source_ip,
+        _scan_in_background,
+        download_url,
+        repository,
+        ecosystem,
+        asset_path,
+        initiator=initiator,
+        source_ip=source_ip,
     )
     return {"status": WEBHOOK_STATUS_ACCEPTED, "asset": asset_path, "action": action}
@@ -164,10 +161,15 @@ async def nexus_webhook(
 async def _scan_component(repository: str, name: str, version: str, ecosystem: str):
     from ..core.nexus import nexus_get
-    api_path = (
-        f"/service/rest/v1/search"
-        f"?repository={repository}&name={name}&version={version}&format={ecosystem}"
+    params = urlencode(
+        {
+            "repository": repository,
+            "name": name,
+            "version": version,
+            "format": ecosystem,
+        }
     )
+    api_path = f"/service/rest/v1/search?{params}"
     try:
         resp = await nexus_get(api_path)
         resp.raise_for_status()
@@ -186,14 +188,10 @@ async def _scan_component(repository: str, name: str, version: str, ecosystem: s
             asset_path = _extract_asset_path(asset)
             if not asset_path or not _is_package_asset(asset_path):
                 continue
-            download_url = asset.get("downloadUrl") or _build_download_url(
-                repository, asset_path
-            )
+            download_url = asset.get("downloadUrl") or _build_download_url(repository, asset_path)
             log.info("Scanning component asset: %s", asset_path)
             async for session in get_session():
-                await harvest(
-                    download_url, repository, ecosystem, asset_path, session
-                )
+                await harvest(download_url, repository, ecosystem, asset_path, session)
                 break
@@ -208,8 +206,13 @@ async def _scan_in_background(
     try:
         async for session in get_session():
             await harvest(
-                download_url, repository, format_, asset_path, session,
-                initiator=initiator, source_ip=source_ip,
+                download_url,
+                repository,
+                format_,
+                asset_path,
+                session,
+                initiator=initiator,
+                source_ip=source_ip,
             )
             break
     except Exception as e:
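
Note on the `_scan_component` change: a minimal sketch of what `urlencode` buys over f-string interpolation when a package name or version contains reserved characters. The helper name `build_search_path` and the sample values are assumptions for illustration, not code from this commit.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_search_path(repository: str, name: str, version: str, ecosystem: str) -> str:
    """Build the search path with every parameter percent-encoded (hypothetical helper)."""
    params = urlencode(
        {
            "repository": repository,
            "name": name,
            "version": version,
            "format": ecosystem,
        }
    )
    return f"/service/rest/v1/search?{params}"


# Interpolated raw, '&format=npm' inside the name would become a second
# 'format' parameter; encoded, it stays part of the name value.
path = build_search_path("pypi-proxy", "evil&format=npm", "1.0+local", "pypi")
query = parse_qs(urlsplit(path).query)
```

Parsing the result back with `parse_qs` shows each value surviving the round trip unchanged, including the `+` in the version.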


@@ -35,7 +35,7 @@
 /* ------------------------------------------------------------------ */
 /* Tables */
 /* ------------------------------------------------------------------ */
-table { font-size: 0.9rem; }
+table { font-size: 0.9rem; display: block; overflow-x: auto; }
 table.compact { font-size: 0.82rem; }
 table.compact th,
 table.compact td { padding: 0.35rem 0.5rem; }


@@ -0,0 +1,6 @@
+{% extends "base.html" %}
+{% block title %}{{ t('not_found', request.state.lang) }}{% endblock %}
+{% block content %}
+<h1>{{ t('not_found', request.state.lang) }}</h1>
+<p><a href="/">{{ t('nav_dashboard', request.state.lang) }}</a></p>
+{% endblock %}


@@ -2,9 +2,15 @@
 {% if total_pages > 1 %}
 <nav>
 <ul>
-<li>{% if page > 1 %}<a href="?page={{ page - 1 }}{% if flagged_filter %}&flagged={{ flagged_filter }}{% endif %}{% if search %}&search={{ search }}{% endif %}{% if status_filter %}&status={{ status_filter }}{% endif %}&sort_by={{ sort_by }}&sort_dir={{ sort_dir }}">{{ t('btn_prev', request.state.lang) }}</a>{% else %}<span>{{ t('btn_prev', request.state.lang) }}</span>{% endif %}</li>
+<li>{% if page > 1 %}<a
+hx-get="{{ url_prefix or '' }}?page={{ page - 1 }}{% if flagged_filter %}&flagged={{ flagged_filter }}{% endif %}{% if search %}&search={{ search }}{% endif %}{% if status_filter %}&status={{ status_filter }}{% endif %}&sort_by={{ sort_by }}&sort_dir={{ sort_dir }}"
+hx-target="{{ hx_target or '#scans-table-container' }}"
+hx-swap="innerHTML">{{ t('btn_prev', request.state.lang) }}</a>{% else %}<span>{{ t('btn_prev', request.state.lang) }}</span>{% endif %}</li>
 <li><small>{{ t('page_label', request.state.lang) }} {{ page }} {{ t('page_of', request.state.lang) }} {{ total_pages }}</small></li>
-<li>{% if page < total_pages %}<a href="?page={{ page + 1 }}{% if flagged_filter %}&flagged={{ flagged_filter }}{% endif %}{% if search %}&search={{ search }}{% endif %}{% if status_filter %}&status={{ status_filter }}{% endif %}&sort_by={{ sort_by }}&sort_dir={{ sort_dir }}">{{ t('btn_next', request.state.lang) }}</a>{% else %}<span>{{ t('btn_next', request.state.lang) }}</span>{% endif %}</li>
+<li>{% if page < total_pages %}<a
+hx-get="{{ url_prefix or '' }}?page={{ page + 1 }}{% if flagged_filter %}&flagged={{ flagged_filter }}{% endif %}{% if search %}&search={{ search }}{% endif %}{% if status_filter %}&status={{ status_filter }}{% endif %}&sort_by={{ sort_by }}&sort_dir={{ sort_dir }}"
+hx-target="{{ hx_target or '#scans-table-container' }}"
+hx-swap="innerHTML">{{ t('btn_next', request.state.lang) }}</a>{% else %}<span>{{ t('btn_next', request.state.lang) }}</span>{% endif %}</li>
 </ul>
 </nav>
 {% endif %}
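
Note on the pagination links: the template assembles the query string by hand. As a rough sketch (not part of this commit), the same parameters could be built server-side with `urlencode`, which also percent-encodes a `search` value containing spaces or `&`. The helper name `page_query` and its signature are assumptions; the parameter names and their order mirror the template.

```python
from urllib.parse import urlencode


def page_query(
    page: int,
    *,
    sort_by: str,
    sort_dir: str,
    flagged: str = "",
    search: str = "",
    status: str = "",
) -> str:
    """Return a '?...' query string; empty filters are omitted, as in the template."""
    params = {"page": str(page)}
    if flagged:
        params["flagged"] = flagged
    if search:
        params["search"] = search
    if status:
        params["status"] = status
    # Sorting parameters are always present in the template's links.
    params["sort_by"] = sort_by
    params["sort_dir"] = sort_dir
    return "?" + urlencode(params)


prev_link = page_query(1, sort_by="id", sort_dir="desc", search="left-pad & co")
```

The returned string could then be passed to the template as a single value for `hx-get`.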


@@ -24,7 +24,7 @@
 {% for s in scans %}
 <tr>
 <td><a href="/scans/{{ s.id }}">#{{ s.id }}</a></td>
-<td>{{ s.package_name }}</td>
+<td><a href="/packages/{{ s.package_name | urlencode }}/{{ s.package_version | urlencode }}">{{ s.package_name }}</a></td>
 <td>{{ s.package_version }}</td>
 <td>{{ s.repository }}</td>
 <td>


@@ -1 +1,2 @@
-{% if status == 'scanning' %}<span class="status-scanning"><span class="spinner"></span>scanning</span>{% else %}<span class="status-{{ status }}">{{ status }}</span>{% endif %}
+{% set label = t('status_' + status, request.state.lang) %}
+{% if status == 'scanning' %}<span class="status-scanning"><span class="spinner"></span>{{ label }}</span>{% else %}<span class="status-{{ status }}">{{ label }}</span>{% endif %}


@@ -1,10 +1,24 @@
 {% if total_findings %}
-<div style="display:flex; gap:1.5rem; padding:0.3rem 0; margin-bottom:1rem; border-bottom:1px solid var(--pico-color-gray-500); font-size:0.82rem; opacity:0.8;">
+<div style="display:flex; gap:1.5rem; padding:0.3rem 0; margin-bottom:1rem; border-bottom:1px solid var(--pico-color-gray-500); font-size:0.82rem; opacity:0.8; flex-wrap:wrap;">
+<span>{{ t('total_scans_label', request.state.lang) }}: <strong>{{ total_scans }}</strong></span>
+<span>{{ t('flagged_scans_label', request.state.lang) }}: <strong>{{ flagged_scans }}</strong></span>
 <span>{{ t('col_findings', request.state.lang) }}: <strong>{{ total_findings }}</strong></span>
 <span>{{ t('llm_analyzed', request.state.lang) }}: <strong>{{ llm_analyzed }}</strong></span>
 <span>{{ t('llm_pending', request.state.lang) }}: <strong>{{ llm_pending }}</strong></span>
 </div>
 {% endif %}
+{% if top_rules %}
+<article class="dash-block" style="margin-bottom:1rem;">
+<h3>{{ t('heading_top_rules', request.state.lang) }}</h3>
+<table class="compact">
+<tbody>
+{% for r in top_rules %}
+<tr><td><strong>{{ r.rule }}</strong></td><td>{{ r.count }}</td></tr>
+{% endfor %}
+</tbody>
+</table>
+</article>
+{% endif %}
 {% if latest_flagged %}
 <article class="dash-block dash-block-warn">
 <h3>{{ t('heading_latest_flagged', request.state.lang) }}</h3>


@@ -11,7 +11,8 @@
 <h1>{{ t('heading_packages', request.state.lang) }}</h1>
 <div class="filter-bar">
-<input type="text" name="search" placeholder="{{ t('filter_search', request.state.lang) }}" value="{{ search }}" hx-get="/packages" hx-trigger="input changed, keyup[entered] delay:300ms" hx-target="#packages-table-container" hx-swap="innerHTML">
+<input type="hidden" name="flagged" value="{{ flagged_filter }}">
+<input type="text" name="search" placeholder="{{ t('filter_search', request.state.lang) }}" value="{{ search }}" hx-get="/packages" hx-trigger="input changed, keyup[entered] delay:300ms" hx-target="#packages-table-container" hx-swap="innerHTML" hx-include="[name=flagged]">
 <a href="?flagged={% if flagged_filter == '1' %}0{% else %}1{% endif %}" role="button" class="outline">
 {% if flagged_filter == '1' %}{{ t('btn_show_all', request.state.lang) }}{% else %}{{ t('btn_flagged_only', request.state.lang) }}{% endif %}
 </a>


@@ -11,8 +11,9 @@
 <h1>{{ t('heading_scans', request.state.lang) }}</h1>
 <div class="filter-bar">
-<input type="text" name="search" placeholder="{{ t('filter_search', request.state.lang) }}" value="{{ search }}" hx-get="/scans" hx-trigger="input changed, keyup[entered] delay:300ms" hx-target="#scans-table-container" hx-swap="innerHTML" hx-include="#status-filter">
-<select name="status" id="status-filter" hx-get="/scans" hx-trigger="change" hx-target="#scans-table-container" hx-swap="innerHTML" hx-include="[name=search]">
+<input type="hidden" name="flagged" value="{{ flagged_filter }}">
+<input type="text" name="search" placeholder="{{ t('filter_search', request.state.lang) }}" value="{{ search }}" hx-get="/scans" hx-trigger="input changed, keyup[entered] delay:300ms" hx-target="#scans-table-container" hx-swap="innerHTML" hx-include="#status-filter,[name=flagged]">
+<select name="status" id="status-filter" hx-get="/scans" hx-trigger="change" hx-target="#scans-table-container" hx-swap="innerHTML" hx-include="[name=search],[name=flagged]">
 <option value="">{{ t('filter_all_statuses', request.state.lang) }}</option>
 <option value="pending" {% if status_filter == 'pending' %}selected{% endif %}>{{ t('filter_pending', request.state.lang) }}</option>
 <option value="scanning" {% if status_filter == 'scanning' %}selected{% endif %}>{{ t('filter_scanning', request.state.lang) }}</option>


@@ -12,11 +12,11 @@ license = {text = "MIT"}
 dependencies = [
     "fastapi>=0.115.0",
     "uvicorn[standard]>=0.30.0",
-    "jinja2>=3.1.0",
-    "httpx>=0.27.0",
+    "jinja2>=3.1.4",
+    "httpx>=0.28.0",
     "sqlalchemy[asyncio]>=2.0.30",
     "aiosqlite>=0.20.0",
-    "python-multipart>=0.0.9",
+    "python-multipart>=0.0.18",
 ]
 [project.optional-dependencies]


@@ -12,6 +12,7 @@ async def test_health(client):
 # --- Scans ---
 @pytest.mark.asyncio
 async def test_list_scans_empty(client):
     resp = await client.get("/api/v1/scans")
@@ -77,6 +78,7 @@ async def test_scans_csv_export_with_filter(client, sample_flagged_scan):
 # --- Packages ---
 @pytest.mark.asyncio
 async def test_list_packages_empty(client):
     resp = await client.get("/api/v1/packages")
@@ -133,6 +135,7 @@ async def test_package_not_found(client):
 # --- Findings ---
 @pytest.mark.asyncio
 async def test_list_findings_empty(client):
     resp = await client.get("/api/v1/findings")
@@ -163,6 +166,7 @@ async def test_list_findings_with_filters(client, sample_flagged_scan):
 # --- Web UI ---
 @pytest.mark.asyncio
 async def test_web_ui_dashboard(client):
     resp = await client.get("/")

tests/test_llm_analysis.py (new file, 232 lines)

@@ -0,0 +1,232 @@
+"""Tests for LLM analysis — endpoint and client."""
+
+from unittest.mock import MagicMock, patch
+
+import pytest
+
+from guarddog_nexus.db.models import Finding
+
+
+@pytest.fixture
+async def sample_finding(db_session):
+    from guarddog_nexus.constants import SEVERITY_WARNING
+
+    finding = Finding(
+        scan_id=1,
+        data={
+            "rule": "shady-links",
+            "severity": SEVERITY_WARNING,
+            "message": "Suspicious URL",
+            "location": "setup.py:15",
+            "code": "url = 'http://evil.com'",
+        },
+    )
+    db_session.add(finding)
+    await db_session.commit()
+    await db_session.refresh(finding)
+    return finding
+
+
+@pytest.fixture
+async def sample_finding_with_report(db_session):
+    from guarddog_nexus.constants import SEVERITY_WARNING
+
+    report = {
+        "verdict": "safe",
+        "summary": "ok",
+        "analysis": "all good",
+        "severity_rating": "low",
+    }
+    finding = Finding(
+        scan_id=1,
+        data={"rule": "test", "severity": SEVERITY_WARNING, "message": "test"},
+        report=report,
+    )
+    db_session.add(finding)
+    await db_session.commit()
+    await db_session.refresh(finding)
+    return finding
+
+
+# --- T4: analyze_finding() client ---
+
+
+@pytest.mark.asyncio
+async def test_analyze_finding_no_api_key():
+    from guarddog_nexus.core.llm import analyze_finding
+
+    result = await analyze_finding({"rule": "test", "severity": "WARNING"})
+    assert result is None
+
+
+@pytest.mark.asyncio
+async def test_analyze_finding_timeout():
+    import guarddog_nexus.config
+    from guarddog_nexus.core.llm import analyze_finding
+
+    guarddog_nexus.config.config.llm_api_key = "sk-test"
+    guarddog_nexus.config.config.llm_timeout = 1
+    import httpx
+
+    with patch("httpx.AsyncClient.post", side_effect=httpx.TimeoutException("timeout")):
+        result = await analyze_finding({"rule": "test", "severity": "WARNING"})
+    assert result is None
+    guarddog_nexus.config.config.llm_api_key = ""
+
+
+@pytest.mark.asyncio
+async def test_analyze_finding_api_error():
+    import guarddog_nexus.config
+    from guarddog_nexus.core.llm import analyze_finding
+
+    guarddog_nexus.config.config.llm_api_key = "sk-test"
+    guarddog_nexus.config.config.llm_timeout = 30
+    with patch("httpx.AsyncClient.post", side_effect=Exception("connection refused")):
+        result = await analyze_finding({"rule": "test", "severity": "WARNING"})
+    assert result is None
+    guarddog_nexus.config.config.llm_api_key = ""
+
+
+@pytest.mark.asyncio
+async def test_analyze_finding_success():
+    import guarddog_nexus.config
+    from guarddog_nexus.core.llm import analyze_finding
+
+    guarddog_nexus.config.config.llm_api_key = "sk-test"
+    guarddog_nexus.config.config.llm_timeout = 30
+    mock_resp = MagicMock()
+    mock_resp.raise_for_status.return_value = None
+    mock_resp.json.return_value = {
+        "choices": [
+            {
+                "message": {
+                    "content": '{"verdict":"safe","summary":"ok",'
+                    '"analysis":"fine","severity_rating":"low"}',
+                }
+            }
+        ]
+    }
+    with patch("guarddog_nexus.core.llm.httpx.AsyncClient.post", return_value=mock_resp):
+        result = await analyze_finding({"rule": "test"})
+    assert result is not None
+    assert result["verdict"] == "safe"
+    assert result["severity_rating"] == "low"
+    guarddog_nexus.config.config.llm_api_key = ""
+
+
+@pytest.mark.asyncio
+async def test_analyze_finding_markdown_unwrap():
+    import guarddog_nexus.config
+    from guarddog_nexus.core.llm import analyze_finding
+
+    guarddog_nexus.config.config.llm_api_key = "sk-test"
+    mock_resp = MagicMock()
+    mock_resp.raise_for_status.return_value = None
+    mock_resp.json.return_value = {
+        "choices": [
+            {
+                "message": {
+                    "content": '```json\n{"verdict":"suspicious","summary":"hm",'
+                    '"analysis":"...","severity_rating":"medium"}\n```',
+                }
+            }
+        ]
+    }
+    with patch("guarddog_nexus.core.llm.httpx.AsyncClient.post", return_value=mock_resp):
+        result = await analyze_finding({"rule": "test"})
+    assert result is not None
+    assert result["verdict"] == "suspicious"
+    guarddog_nexus.config.config.llm_api_key = ""
+
+
+# --- T1: analyze_finding_htmx endpoint ---
+
+
+@pytest.mark.asyncio
+async def test_analyze_endpoint_llm_disabled(client, sample_finding):
+    import guarddog_nexus.config
+
+    guarddog_nexus.config.config.llm_enabled = False
+    resp = await client.post(f"/api/v1/findings/{sample_finding.id}/analyze")
+    assert resp.status_code == 200
+    assert "disabled" in resp.text.lower()
+    guarddog_nexus.config.config.llm_enabled = False
+
+
+@pytest.mark.asyncio
+async def test_analyze_endpoint_not_found(client):
+    import guarddog_nexus.config
+
+    guarddog_nexus.config.config.llm_enabled = True
+    resp = await client.post("/api/v1/findings/99999/analyze")
+    assert resp.status_code == 404
+    assert "not found" in resp.text.lower()
+    guarddog_nexus.config.config.llm_enabled = False
+
+
+@pytest.mark.asyncio
+async def test_analyze_endpoint_idempotent_already_analyzed(client, sample_finding_with_report):
+    import guarddog_nexus.config
+
+    guarddog_nexus.config.config.llm_enabled = True
+    resp = await client.post(f"/api/v1/findings/{sample_finding_with_report.id}/analyze")
+    assert resp.status_code == 200
+    assert "safe" in resp.text
+    guarddog_nexus.config.config.llm_enabled = False
+
+
+@pytest.mark.asyncio
+async def test_analyze_endpoint_success(client, sample_finding):
+    import guarddog_nexus.config
+
+    guarddog_nexus.config.config.llm_enabled = True
+    fake_report = {
+        "verdict": "malicious",
+        "summary": "bad",
+        "analysis": "evil",
+        "severity_rating": "critical",
+    }
+
+    async def mock_analyze(data):
+        return fake_report
+
+    with patch("guarddog_nexus.core.llm.analyze_finding", mock_analyze):
+        resp = await client.post(f"/api/v1/findings/{sample_finding.id}/analyze")
+    assert resp.status_code == 200
+    assert "malicious" in resp.text
+    guarddog_nexus.config.config.llm_enabled = False
+
+
+@pytest.mark.asyncio
+async def test_analyze_endpoint_failure(client, sample_finding):
+    import guarddog_nexus.config
+
+    guarddog_nexus.config.config.llm_enabled = True
+
+    async def mock_analyze(data):
+        return None
+
+    with patch("guarddog_nexus.core.llm.analyze_finding", mock_analyze):
+        resp = await client.post(f"/api/v1/findings/{sample_finding.id}/analyze")
+    assert resp.status_code == 200
+    assert "failed" in resp.text.lower()
+    guarddog_nexus.config.config.llm_enabled = False
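
Note on `test_analyze_finding_markdown_unwrap`: the test expects a JSON reply wrapped in a Markdown code fence to parse the same as a bare one. The diff does not show how `analyze_finding` does this, so the following is only a sketch of one way to strip the fence before `json.loads`. The function name `unwrap_json` is hypothetical.

```python
import json

FENCE = "`" * 3  # three backticks, built indirectly so the example nests cleanly


def unwrap_json(content: str) -> dict:
    """Parse a JSON reply, tolerating an optional surrounding code fence (sketch)."""
    text = content.strip()
    if text.startswith(FENCE):
        # Drop the opening fence line, including an optional language tag.
        first_newline = text.find("\n")
        text = text[first_newline + 1 :] if first_newline != -1 else ""
        # Drop the closing fence if present.
        stripped = text.rstrip()
        if stripped.endswith(FENCE):
            text = stripped[: -len(FENCE)]
    return json.loads(text)


wrapped = f'{FENCE}json\n{{"verdict": "suspicious"}}\n{FENCE}'
plain = '{"verdict": "safe"}'
```

A reply that is not valid JSON after unwrapping still raises `json.JSONDecodeError`, which a caller would need to handle.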


@@ -1,6 +1,5 @@
 """Tests for Nexus package info extractors."""
 from guarddog_nexus.core.nexus import (
     extract_go_info,
     extract_npm_info,
@@ -37,9 +36,10 @@ class TestGoExtractor:
         )
     def test_long_module(self):
-        assert extract_go_info(
-            "/packages/github.com/gin-gonic/gin/@v/v1.9.0.zip"
-        ) == ("github.com/gin-gonic/gin", "v1.9.0")
+        assert extract_go_info("/packages/github.com/gin-gonic/gin/@v/v1.9.0.zip") == (
+            "github.com/gin-gonic/gin",
+            "v1.9.0",
+        )
     def test_no_at_v(self):
         assert extract_go_info("packages/some/pkg/v1.0.0.zip") is None
@@ -68,21 +68,22 @@ class TestNpmExtractor:
 class TestDispatchExtractor:
     def test_pypi(self):
-        assert extract_package_info(
-            "/packages/requests/2.31.0/requests-2.31.0.tar.gz", "pypi"
-        ) == ("requests", "2.31.0")
+        assert extract_package_info("/packages/requests/2.31.0/requests-2.31.0.tar.gz", "pypi") == (
+            "requests",
+            "2.31.0",
+        )
     def test_go(self):
-        assert extract_package_info(
-            "github.com/gorilla/mux/@v/v1.8.0.zip", "go"
-        ) == ("github.com/gorilla/mux", "v1.8.0")
+        assert extract_package_info("github.com/gorilla/mux/@v/v1.8.0.zip", "go") == (
+            "github.com/gorilla/mux",
+            "v1.8.0",
+        )
     def test_npm(self):
-        assert extract_package_info(
-            "packages/lodash/-/lodash-4.17.21.tgz", "npm"
-        ) == ("lodash", "4.17.21")
+        assert extract_package_info("packages/lodash/-/lodash-4.17.21.tgz", "npm") == (
+            "lodash",
+            "4.17.21",
+        )
     def test_unknown_ecosystem(self):
-        assert extract_package_info(
-            "/packages/pkg/1.0/file.tar.gz", "unknown"
-        ) == ("pkg", "1.0")
+        assert extract_package_info("/packages/pkg/1.0/file.tar.gz", "unknown") == ("pkg", "1.0")


@@ -1,6 +1,11 @@
 """Tests for GuardDog scanner integration."""
-from guarddog_nexus.core.scanner import _normalize_output
+import asyncio
+from unittest.mock import MagicMock, patch
+import pytest
+from guarddog_nexus.core.scanner import _normalize_output, scan_package
 def test_normalize_clean_output(guarddog_output_clean):
@@ -49,3 +54,47 @@ def test_normalize_semgrep_list():
     assert len(result["findings"]) == 2
     assert result["findings"][0]["location"] == "setup.py:10"
     assert result["findings"][0]["severity"] == "ERROR"
+
+
+# --- scan_package() error paths ---
+
+
+@pytest.mark.asyncio
+async def test_scan_package_timeout():
+    with patch("asyncio.wait_for", side_effect=asyncio.TimeoutError):
+        result = await scan_package("/tmp/test.tar.gz", "pypi")
+    assert result["findings"] == []
+    assert "timeout" in result["errors"][0]
+
+
+@pytest.mark.asyncio
+async def test_scan_package_binary_not_found():
+    with patch("asyncio.create_subprocess_exec", side_effect=FileNotFoundError):
+        result = await scan_package("/tmp/test.tar.gz", "pypi")
+    assert result["findings"] == []
+    assert "not_found" in result["errors"][0]
+
+
+@pytest.mark.asyncio
+async def test_scan_package_invalid_json():
+    mock_proc = MagicMock()
+    mock_proc.returncode = 0
+    mock_proc.communicate.return_value = (b"not valid json", b"")
+    with patch("asyncio.create_subprocess_exec", return_value=mock_proc):
+        with patch("asyncio.wait_for", return_value=(b"not valid json", b"")):
+            result = await scan_package("/tmp/test.tar.gz", "pypi")
+    assert result["findings"] == []
+    assert "json" in result["errors"][0]
+
+
+@pytest.mark.asyncio
+async def test_scan_package_non_zero_exit():
+    mock_proc = MagicMock()
+    mock_proc.returncode = 2
+    with patch("asyncio.create_subprocess_exec", return_value=mock_proc):
+        with patch("asyncio.wait_for", return_value=(b"{}", b"guarddog: corrupted")):
+            result = await scan_package("/tmp/test.tar.gz", "pypi")
+    assert result["findings"] == []
+    assert "guarddog" in result["errors"][0]
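
Note on the timeout path: the commit message lists `proc.kill()` on timeout for the scanner (P1), but the scanner change itself is not shown in this part of the diff. A minimal sketch of that pattern follows, assuming a plain `asyncio` subprocess. The function name `run_with_timeout`, the result shape, and the sleeping child process are illustrative assumptions, not the project's code.

```python
import asyncio
import sys


async def run_with_timeout(cmd: list[str], timeout: float) -> dict:
    """Run a command; on timeout, kill the child and reap it (sketch)."""
    proc = await asyncio.create_subprocess_exec(
        *cmd,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    try:
        stdout, _stderr = await asyncio.wait_for(proc.communicate(), timeout=timeout)
    except asyncio.TimeoutError:
        # Without kill(), the child would keep running after the caller gives up.
        proc.kill()
        await proc.wait()  # reap the process so it does not linger as a zombie
        return {"findings": [], "errors": ["timeout"], "returncode": proc.returncode}
    return {"findings": [], "errors": [], "returncode": proc.returncode, "stdout": stdout}


# A child that sleeps far longer than the timeout stands in for a hung scan.
result = asyncio.run(
    run_with_timeout([sys.executable, "-c", "import time; time.sleep(30)"], 0.5)
)
```

Awaiting `proc.wait()` after `kill()` matters: it confirms the process has exited before the function returns.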


@@ -86,6 +86,7 @@ async def test_webhook_component_no_version(client, sample_nexus_component_webho
 # --- Ecosystem detection tests ---
 def test_detect_ecosystem_pypi():
     from guarddog_nexus.routes.webhooks import _detect_ecosystem
@@ -111,12 +112,13 @@ def test_detect_ecosystem_npm():
 def test_detect_ecosystem_unknown():
     from guarddog_nexus.routes.webhooks import _detect_ecosystem
-    assert _detect_ecosystem({"format": "maven"}) == "maven"
+    assert _detect_ecosystem({"format": "maven"}) == "pypi"  # unknown → default
     assert _detect_ecosystem({}) == "pypi"  # default
 # --- Go/npm webhook integration ---
 @pytest.mark.asyncio
 async def test_webhook_go_asset(client, sample_nexus_go_webhook):
     with patch("guarddog_nexus.routes.webhooks._scan_in_background") as _mock:
@@ -131,3 +133,34 @@ async def test_webhook_npm_asset(client, sample_nexus_npm_webhook):
         resp = await client.post("/webhooks/nexus", json=sample_nexus_npm_webhook)
     assert resp.status_code == 200
     assert resp.json()["status"] == "accepted"
+
+
+# --- Webhook signature validation ---
+
+
+@pytest.mark.asyncio
+async def test_webhook_missing_signature_when_required(client, sample_nexus_webhook):
+    import guarddog_nexus.config
+
+    guarddog_nexus.config.config.webhook_secret = "test-secret"
+    resp = await client.post("/webhooks/nexus", json=sample_nexus_webhook)
+    assert resp.status_code == 401
+    guarddog_nexus.config.config.webhook_secret = ""
+
+
+@pytest.mark.asyncio
+async def test_webhook_invalid_signature(client, sample_nexus_webhook):
+    import guarddog_nexus.config
+
+    guarddog_nexus.config.config.webhook_secret = "test-secret"
+    resp = await client.post(
+        "/webhooks/nexus",
+        json=sample_nexus_webhook,
+        headers={"X-Nexus-Webhook-Signature": "badsignature"},
+    )
+    assert resp.status_code == 403
+    guarddog_nexus.config.config.webhook_secret = ""
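
Note on signature tests: the two tests above cover the missing and invalid cases. A valid-signature case would need the hex HMAC-SHA256 of the raw request body, which is what the handler compares against. The sketch below shows that computation in isolation. The helper names `sign_payload` and `verify_signature` and the sample payload are assumptions for illustration.

```python
import hashlib
import hmac
import json


def sign_payload(secret: str, payload: bytes) -> str:
    """Hex HMAC-SHA256 of the raw body, matching the handler's `expected` value."""
    return hmac.new(secret.encode(), payload, hashlib.sha256).hexdigest()


def verify_signature(secret: str, payload: bytes, signature: str) -> bool:
    """Constant-time comparison, as the handler does with hmac.compare_digest."""
    return hmac.compare_digest(signature, sign_payload(secret, payload))


# Sign the exact bytes that will be sent; re-serializing the JSON may change them.
body = json.dumps({"action": "CREATED", "repositoryName": "pypi-proxy"}).encode()
signature = sign_payload("test-secret", body)
```

In a test, the same bytes would have to be sent as the request body (for example via httpx's `content=` argument rather than `json=`), with the digest in the `X-Nexus-Webhook-Signature` header.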