diff --git a/AGENTS.md b/AGENTS.md
index 54eeb7e..b2233a6 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -26,7 +26,7 @@ For local development without Docker:
 make install dev
 export $(cat .env | xargs)
 python -m guarddog_nexus.main
-make test # 85 tests
+make test # 135 tests
 make lint # ruff
 make format # ruff format + fix
 ```
@@ -56,7 +56,8 @@ guarddog_nexus/
 ├── web/ # Static assets
 │   ├── templates/ # Jinja2 templates
 │   └── static/ # CSS, JS
-├── config.py # env-var configuration dataclass
+├── schemas.py # Pydantic models + serialize_finding helper
+├── config.py # env-var configuration dataclass
 ├── constants.py # all magic strings/limits
 ├── i18n.py # RU/EN translation dictionaries
 ├── logging_setup.py # JSON logging + syslog
@@ -65,11 +66,11 @@ guarddog_nexus/
 **Data flow:**

 1. Nexus sends `UPDATED` webhook → `POST /webhooks/nexus`
-2. `webhooks.py` validates signature, extracts asset info, spawns background task
-3. `harvester.py` downloads file (async via `asyncio.to_thread`), computes SHA256, deduplicates
+2. `webhooks.py` validates signature, detects ecosystem, rejects unknown ecosystems
+3. `harvester.py` downloads file (async via `asyncio.to_thread`), validates URL against `NEXUS_ALLOWED_HOSTS` (SSRF protection), computes SHA256, deduplicates
 4. `scanner.py` runs `guarddog scan --output-format json`
 5. Findings stored in SQLite (`scans` + `findings` tables)
-6. If `LLM_ENABLED=1` and `LLM_AUTO_ANALYZE=1`, `llm.py` sends each finding to the configured model. `finding.report` state machine: `None` → `{"status": "analyzing"}` → `{verdict, summary, analysis, severity_rating}` or `None` on failure.
+6. If `LLM_ENABLED=1` and `LLM_AUTO_ANALYZE=1`, `llm.py` sends each finding to the configured model with retry logic. `finding.report` state machine: `None` → `{"status": "analyzing"}` → `{verdict, summary, analysis, severity_rating}` or `None` on failure. LLM response validated with defaults for missing fields.

 ---

@@ -80,7 +81,7 @@ guarddog_nexus/
 - **Line length:** 100 (ruff)
 - **Lint:** `ruff check guarddog_nexus tests` (E/F/I/W rules)
 - **Format:** `ruff format guarddog_nexus tests`
-- **Tests:** `pytest -v` (85 tests, pytest-asyncio auto mode)
+- **Tests:** `pytest -v` (135 tests, pytest-asyncio auto mode)
 - **Commits:** Russian descriptions, prefix convention: `feat:`, `fix:`, `refactor:`, `docs:`, `ui:`
 - **No comments** in code unless explicitly requested
 - **Async I/O:** file reads/writes wrapped in `asyncio.to_thread()` — never raw `open()` in async context
@@ -138,7 +139,7 @@ Per-finding `asyncio.Lock` in `web.py` prevents concurrent analysis of the same

 ## Webhooks

-Only `UPDATED` action is accepted (not `CREATED`). Format field in asset data determines ecosystem: `pypi`, `go`, `npm`.
+Only `UPDATED` action is accepted (not `CREATED`). Format field in asset data determines ecosystem: `pypi`, `go`, `npm`. Unknown ecosystems are rejected explicitly (no silent fallback to pypi).

 Per-URL locking (asyncio.Lock) prevents parallel scans of the same asset. SHA256 dedup prevents re-scanning identical file content.
@@ -163,7 +164,7 @@ docker compose down -v # stop + destroy volumes (make docker-destroy)
 docker compose logs -f # tail logs
 ```

-The Dockerfile parses `pyproject.toml` for dependency list (single source of truth). GuardDog is installed as a separate `uv pip install` step.
+The Dockerfile uses `uv pip install . --system` to install the package and all dependencies from `pyproject.toml`. GuardDog is installed as a separate `uv pip install` step.

 ---

@@ -173,7 +174,8 @@ The Dockerfile parses `pyproject.toml` for dependency list (single source of tru
 - Tests use in-memory SQLite (`:memory:`)
 - `conftest.py` sets up `os.environ` before importing the app
 - Mock `guarddog` output via fixtures — no real CLI execution
-- 85 tests covering: API, webhooks, harvester, scanner, web UI
+- 135 tests covering: API, webhooks, harvester, scanner, web UI, i18n, metrics, LLM analysis, e2e flows
+- E2E tests in `tests/e2e/` cover full webhook-to-scan pipeline, API filtering/pagination, LLM analysis, and error handling

 When adding features:
 - Always `python3 -m pytest -v` before committing
@@ -221,8 +223,6 @@ curl -X POST http://localhost:8080/webhooks/nexus \

 ---

-## Workflow
-
 ## Workflow — MANDATORY after completing a feature or session

 **Before responding to the user, you MUST complete ALL of:**
diff --git a/README.en.md b/README.en.md
index b0ccb66..4a37dc9 100644
--- a/README.en.md
+++ b/README.en.md
@@ -52,6 +52,7 @@ After startup:
 | Variable | Default | Description |
 |----------|---------|-------------|
 | `NEXUS_URL` | `http://localhost:8081` | Sonatype Nexus URL |
+| `NEXUS_ALLOWED_HOSTS` | host from `NEXUS_URL` | Allowed download hosts (comma-separated, SSRF protection) |
 | `DATABASE_PATH` | `data/guarddog.db` | SQLite database path |
 | `HOST` | `0.0.0.0` | Listen host |
 | `PORT` | `8080` | Listen port |
diff --git a/README.md b/README.md
index 168f154..9b84fec 100644
--- a/README.md
+++ b/README.md
@@ -69,6 +69,7 @@ python -m guarddog_nexus.main
 | Переменная | По умолчанию | Описание |
 |------------|-------------|----------|
 | `NEXUS_URL` | `http://localhost:8081` | URL Sonatype Nexus |
+| `NEXUS_ALLOWED_HOSTS` | хост из `NEXUS_URL` | Разрешённые хосты для скачивания (через запятую, защита от SSRF) |
 | `DATABASE_PATH` | `data/guarddog.db` | Путь к SQLite-базе данных |
 | `HOST` | `0.0.0.0` | Хост для прослушивания |
 | `PORT` | `8080` | Порт для прослушивания |
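
Note on `NEXUS_ALLOWED_HOSTS`: the diff documents the setting (comma-separated hosts, defaulting to the host from `NEXUS_URL`) but does not show the check itself. The following is a minimal sketch of how such an allowlist check could look, not the code in `harvester.py`. The function names `parse_allowed_hosts` and `is_download_url_allowed` are hypothetical, and the restriction to `http`/`https` schemes is an assumption rather than documented behavior.

```python
from urllib.parse import urlparse


def parse_allowed_hosts(raw: str, nexus_url: str) -> frozenset[str]:
    """Parse a comma-separated host list (hypothetical helper).

    Falls back to the host of nexus_url when the list is empty,
    mirroring the default described in the README table.
    """
    hosts = {h.strip().lower() for h in raw.split(",") if h.strip()}
    if not hosts:
        default_host = urlparse(nexus_url).hostname
        if default_host:
            hosts.add(default_host)
    return frozenset(hosts)


def is_download_url_allowed(url: str, allowed_hosts: frozenset[str]) -> bool:
    """Return True only for http(s) URLs whose host is on the allowlist."""
    parsed = urlparse(url)
    # Assumption: only http and https downloads are legitimate.
    if parsed.scheme not in ("http", "https"):
        return False
    # urlparse lowercases the hostname and strips userinfo and port,
    # so "http://localhost@evil.example/" resolves to "evil.example".
    host = parsed.hostname
    return host is not None and host in allowed_hosts
```

Comparing the parsed hostname, rather than doing a substring or prefix match on the raw URL, is what keeps inputs such as `http://localhost@evil.example/x` from passing the check.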