The Docker API server applied its SSRF destination check (validate_url_destination) on the non-streaming /crawl path but not on the streaming path. handle_stream_crawl_request passed seed URLs straight to the crawler with no destination validation. A remote, unauthenticated client could call POST /crawl/stream (or POST /crawl with crawler_config.stream=true, which short-circuits to the same handler) with a URL pointing at an internal, private, or link-local address; the server fetched it and streamed the …
The Docker API server accepted a request-supplied browser_config.extra_args, which flowed into Chromium's launch arguments. An attacker could inject Chromium switches that replace a child-process launch command (--utility-cmd-prefix, --renderer-cmd-prefix, --gpu-launcher, --browser-subprocess-path) together with --no-zygote, causing Chromium to fork/exec an attacker-controlled command as the container's runtime user. The Docker API is unauthenticated by default, so a single request yields arbitrary command execution. The earlier extra_args SSRF patch (0.8.9) used a denylist scoped …
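A denylist has to anticipate every dangerous switch, whereas an allowlist fails closed. The following is a minimal sketch of the allowlist approach, not the project's actual patch; the function name filter_extra_args and the contents of ALLOWED_SWITCHES are hypothetical and chosen only for illustration.

```python
# Hypothetical allowlist; a real deployment would choose its own entries.
ALLOWED_SWITCHES = frozenset({"--disable-gpu", "--window-size", "--lang"})


def filter_extra_args(extra_args):
    """Return extra_args unchanged if every switch is allowlisted, else raise."""
    safe = []
    for arg in extra_args:
        # Compare only the switch name, ignoring any "=value" suffix.
        name = arg.split("=", 1)[0].strip().lower()
        if name not in ALLOWED_SWITCHES:
            raise ValueError(f"switch not permitted: {name}")
        safe.append(arg)
    return safe
```

Under this sketch, --utility-cmd-prefix=/bin/sh and --no-zygote are both rejected simply because they are not on the list.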
When the crawler saves a downloaded file, the destination filename was taken from attacker-influenced input and joined to the downloads directory with no confinement. A filename containing an absolute path (e.g. /etc/cron.d/evil) or ../ traversal escaped the downloads directory, giving an arbitrary file write with attacker-controlled contents. Because the written bytes are attacker-controlled, this escalates to remote code execution (overwriting a shell rc-file, ~/.ssh/authorized_keys, a cron entry, or a Python …
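One common way to confine a download is to keep only the final path component of the suggested name and then verify that the resolved result still sits under the downloads directory. The sketch below assumes Python 3.9+ (for Path.is_relative_to); confine_download_path is a hypothetical helper, not a function from the project.

```python
from pathlib import Path


def confine_download_path(downloads_dir: str, suggested_name: str) -> Path:
    """Return a path inside downloads_dir, or raise ValueError."""
    base = Path(downloads_dir).resolve()
    # Keep only the last component: drops directories and absolute prefixes.
    name = Path(suggested_name.replace("\\", "/")).name
    if name in ("", ".", ".."):
        raise ValueError(f"unsafe filename: {suggested_name!r}")
    candidate = (base / name).resolve()
    # Defense in depth: a symlink inside the directory could still point out.
    if not candidate.is_relative_to(base):
        raise ValueError(f"path escapes downloads dir: {suggested_name!r}")
    return candidate
```

With this helper, a suggested name of /etc/cron.d/evil is reduced to evil inside the downloads directory rather than being honored as an absolute path.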
The Docker API server applied its SSRF destination check to the crawl target URL only, not to the proxy address. An unauthenticated request could supply a proxy pointing at an internal IP and route the browser through it, reaching internal services and cloud-metadata endpoints, while using a perfectly valid crawl URL. The Docker API is unauthenticated by default.
The Docker API server's SSRF protection (validate_webhook_url / validate_url_destination in deploy/docker/utils.py) used an explicit IPv4/IPv6 CIDR blocklist that missed several address families. An attacker could reach internal services and cloud metadata endpoints (e.g. 169.254.169.254) despite the filter by encoding an internal IPv4 address inside an IPv6 transition form, or by using the IPv6 unspecified address. Because the Docker API is unauthenticated by default (jwt_enabled: false), no credentials are required.
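The bypass classes described above can be illustrated with Python's standard ipaddress module: unwrap any IPv4 address embedded in an IPv6 transition form, then allow only globally routable addresses instead of enumerating blocked CIDRs. This is a sketch under stated assumptions, not the code in deploy/docker/utils.py; is_forbidden_ip and _embedded_ipv4 are hypothetical names, and the sketch handles IP literals only (hostnames would need to be resolved and each resulting address checked).

```python
import ipaddress

# Well-known NAT64 prefix (RFC 6052); the low 32 bits carry an IPv4 address.
NAT64 = ipaddress.ip_network("64:ff9b::/96")


def _embedded_ipv4(ip: ipaddress.IPv6Address):
    """Return the IPv4 address carried inside a transition form, if any."""
    for candidate in (ip.ipv4_mapped, ip.sixtofour):
        if candidate is not None:
            return candidate
    if ip.teredo is not None:
        return ip.teredo[1]  # teredo is a (server, client) pair
    if ip in NAT64:
        return ipaddress.IPv4Address(int(ip) & 0xFFFFFFFF)
    return None


def is_forbidden_ip(text: str) -> bool:
    """True if the server should refuse to connect to this IP literal."""
    ip = ipaddress.ip_address(text)
    if isinstance(ip, ipaddress.IPv6Address):
        embedded = _embedded_ipv4(ip)
        if embedded is not None:
            ip = embedded
    # Fail closed: anything not globally routable is refused.
    return not ip.is_global
```

Checking is_global rather than a hand-written blocklist means the unspecified addresses (:: and 0.0.0.0), loopback, link-local, and private ranges are all refused without listing each one.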
Multiple security vulnerabilities in the Crawl4AI Docker API server affecting endpoints for crawling, markdown/LLM extraction, screenshots, PDFs, webhooks, monitoring, JavaScript execution, and configuration.
The Docker API server let a request control where LLM calls were sent and which environment variable an LLM token resolved from. Both could be abused to exfiltrate server-held secrets. The Docker API is unauthenticated by default.
The _safe_eval_expression() function in the computed fields feature uses an AST validator that only blocks attributes starting with underscore. Python generator and frame object attributes (gi_frame, f_back, f_builtins) do NOT start with underscore, enabling a complete sandbox escape to achieve arbitrary code execution. The attack requires no authentication (JWT disabled by default) and is triggered via POST /crawl with a crafted extraction schema.
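Blocking names by prefix leaves every attribute without a leading underscore reachable. A stricter alternative is to allowlist both the AST node types and the attribute names. The sketch below is illustrative only and is not the project's _safe_eval_expression(); validate_expression, ALLOWED_NODES, and ALLOWED_ATTRS are hypothetical. It validates syntax only, so any evaluation would still need a restricted namespace.

```python
import ast

# Hypothetical allowlists; anything not listed is rejected.
ALLOWED_ATTRS = frozenset({"lower", "upper", "strip"})
ALLOWED_NODES = (
    ast.Expression, ast.Constant, ast.Name, ast.Load,
    ast.BinOp, ast.UnaryOp, ast.BoolOp, ast.Compare, ast.IfExp,
    ast.Call, ast.Attribute,
    ast.operator, ast.unaryop, ast.boolop, ast.cmpop,
)


def validate_expression(source: str) -> ast.Expression:
    """Parse source as an expression and reject anything not allowlisted."""
    tree = ast.parse(source, mode="eval")
    for node in ast.walk(tree):
        if not isinstance(node, ALLOWED_NODES):
            raise ValueError(f"disallowed syntax: {type(node).__name__}")
        if isinstance(node, ast.Attribute) and node.attr not in ALLOWED_ATTRS:
            raise ValueError(f"disallowed attribute: {node.attr}")
    return tree
```

Under this sketch, an expression that reaches for gi_frame, f_back, or f_builtins fails validation because those names are not allowlisted, and generator expressions are rejected as a node type.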
Three backward-compatible hardening fixes in the Docker API server. The headline issue is an arbitrary file write via the screenshot/PDF output_path.
A critical remote code execution vulnerability exists in the Crawl4AI Docker API deployment. The /crawl endpoint accepts a hooks parameter containing Python code that is executed using exec(). The __import__ builtin was included in the allowed builtins, allowing attackers to import arbitrary modules and execute system commands. Attack Vector: POST /crawl { "urls": ["https://example.com"], "hooks": { "code": { "on_page_context_created": "async def hook(page, context, **kwargs):\n __import__('os').system('malicious_command')\n return page" } } }
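The specific mistake, exposing __import__ to executed code, can be illustrated in a few lines. The sketch below is not the project's hook runner: run_hook_source and SAFE_BUILTINS are hypothetical, and removing __import__ alone does not make exec() a sandbox, since object introspection can still reach dangerous functionality. It shows only that the import path closes when the builtin is absent.

```python
# Hypothetical minimal builtins table with no __import__ entry.
SAFE_BUILTINS = {"len": len, "range": range, "str": str, "print": print}


def run_hook_source(source: str) -> dict:
    """Execute hook source with a restricted builtins table; return its namespace."""
    namespace = {"__builtins__": dict(SAFE_BUILTINS)}
    exec(source, namespace)  # illustration only, NOT a security boundary
    return namespace


hook_source = "def hook():\n    return __import__('os')"
hook = run_hook_source(hook_source)["hook"]
try:
    hook()
    import_blocked = False
except NameError:
    # __import__ is not defined inside the restricted namespace.
    import_blocked = True
```

Running untrusted code safely generally calls for process-level isolation or disabling the feature rather than filtering builtins.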
A local file inclusion vulnerability exists in the Crawl4AI Docker API. The /execute_js, /screenshot, /pdf, and /html endpoints accept file:// URLs, allowing attackers to read arbitrary files from the server filesystem. Attack Vector: POST /execute_js { "url": "file:///etc/passwd", "scripts": ["document.body.innerText"] }. Impact: an unauthenticated attacker can read sensitive files (/etc/passwd, /etc/shadow, application configs), access environment variables via /proc/self/environ, discover internal application structure, and potentially read credentials and API keys. Workarounds: Disable …
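A scheme allowlist applied before the URL reaches the browser is one straightforward guard against file:// input. The sketch below uses only the standard library; require_web_url and ALLOWED_SCHEMES are hypothetical names, and this is not presented as the project's actual fix.

```python
from urllib.parse import urlsplit

ALLOWED_SCHEMES = frozenset({"http", "https"})


def require_web_url(url: str) -> str:
    """Return the stripped URL if it is http(s) with a host, else raise."""
    cleaned = url.strip()
    parts = urlsplit(cleaned)
    if parts.scheme.lower() not in ALLOWED_SCHEMES:
        raise ValueError(f"scheme not permitted: {parts.scheme!r}")
    if not parts.hostname:
        raise ValueError("URL has no host")
    return cleaned
```

Because the check is an allowlist, other local or special-purpose schemes are rejected along with file://.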