Scrape a single URL
Fetch one URL and get back clean, structured data in your chosen format.
POST /v1/scrape
The core scraping endpoint. Handles JavaScript-rendered pages with a stealth browser. Returns the content in your chosen format along with metadata.
Parameters
| Name | Type | Required | Default | Description |
|---|---|---|---|---|
| url | string | yes | — | URL to fetch. |
| format | string | no | markdown | One of markdown, html, text, links, original. |
| stealth | boolean | no | true | Apply stealth countermeasures to avoid bot detection. |
| timeout | integer | no | 30 | Per-request timeout in seconds (max 120). |
| actions | array | no | — | Sequence of actions (clicks, typing, waits, scrolls) to run before extraction. |
| js_eval | string | no | — | JavaScript expression to evaluate after page load. |
| wait_until | string | no | load | Page event to wait for before extraction: load, domcontentloaded, or networkidle0. |
| proxy | string | no | — | Proxy URL for the request (HTTP or SOCKS). |
| user_agent | string | no | — | Override the default User-Agent string. |
| solve_captcha | boolean | no | false | Auto-detect and solve reCAPTCHA v3 on the page. |
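The constraints in the table above can be checked on the client before a request is sent. The following is a minimal sketch in Python, not an official SDK: `build_payload` is a hypothetical helper, and it enforces only the limits the table states (the allowed `format` and `wait_until` values and the 120-second `timeout` ceiling).

```python
import json

# Allowed values copied from the parameter table.
FORMATS = {"markdown", "html", "text", "links", "original"}
WAIT_UNTIL = {"load", "domcontentloaded", "networkidle0"}
MAX_TIMEOUT_S = 120


def build_payload(url, **options):
    """Return a JSON request body for POST /v1/scrape (hypothetical helper)."""
    if not url:
        raise ValueError("url is required")
    if "format" in options and options["format"] not in FORMATS:
        raise ValueError(f"unsupported format: {options['format']!r}")
    if "wait_until" in options and options["wait_until"] not in WAIT_UNTIL:
        raise ValueError(f"unsupported wait_until: {options['wait_until']!r}")
    if "timeout" in options and options["timeout"] > MAX_TIMEOUT_S:
        raise ValueError(f"timeout exceeds {MAX_TIMEOUT_S} seconds")
    # Send only what the caller set; omitted fields fall back to the
    # server-side defaults listed in the table.
    return json.dumps({"url": url, **options})
```

For example, `build_payload("https://example.com", format="html", timeout=60)` produces a body with exactly those three fields, while `format="pdf"` raises `ValueError` before any network call is made.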
Request
curl -X POST https://api.datasonar.dev/v1/scrape \
  -H "Authorization: Bearer osk_..." \
  -d '{"url": "https://example.com", "format": "markdown"}'
Response
{
  "status": "success",
  "url": "https://example.com",
  "content": "# Example Domain\n\nThis domain is for use in...",
  "format": "markdown",
  "time_ms": 412
}
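The same request and response handling can be sketched in Python using only the standard library. This is an illustration under stated assumptions rather than an official client: `make_request`, `parse_response`, and `scrape` are hypothetical names, the `Content-Type: application/json` header is an assumption (the curl example does not set one), and treating any `status` other than `"success"` as a failure is a guess, since the error shape is not documented here.

```python
import json
import urllib.request

API_URL = "https://api.datasonar.dev/v1/scrape"  # endpoint from the curl example


def make_request(api_key, url, fmt="markdown"):
    """Build the POST request without sending it."""
    body = json.dumps({"url": url, "format": fmt}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            # Assumption: the API expects a JSON content type.
            "Content-Type": "application/json",
        },
    )


def parse_response(raw):
    """Return the scraped content from a raw JSON response body."""
    data = json.loads(raw)
    # Assumption: only "success" indicates a usable result.
    if data.get("status") != "success":
        raise RuntimeError(f"scrape failed: {data}")
    return data["content"]


def scrape(api_key, url, fmt="markdown", timeout_s=30):
    """Send the request and return the content field of the response."""
    request = make_request(api_key, url, fmt)
    with urllib.request.urlopen(request, timeout=timeout_s) as resp:
        return parse_response(resp.read())
```

Calling `scrape(api_key, "https://example.com")` with a valid key would return the `content` string from the response. Keeping request construction and response parsing in separate functions lets both be checked without network access.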