Docs · Crawling

Full-site crawl

Walk the link graph of an entire site.

POST /v1/crawl

Queues a site-wide crawl with depth, budget, and concurrency controls. Returns a job_id; results delivered via webhook or polling.

Parameters

Name Type Required Default Description
url string yes Seed URL.
max_pages integer no 500 Maximum pages to crawl.
depth integer no 3 Maximum link depth from seed.
concurrency integer no 10 Parallel page fetches.
respect_robots boolean no true Honor robots.txt.
webhook_url string no URL for completion callback.

Request

curl -X POST https://api.datasonar.dev/v1/crawl \
  -H "Authorization: Bearer osk_..." \
  -d '{"url": "https://docs.example.com", "max_pages": 500}'

Response

{
  "status": "queued",
  "job_id": "...",
  "message": "Crawl queued successfully."
}

Related