Fire jobs. Get a clean callback.
Queue a hundred URLs or a hundred thousand. Walk away. Your endpoint receives a single signed POST when the work completes. No polling, no held connections, no retry plumbing on your side.
# Send a batch of 100,000 URLs, walk away.
curl -X POST https://api.datasonar.dev/v1/scrape/batch/async \
-H "Authorization: Bearer osk_..." \
-H "Content-Type: application/json" \
-d '{
"urls": [...],
"format": "markdown",
"webhook_url": "https://yourapp.com/datasonar-callback"
}'
# Response: { "status": "queued", "job_id": "..." }
# We POST the full result to your URL when ready.Fire-and-forget at any size
Queue a job of any size — a hundred URLs or a hundred thousand — and walk away. Your server gets a single clean POST when the work is done. No long-held connections, no polling loops, no retry plumbing on your side.
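In application code, queueing a batch is a single JSON POST. A minimal standard-library sketch, using the endpoint and fields from the curl example above (`build_payload` and `queue_batch` are illustrative names, not part of the API):

```python
import json
import urllib.request

API = "https://api.datasonar.dev/v1/scrape/batch/async"

def build_payload(urls: list[str], webhook_url: str, fmt: str = "markdown") -> bytes:
    """JSON body for the batch endpoint, matching the curl example."""
    return json.dumps(
        {"urls": urls, "format": fmt, "webhook_url": webhook_url}
    ).encode()

def queue_batch(api_key: str, urls: list[str], webhook_url: str) -> str:
    """Submit the batch and return the job_id from the { status, job_id } response."""
    req = urllib.request.Request(
        API,
        data=build_payload(urls, webhook_url),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["job_id"]
```

Store the returned `job_id`; it is how you match the eventual webhook back to the batch you queued.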
Every webhook is signed
We sign every callback with a cryptographic signature derived from your account's webhook secret. Your endpoint verifies the signature before trusting the payload. Forgery becomes infeasible; replay attacks become traceable.
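The docs here only say the signature is derived from your webhook secret; assuming an HMAC-SHA256 hex digest of the raw request body (a common scheme, not confirmed by this page), verification looks like this:

```python
import hashlib
import hmac

def verify_signature(secret: str, raw_body: bytes, signature_header: str) -> bool:
    """Recompute the HMAC of the raw body and compare in constant time."""
    expected = hmac.new(secret.encode(), raw_body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_header)

# A genuine signature passes; a forged one is rejected.
body = b'{"job_id": "job_123", "status": "completed"}'
good = hmac.new(b"whsec_example", body, hashlib.sha256).hexdigest()
print(verify_signature("whsec_example", body, good))        # True
print(verify_signature("whsec_example", body, "deadbeef"))  # False
```

Verify against the raw bytes of the request, before any JSON parsing or re-serialization, and always use a constant-time comparison.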
Automatic retries with backoff
If your endpoint returns a non-2xx response, we retry with exponential backoff — up to five attempts over 24 hours. Transient outages on your side don't lose data. Permanent failures land in a dead-letter inspection view on the dashboard.
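A five-attempt window spread over 24 hours implies a doubling schedule roughly like the sketch below. The exact delays are an assumption for illustration, not the published schedule:

```python
def retry_schedule(attempts: int = 5, base_minutes: float = 90.0, factor: float = 2.0):
    """Illustrative backoff: the delay doubles after each failed attempt.

    Five attempts means the first delivery plus four retries, i.e. four delays.
    """
    return [base_minutes * factor**i for i in range(attempts - 1)]

delays = retry_schedule()
print(delays)            # [90.0, 180.0, 360.0, 720.0] minutes
print(sum(delays) / 60)  # 22.5 -> roughly a 24-hour window
```

The practical takeaway: your endpoint can be down for hours and still receive the payload, as long as it recovers within the retry window.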
Replay from the dashboard
Every webhook delivery is logged with the request payload, your response code, and the delivery latency. Replay any delivery from the dashboard with one click — useful for testing changes to your endpoint without re-running the original job.
Production patterns
Massive corpus build for LLM training
Queue 500,000 URLs, get a webhook when the batch finishes. No infrastructure to keep open, no progress dashboards to babysit.
Overnight competitor crawls
Schedule full-site crawls of 20 competitors. The webhook delivers each completed crawl to your ingestion pipeline as it finishes — no polling required.
Pipeline integration
Wire webhooks directly into Zapier, n8n, Pipedream, or your custom event bus. Treat scrape jobs as another event source in your existing automation stack.
Mobile and serverless backends
Apps that can't hold long-lived connections — mobile clients, edge functions, serverless workers — fire async jobs and receive results when ready. Works around platform timeout limits cleanly.
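On the receiving side, the handler's only jobs are to validate the payload and return a 2xx quickly; heavy processing belongs on a worker. A framework-agnostic sketch, assuming the `{ job_id, status, result }` payload shape described in the FAQ (`QUEUE` stands in for your real queue or message broker):

```python
import json

QUEUE: list[str] = []  # stand-in for your real job queue / message broker

def handle_callback(raw_body: bytes) -> tuple[int, str]:
    """Ack fast, process later. A 2xx stops retries; a non-2xx triggers backoff."""
    try:
        event = json.loads(raw_body)
        job_id = event["job_id"]  # also echoed in the X-Datasonar-Job-Id header
    except (ValueError, KeyError, TypeError):
        return 400, "malformed payload"  # log it; a retry won't fix a bad body
    QUEUE.append(job_id)  # hand the work to a background consumer
    return 200, "ok"
```

Returning 200 before doing the heavy lifting keeps you inside any delivery timeout and avoids spurious retries while your pipeline churns.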
Webhook questions
Which endpoints support webhooks? ▾
/v1/scrape/async, /v1/scrape/batch/async, and /v1/crawl. All return a job_id immediately and POST to your webhook_url when the work completes.
How are webhooks signed? ▾
An X-Datasonar-Signature header contains a cryptographic signature of the raw body, derived from your account's webhook secret. Verification code samples are above; compare the signature in constant time before trusting the payload.
Where do I get my webhook secret? ▾
What's the retry policy? ▾
Non-2xx responses are retried with exponential backoff, up to five attempts over 24 hours. Deliveries that exhaust their retries land in the dead-letter view on the dashboard.
How long is the timeout for my endpoint? ▾
Can I use webhooks without HTTPS? ▾
Do you also offer polling instead of webhooks? ▾
Yes. Omit webhook_url and poll /v1/jobs/{id} for completion. Webhooks are recommended for production because they eliminate the constant polling overhead.
What payload shape do webhooks deliver? ▾
{ job_id, status, result }. The result block matches what you would get from the corresponding sync endpoint. The job_id is also echoed in the X-Datasonar-Job-Id header for fast routing.
Ship production-grade async today.
Webhooks included on every plan, including the free tier.