Fire jobs. Get a clean callback.

Queue a hundred URLs or a hundred thousand. Walk away. Your endpoint receives a single signed POST when the work completes. No polling, no held connections, no retry plumbing on your side.

# Send a batch of 100,000 URLs, then walk away.
curl -X POST https://api.datasonar.dev/v1/scrape/batch/async \
  -H "Authorization: Bearer osk_..." \
  -H "Content-Type: application/json" \
  -d '{
    "urls": [...],
    "format": "markdown",
    "webhook_url": "https://yourapp.com/datasonar-callback"
  }'

# Response: { "status": "queued", "job_id": "..." }
# We POST the full result to your URL when ready.

Fire-and-forget at any size

Submit a hundred URLs or a hundred thousand the same way. Either job ends with one clean POST to your server, so you hold no connections open and write no polling loops or retry plumbing.

Every webhook is signed

We sign every callback with a cryptographic signature derived from your account's webhook secret. Your endpoint verifies the signature before trusting the payload. Forgery becomes infeasible; replay attacks become traceable.
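A minimal verification sketch in Python. It assumes the signature is a hex-encoded HMAC-SHA256 of the raw request body, keyed with your webhook secret. That algorithm and encoding are assumptions for illustration only, since the description above says just "cryptographic signature"; confirm the actual scheme before relying on this. The helper name `verify_signature` is hypothetical.

```python
import hashlib
import hmac


def verify_signature(raw_body: bytes, signature_header: str, secret: str) -> bool:
    """Return True if signature_header matches the HMAC of raw_body.

    ASSUMPTION: hex-encoded HMAC-SHA256 keyed with the webhook secret.
    The real signing scheme may differ; check the official documentation.
    """
    expected = hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()
    received = signature_header.strip().lower()
    # compare_digest takes time independent of where the inputs differ,
    # which avoids leaking the expected signature through timing.
    return hmac.compare_digest(expected.encode("utf-8"), received.encode("utf-8"))
```

Pass the request body exactly as received, before any JSON parsing or re-serialization, together with the value of the X-Datasonar-Signature header. Reject the request if the function returns False.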

Automatic retries with backoff

If your endpoint returns a non-2xx response, we retry with exponential backoff — up to five attempts over 24 hours. Transient outages on your side don't lose data. Permanent failures land in a dead-letter inspection view on the dashboard.
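Retries mean the same delivery can reach you more than once, for example when your endpoint finished the work but its response timed out. A common defense is to make the handler idempotent. The sketch below assumes that deduplicating on job_id is sufficient for your workload; `handle_delivery` and the in-memory set are hypothetical, and a production system would likely keep this state in a durable store.

```python
from typing import Callable

# In-memory record of processed jobs. Illustrative only: use a durable store
# (a database row or cache key) in production so restarts do not lose state.
_processed_job_ids: set = set()


def handle_delivery(payload: dict, process: Callable[[dict], None]) -> int:
    """Process a webhook payload at most once; return the HTTP status to send."""
    job_id = payload.get("job_id")
    if not isinstance(job_id, str) or not job_id:
        return 400  # malformed payload: no usable job_id
    if job_id in _processed_job_ids:
        return 200  # duplicate from a retry: acknowledge and skip the work
    process(payload)
    _processed_job_ids.add(job_id)  # record only after processing succeeds
    return 200
```

If `process` raises, the job_id is not recorded, so a later retry gets another chance to run it.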

Replay from the dashboard

Every webhook delivery is logged with the request payload, your response code, and the delivery latency. Replay any delivery from the dashboard with one click — useful for testing changes to your endpoint without re-running the original job.

Production patterns

Massive corpus build for LLM training

Queue 500,000 URLs and get a webhook when the batch finishes. No connections to keep open, no progress dashboards to babysit.

Overnight competitor crawls

Schedule full-site crawls of 20 competitors. The webhook delivers each completed crawl to your ingestion pipeline as it finishes — no polling required.

Pipeline integration

Wire webhooks directly into Zapier, n8n, Pipedream, or your custom event bus. Treat scrape jobs as another event source in your existing automation stack.

Mobile and serverless backends

Apps that can't hold long-lived connections — mobile clients, edge functions, serverless workers — fire async jobs and receive results when ready. Works around platform timeout limits cleanly.

Webhook questions

Which endpoints support webhooks?
/v1/scrape/async, /v1/scrape/batch/async, and /v1/crawl. All return a job_id immediately and POST to your webhook_url when the work completes.
How are webhooks signed?
Each request carries an X-Datasonar-Signature header containing a cryptographic signature of the raw body, derived from your account's webhook secret. Compute the expected signature on your side and compare the two in constant time before trusting the payload.
Where do I get my webhook secret?
From the dashboard under Settings → Webhooks. Each account has one secret per environment (production and sandbox). Rotate it at any time; signatures made with the old secret remain valid for 24 hours after rotation, so in-flight callbacks complete cleanly.
What's the retry policy?
If your endpoint returns 5xx or times out, we retry with exponential backoff: 30 seconds, 2 minutes, 10 minutes, 1 hour, 6 hours, 24 hours. Once the retries are exhausted, we mark the delivery as failed and surface it in the dashboard for manual replay.
How long is the timeout for my endpoint?
30 seconds. If your endpoint hasn't returned a status code within 30 seconds, we treat it as a failure and queue a retry. For long downstream processing, acknowledge quickly with a 200 and process asynchronously on your side.
Can I use webhooks without HTTPS?
No. Webhook URLs must be HTTPS. We reject plain HTTP webhook URLs at job-creation time.
Do you also offer polling instead of webhooks?
Yes. If a webhook is impractical for your environment, omit webhook_url and poll /v1/jobs/{id} for completion. Webhooks are recommended for production because they eliminate the constant polling overhead.
What payload shape do webhooks deliver?
{ job_id, status, result }. The result block matches what you would get from the corresponding sync endpoint. The job_id is also echoed in the X-Datasonar-Job-Id header for fast routing.
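The acknowledge-fast advice and the payload shape above can be sketched together using only the Python standard library. The handler parses { job_id, status, result }, hands the payload to a queue, and returns a status immediately, while a worker thread does the slow part. The names `enqueue_delivery` and `worker` and the in-process queue are illustrative assumptions, not part of the API; a real deployment would more likely use a durable queue.

```python
import json
import queue
import threading
from typing import Optional

# In-process queue for illustration; a durable queue is safer in production.
deliveries: queue.Queue = queue.Queue()
processed: list = []


def enqueue_delivery(raw_body: bytes, header_job_id: Optional[str] = None) -> int:
    """Parse the payload, hand it to the worker, and return an HTTP status."""
    payload = json.loads(raw_body)
    job_id = payload["job_id"]
    # If the X-Datasonar-Job-Id header value is supplied, it should match the body.
    if header_job_id is not None and header_job_id != job_id:
        return 400
    deliveries.put(payload)
    return 200  # acknowledge immediately; slow work happens in worker()


def worker() -> None:
    """Drain the queue until a None sentinel arrives."""
    while True:
        payload = deliveries.get()
        if payload is None:
            break
        # Placeholder for slow downstream processing of payload["result"].
        processed.append((payload["job_id"], payload["status"]))
```

In a web framework, the route handler would call `enqueue_delivery` after verifying the signature, and `worker` would run in a background thread started with `threading.Thread(target=worker)` at startup.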

Ship production-grade async today.

Webhooks included on every plan, including the free tier.