DataSonar API is live in production

Clean structured data
from any site.

One API key, one endpoint pattern, every major site you care about — plus a built-in intelligence layer that returns the WHOIS, SSL, tech stack, and contacts behind every domain in the same call.

No credit card required. 1,000 requests free, every month.

Outcomes, not parsing

Send a URL, get the data — markdown for LLMs, JSON-LD for analytics, vertical-specific schemas for Amazon, Zillow, and Google Maps.

The whole web, indexed

One endpoint family covers every site. Smart routing skips the browser when a page does not need it, so static pages return in milliseconds.

Intelligence in every call

DNS, WHOIS, SSL, technology stack, email verification, and contact extraction — all together, all in JSON.

27+ endpoints in production
4 verticals covered
< 800 ms typical response time for static pages
1,000 free monthly requests

Real call. Real response.

This is an actual response from POST /v1/intel/page against a public domain. No screenshots, no mock data.

Request
POST https://api.datasonar.dev/v1/intel/page
Authorization: Bearer osk_…
Content-Type: application/json

{
  "url": "https://anthropic.com"
}
Response · 402 ms
{
  "status": "success",
  "url": "https://anthropic.com",
  "tech_stack": ["AWS", "Cloudflare"],
  "contacts": {
    "socials": { "twitter": "…", "linkedin": "…" },
    "emails": [],
    "phones": []
  },
  "logos": [
    { "type": "icon", "src": "/favicon.ico" },
    { "type": "apple-touch", "src": "/apple-icon.png" }
  ],
  "feeds": [],
  "time_ms": 402
}

DNS intelligence
email_provider: "Google Workspace"
nameserver_provider: "Cloudflare"
has_spf: true
has_dmarc: true

SSL certificate
issuer: "Let's Encrypt"
days_remaining: 51
sans: ["anthropic.com", "console.anthropic.com"]

WHOIS
registrar: "MarkMonitor, Inc."
created: "2001-10-02"
expires: "2033-10-02"
nameservers: ["isla.ns.cloudflare.com"]
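
For readers who want to consume the /v1/intel/page response shown above, here is a minimal Python sketch. The field names come from that sample response. Two things are assumptions for illustration, not documented behavior: the helper name summarize_page_intel, and resolving relative logo paths against the page URL.

```python
import json
from urllib.parse import urljoin

# Body copied from the sample response above; elided values stay as "…".
SAMPLE = """
{
  "status": "success",
  "url": "https://anthropic.com",
  "tech_stack": ["AWS", "Cloudflare"],
  "contacts": {
    "socials": { "twitter": "…", "linkedin": "…" },
    "emails": [],
    "phones": []
  },
  "logos": [
    { "type": "icon", "src": "/favicon.ico" },
    { "type": "apple-touch", "src": "/apple-icon.png" }
  ],
  "feeds": [],
  "time_ms": 402
}
"""


def summarize_page_intel(body: str) -> dict:
    """Reduce a /v1/intel/page JSON body to a few fields of interest."""
    data = json.loads(body)
    if data.get("status") != "success":
        raise ValueError(f"unexpected status: {data.get('status')!r}")

    base = data["url"]
    contacts = data.get("contacts", {})
    return {
        "url": base,
        "tech_stack": list(data.get("tech_stack", [])),
        # Names of the social networks found, not the profile URLs.
        "social_networks": sorted(contacts.get("socials", {})),
        "has_email": bool(contacts.get("emails")),
        # Assumption: relative "src" paths resolve against the page URL.
        "logo_urls": [urljoin(base, logo["src"]) for logo in data.get("logos", [])],
    }


if __name__ == "__main__":
    print(summarize_page_intel(SAMPLE))
```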

First call in five minutes.

Get your API key from the dashboard, hit the endpoint, get clean data. The hardest part is choosing a URL.

curl -X POST https://api.datasonar.dev/v1/scrape \
  -H "Authorization: Bearer osk_..." \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "format": "markdown",
    "stealth": true
  }'
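
The same call can be sketched in Python using only the standard library. The endpoint, headers, and body fields mirror the curl command above. The function names are illustrative, and the sketch assumes /v1/scrape returns a JSON body, which the curl example does not confirm.

```python
import json
import urllib.request

API_BASE = "https://api.datasonar.dev"


def build_scrape_request(api_key: str, url: str, fmt: str = "markdown",
                         stealth: bool = True) -> urllib.request.Request:
    """Build (but do not send) the POST /v1/scrape request from the curl example."""
    payload = {"url": url, "format": fmt, "stealth": stealth}
    return urllib.request.Request(
        f"{API_BASE}/v1/scrape",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def scrape(api_key: str, url: str, timeout: float = 30.0) -> dict:
    """Send the request and decode the body. Assumes the response is JSON."""
    request = build_scrape_request(api_key, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Building the request separately from sending it keeps the payload easy to inspect or log before any quota is spent.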

One API. Every job.

From a single page to a full site, from raw HTML to structured product data — the same key, the same patterns, everywhere.

Scraping core

  • POST /v1/scrape: Single URL to markdown, HTML, text, or links.
  • POST /v1/scrape/smart: Auto-route static pages to fast HTTP; dynamic pages to a full browser.
  • POST /v1/scrape/batch: Up to 100 URLs in parallel, one response.
  • POST /v1/scrape/async: Queue large jobs and receive a webhook when ready.
  • GET /v1/jobs/{id}: Poll status of any queued job.
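
The batch and async endpoints above suggest two small client-side helpers, sketched here in Python. The 100-URL ceiling comes from the list above. Everything else is an assumption: the helper names are hypothetical, and because the shape of the job status response is not documented here, the caller supplies both the function that fetches GET /v1/jobs/{id} and the check for completion.

```python
import time
from typing import Callable, Iterator, List, Sequence

MAX_BATCH_SIZE = 100  # "Up to 100 URLs" per /v1/scrape/batch call


def chunk_urls(urls: Sequence[str], size: int = MAX_BATCH_SIZE) -> Iterator[List[str]]:
    """Split a URL list into consecutive batches of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(urls), size):
        yield list(urls[start:start + size])


def wait_for_job(fetch_status: Callable[[], dict],
                 is_done: Callable[[dict], bool],
                 interval_s: float = 2.0,
                 max_attempts: int = 30,
                 sleep: Callable[[float], None] = time.sleep) -> dict:
    """Poll until `is_done(job)` is true, pausing `interval_s` between attempts.

    `fetch_status` is expected to call GET /v1/jobs/{id} and return the decoded
    body. The job schema is not assumed here; `is_done` encodes it.
    """
    for attempt in range(max_attempts):
        job = fetch_status()
        if is_done(job):
            return job
        if attempt < max_attempts - 1:
            sleep(interval_s)
    raise TimeoutError(f"job not finished after {max_attempts} attempts")
```

When a webhook is configured on /v1/scrape/async, polling is only a fallback; the helper is useful mainly in scripts that cannot receive callbacks.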

Extraction & parsing

  • POST /v1/extract/clean: Article body with no nav, ads, or sidebars. Reading time and word count included.
  • POST /v1/extract/structured: JSON-LD, Microdata, OpenGraph, and Twitter Card in one payload.
  • POST /v1/actors/markdown: Clean, LLM-ready markdown for any page.

Domain intelligence

  • POST /v1/dns/lookup: A, MX, TXT, NS, CNAME, AAAA records.
  • POST /v1/dns/intelligence: Email provider, nameserver provider, detected technologies.
  • POST /v1/intel/ssl: Issuer, expiry, SANs, days remaining.
  • POST /v1/intel/whois: Registrar, creation date, expiry date, contact emails.
  • POST /v1/intel/page: Tech stack, social links, emails, phones, logos, feeds.
  • POST /v1/verify/email: SMTP handshake, MX check, catch-all and disposable detection.

Crawling & verticals

  • POST /v1/crawl: Full-site crawler with depth, budget, and robots.txt controls.
  • POST /v1/intel/sitemap: Unroll any sitemap.xml, including nested index sitemaps.
  • POST /v1/intel/robots: Check whether a path is allowed for a given user agent.
  • POST /v1/actors/amazon: Product title, price, rating, reviews, ASIN, availability.
  • POST /v1/actors/zillow: Address, price, zestimate, beds, baths, square feet, year built.
  • POST /v1/actors/google-maps: Place name, rating, reviews, hours, address, coordinates.

Frequently asked questions

What kinds of sites can DataSonar handle?
DataSonar handles static pages, JavaScript-heavy single-page applications, e-commerce sites, real estate listings, maps, and most content behind anti-bot defenses. For sites with the strongest protection (Amazon, Zillow, and certain travel and ticketing sites), pairing your request with a residential proxy improves success rates significantly.
How is DataSonar different from Apify, ScraperAPI, or Bright Data?
Apify pioneered the marketplace model for scrapers and ScraperAPI nailed developer ergonomics. DataSonar takes the same simplicity and goes further on two fronts: every extractor is maintained in-house so reliability stays consistent across the catalog, and a complete domain intelligence layer — DNS, WHOIS, SSL, tech stack, contacts — ships in the same API at no extra cost.
Do you charge for failed requests?
No. Failed requests do not count against your monthly quota. You only pay for successful responses.
What does the free tier include?
1,000 requests every month, every endpoint, no credit card. Plenty to evaluate the API end to end and run real prototypes.
How do I scrape Amazon or Zillow specifically?
Use the dedicated actor endpoints — /v1/actors/amazon and /v1/actors/zillow. They return structured product or property data in a clean JSON schema. Both sites use aggressive anti-bot defenses, so we recommend supplying a residential proxy parameter for production workloads.
Can DataSonar scrape pages that require login?
Yes, via the actions array on /v1/scrape. You can drive clicks, typing, waits, and form submissions natively. For pages that need persistent sessions, talk to us about the enterprise tier.
What is the average response time?
Static pages typically return in under 800 milliseconds. JavaScript-rendered pages take 1.5 to 4 seconds depending on complexity. Domain intelligence calls return in 50 to 500 milliseconds.
Is the API rate-limited?
Yes. Each plan has a requests-per-minute ceiling and a monthly request quota. Rate limit headers are returned with every response so your client can adapt in real time.
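
One way a client might adapt to rate limits is sketched below in Python. This page does not name the rate limit headers or the status code used when the ceiling is hit, so the sketch assumes a 429 status and the standard HTTP Retry-After header, and otherwise falls back to capped exponential backoff. The function name is illustrative; check the actual header names in a live response before relying on it.

```python
from typing import Mapping, Optional


def retry_delay_seconds(status: int, headers: Mapping[str, str], attempt: int,
                        base_s: float = 1.0, cap_s: float = 60.0) -> Optional[float]:
    """Return how long to wait before retrying, or None if no retry is needed.

    Assumptions: a rate-limited call answers with HTTP 429, and may carry a
    Retry-After header expressed in seconds.
    """
    if status != 429:
        return None

    # Header names are case-insensitive, so normalize before looking them up.
    lowered = {name.lower(): value for name, value in headers.items()}
    retry_after = lowered.get("retry-after")
    if retry_after is not None:
        try:
            return min(max(float(retry_after), 0.0), cap_s)
        except ValueError:
            pass  # HTTP-date form of Retry-After; fall through to backoff.

    # Capped exponential backoff: base, 2x base, 4x base, and so on.
    return min(base_s * (2 ** attempt), cap_s)
```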

Start pulling clean data in minutes.

1,000 requests free every month. No credit card required.