Crawl whole sites.
In one job.

From a 500-page documentation site to a 100,000-page knowledge base, the crawler walks the link graph, respects robots.txt by default, and delivers clean results to your webhook when the job completes.

curl -X POST https://api.datasonar.dev/v1/crawl \
  -H "Authorization: Bearer osk_..." \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://docs.example.com",
    "max_pages": 500,
    "depth": 3,
    "concurrency": 10,
    "respect_robots": true,
    "webhook_url": "https://yourapp.com/datasonar-callback"
  }'

# Returns: { "status": "queued", "job_id": "..." }

Use cases for site-wide crawling

LLM training corpora

Crawl entire documentation sites, knowledge bases, or product catalogs. Receive clean markdown for every page, ready to embed.

Competitor monitoring

Snapshot a competitor's site on a schedule. Diff structural changes, new pages, removed pages, modified pricing.

Internal search indexing

Build full-text search over content you do not own. The crawler returns a structured map of every page it discovers, ready for ingestion into your index.

Archival and compliance

Capture full-site snapshots for legal hold, regulatory archive, or pre-acquisition due diligence. Webhook delivery means no long-held connections.

Crawler questions

How big can a crawl be?
The default cap is 500 pages and depth 3. You can raise both; production customers regularly crawl tens of thousands of pages per job. For million-page crawls, talk to us about enterprise capacity.
Does the crawler respect robots.txt?
Yes by default. Each crawl honors the robots.txt of the target site. You can override with the respect_robots: false flag in cases where you have explicit permission to crawl, such as your own site or a partner's.
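With permission in hand, the override is one extra field in the request body. Every field shown below already appears in the example request at the top of the page:

```json
{
  "url": "https://docs.example.com",
  "respect_robots": false,
  "webhook_url": "https://yourapp.com/datasonar-callback"
}
```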
Can I scope the crawl to a single subdomain?
Yes. By default the crawler stays within the same host as the seed URL. Pass same_host: false to follow links across subdomains, or use include_patterns and exclude_patterns for finer control.
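A scoped request might combine these options as sketched below. The option names come from the answer above, but the glob-style pattern syntax is an assumption, not documented behavior:

```json
{
  "url": "https://docs.example.com",
  "same_host": false,
  "include_patterns": ["https://*.example.com/docs/*"],
  "exclude_patterns": ["*/changelog/*"]
}
```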
How does webhook delivery work?
Provide a webhook_url with the request. When the job completes, we send a single POST to your URL with the full result body and a header containing the job id. Webhooks are signed so you can verify the origin.
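Signature checking amounts to recomputing the signature over the raw request body and comparing. Everything scheme-specific in this sketch (HMAC-SHA256, the header name, the whsec_ secret format) is an assumption; confirm the actual signing details in your dashboard:

```shell
# Sketch of verifying a signed webhook with OpenSSL. The HMAC-SHA256 scheme,
# header name, and secret format are assumptions, not documented behavior.
body='{"job_id":"job_abc123","status":"completed"}'
secret='whsec_example'   # hypothetical signing secret

# Recompute the signature over the raw request body...
expected=$(printf '%s' "$body" | openssl dgst -sha256 -hmac "$secret" | awk '{print $NF}')

# ...and compare it to the value from the (assumed) X-DataSonar-Signature header.
received="$expected"     # stand-in; in production, read this from the header
if [ "$received" = "$expected" ]; then
  echo "webhook verified"
fi
```

Compare with a constant-time comparison in production code; the string equality here is only for illustration.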
What happens to a crawl if I hit my monthly quota midway?
The crawler pauses and returns a partial result with everything collected so far plus a quota_exceeded flag. You can upgrade your plan and resume the job with the same job id.
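A resume call might look like the sketch below. The endpoint path and the DATASONAR_API_KEY variable are assumptions (the answer above only promises that the original job id is reused), so the request is guarded behind the key being set:

```shell
# Hypothetical resume call; the /resume endpoint path is an assumption.
JOB_ID="job_abc123"
if [ -n "${DATASONAR_API_KEY:-}" ]; then
  curl -X POST "https://api.datasonar.dev/v1/crawl/$JOB_ID/resume" \
    -H "Authorization: Bearer $DATASONAR_API_KEY"
else
  echo "set DATASONAR_API_KEY to resume job $JOB_ID"
fi
```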
How fast is a crawl?
Throughput depends on target site responsiveness, concurrency, and whether pages need JavaScript rendering. A typical documentation site crawls at 5-15 pages per second; aggressive crawling against single-server sites is automatically slowed to be polite.

Start your first crawl free.

The free tier covers 1,000 pages a month. Plenty to prove value.