Blog
In-depth writing on the open web — for AI teams, RevOps, and data engineers.
2026-05-16 • 14 min read • AI, RAG, training data
Building a web data pipeline for LLM training in 2026
A practical guide to collecting, cleaning, and shipping training data at scale — what works, what fails, and what to outsource.
2026-05-16 • 12 min read • sales intelligence, technographic, RevOps
Sales intelligence APIs in 2026: a buyer's guide to DNS, WHOIS, and technographic data
What technographic data really is, what it isn't, and how to choose among BuiltWith, ZoomInfo, Clearbit, and the new wave of API-first providers.
2026-05-16 • 11 min read • real estate, Zillow, anti-bot
Scraping Zillow in 2026: what works, what fails, what to do about it
An honest look at Zillow's bot defenses, residential proxies, embedded payload extraction, and three working strategies for getting Zillow data into a production pipeline.