Use our Analyze API to automatically find and extract all article, product or other supported pages.
All crawls are instantly searchable using our Search API, allowing you to slice and dice your data by searching the Diffbot-structured fields. Sort by article date, filter by product price, search across your custom fields: it's all in there.
Distributed, world-class crawling infrastructure processing millions of pages daily.
Use our reserved fleet of proxy IPs, or optionally upgrade to gain access to tens of thousands of unique IPs for truly diversified crawling or region/country-specific extraction.
If you prefer knobs and dials, carefully control which pages you crawl and extract.
Programmatically start crawls, check crawl statuses, and retrieve output using the Crawlbot API.
Pair a Custom API with Crawlbot to extract nearly anything from any site.
Re-run, copy, re-download or simply review your crawl history at any time.
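As a rough illustration of starting a crawl and checking its status programmatically, here is a minimal sketch that only builds the request URLs and sends nothing. The endpoint `https://api.diffbot.com/v3/crawl` and the parameter names `token`, `name`, `seeds` and `apiUrl` are assumptions not stated on this page, and the helper function names are hypothetical; check all of them against the current Crawlbot API documentation before use.

```python
from urllib.parse import urlencode

# Assumed endpoint; confirm against the Crawlbot API documentation.
CRAWL_ENDPOINT = "https://api.diffbot.com/v3/crawl"
# Assumed URL of the Analyze API, used to process each crawled page.
ANALYZE_API_URL = "https://api.diffbot.com/v3/analyze"


def build_start_crawl_url(token, name, seeds, api_url=ANALYZE_API_URL):
    """Return a URL that would create (or update) a crawl job.

    `seeds` is a list of starting URLs; they are joined with spaces,
    which is an assumption about how the API expects multiple seeds.
    """
    params = {
        "token": token,
        "name": name,
        "seeds": " ".join(seeds),
        "apiUrl": api_url,
    }
    return f"{CRAWL_ENDPOINT}?{urlencode(params)}"


def build_status_url(token, name):
    """Return a URL that would fetch the status of an existing crawl job."""
    return f"{CRAWL_ENDPOINT}?{urlencode({'token': token, 'name': name})}"


if __name__ == "__main__":
    # "YOUR_TOKEN" is a placeholder, not a real credential.
    print(build_start_crawl_url("YOUR_TOKEN", "demo-crawl", ["https://example.com/"]))
    print(build_status_url("YOUR_TOKEN", "demo-crawl"))
```

The resulting URLs can be passed to any HTTP client. Keeping URL construction separate from the network call makes it easy to inspect a request before sending it.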
Crawlbot Basics (4:56)
In this introductory Crawlbot video we work through how to set up a basic crawl to extract product data from across an ecommerce site.
Advanced Crawlbot (5:00)
In this video we look at some of the more advanced techniques available using Crawlbot, including crawling pages that are behind logins.