Products

Information Wants to Be Structured

Using AI, computer vision, machine learning and natural language processing, Diffbot gives developers a range of tools to understand and extract data from any web page.

Automatic APIs
Automatically extract content from article, product and image pages
Conventional web pages
{
   "name": "Automatic APIs",
   "type": "computer vision",
   "author": "Diffy",
   "target": "common web pages"
},
Custom API Toolkit
Extract data from any web page using easy-to-manage custom rules
Any and all web pages
{
   "name": "Custom API Toolkit",
   "type": "custom extraction",
   "author": "Diffy",
   "target": "any kind of page"
},
Crawlbot / Bulk API
Spider entire sites using any Diffbot API, and search the structured output
Entire domain
{
   "name": "Crawlbot",
   "type": "spidering",
   "author": "Diffy",
   "target": "entire domains"
}
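
As a rough illustration of how one of the Automatic APIs above might be called, the sketch below builds a request URL for a page to be extracted. The base URL, the `article` path segment, and the `token` and `url` query parameters are assumptions not stated on this page, and `build_request_url` is a hypothetical helper; check the official API documentation before relying on any of them.

```python
from urllib.parse import urlencode

# Assumed base URL and path layout; not stated on this page.
BASE_URL = "https://api.diffbot.com/v3"


def build_request_url(api: str, token: str, page_url: str) -> str:
    """Return a GET URL asking the named extraction API to process page_url.

    Hypothetical helper: `api` is the assumed path segment (e.g. "article"),
    `token` is a developer token, and `page_url` is the page to extract.
    """
    # urlencode percent-escapes the target page URL so it survives as a
    # single query parameter.
    query = urlencode({"token": token, "url": page_url})
    return f"{BASE_URL}/{api}?{query}"


example = build_request_url("article", "YOUR_TOKEN", "https://example.com/post?id=1")
print(example)
```

The helper only assembles the URL and makes no network call, so it can be paired with whatever HTTP client a project already uses.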