Article API

The Article API automatically extracts clean text from news articles and blog posts—returning normalized HTML and plaintext, author and date information, related images/videos and more from any article on any site.

Features & Benefits
It's the best: Diffbot's Article API has been the overwhelming winner in quality shootouts since 2011. Compare text-extraction methods.
Fully automatic: Like all of Diffbot's Automatic APIs, the Article API needs no rules or training. Send it any text-heavy page and let Diffbot do the rest.
Works in any language: Thanks to its basis in computer vision, the Article API extracts clean text in any language.
Native text analysis: Topics/tags are automatically generated for each analyzed article, and built-in sentiment analysis rates each post's overall positivity or negativity.
Comments too: Diffbot's Discussion API technology is built into the Article API to automatically extract comments alongside the main article text.
Crawl them all: Pair the Article API with Crawlbot to automatically identify and extract all articles across an entire site.
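
To illustrate what "send it any text-heavy page" might look like in practice, here is a minimal Python sketch. It is not taken from this page: the endpoint URL, the `token` and `url` query parameters, and the response field names (`objects`, `title`, `author`, `date`, `text`, `tags`, `sentiment`) are assumptions that should be checked against Diffbot's current API documentation, and the helper names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint; confirm against Diffbot's current API documentation.
ARTICLE_ENDPOINT = "https://api.diffbot.com/v3/article"


def build_article_url(token: str, page_url: str) -> str:
    """Build the request URL, percent-encoding the target page address."""
    query = urlencode({"token": token, "url": page_url})
    return f"{ARTICLE_ENDPOINT}?{query}"


def summarize_article(payload: dict) -> dict:
    """Pick a few fields out of a response payload.

    Field names here are assumptions; missing fields come back as
    None, an empty list, or a zero length instead of raising.
    """
    objects = payload.get("objects") or [{}]
    article = objects[0]
    return {
        "title": article.get("title"),
        "author": article.get("author"),
        "date": article.get("date"),
        "sentiment": article.get("sentiment"),
        "tags": [tag.get("label") for tag in article.get("tags", [])],
        "text_length": len(article.get("text", "")),
    }


def fetch_article(token: str, page_url: str, timeout: float = 30.0) -> dict:
    """Request one page and return the summarized fields."""
    with urlopen(build_article_url(token, page_url), timeout=timeout) as resp:
        return summarize_article(json.load(resp))
```

Calling `fetch_article("YOUR_TOKEN", "https://example.com/some-post")` would issue a single GET request. Keeping URL construction and response parsing in separate functions lets each be checked without network access.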