For Immediate Release: Mar 31, 2015

Diffbot Discussions API Lets Companies and Developers Unearth and Track Millions of Hidden Mentions in Forums, Comments, and Reviews

Diffbot uses an artificial intelligence robot to visually extract the Web’s dark matter: user-generated content currently hidden from search

PALO ALTO -- Diffbot, an artificial intelligence startup and the creator of visual learning robot technology that lets developers analyze, extract, and enhance Web content, today broke new ground in its quest to bring structure to the unstructured Web. Nearly a quarter of all Internet users participate in online communities and forums, while even more comment on articles or write online reviews. With the release of the company’s new "Discussions API", Diffbot is, for the first time ever, providing developers and companies with a tool to unlock the millions of comments made every day online in these formerly hidden corners of the Web. This user-generated content makes up a part of what is known as the "deep Web," estimated to be 400 times larger than the "surface Web" that is indexed and accessible through traditional search engines. The Discussions API currently supports Facebook Comments, Disqus, Livefyre, WordPress, Blogger, IntenseDebate (owned by Automattic), Kinja, Hacker News, Reddit, and more.

"This is the holy grail of brand monitoring," said Mike Tung, CEO of Diffbot. “Traditional media monitoring tools track Twitter and Facebook, or editorial content. However, the substantive conversation about a brand’s products and services by actual customers is happening in the more specialized forums and review sites of the web.”

This new functionality visually analyzes web pages and instantly parses complete comment data, author information, topic analysis, and more into discrete objects. That structured data, never before accessible via automation, is now available on demand, allowing developers and companies to build applications that can:

  • Monitor brands, products or other keywords—to gauge user reaction, gather feedback or monitor sentiment—in the locations where users are actually providing feedback
  • Make forum and other user-created content mobile-friendly for easier consumption on phones, tablets or other devices
  • Analyze user-created posts to identify trends or sentiment within user communities and among power users
  • Identify links and other content shared within discussion threads or comment sections to improve product recommendations, insert affiliate linking, and perform other link analysis
  • Completely process and extract an entire site’s worth of content—whether for migration or other analysis—when paired with Crawlbot, Diffbot’s intelligent spider
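
As a rough illustration of the first use case above, brand monitoring, the sketch below filters an already-extracted thread for mentions of a keyword. The `Post` class, its `author` and `text` fields, the `find_mentions` helper, and the sample thread are all hypothetical stand-ins chosen for this example; they are not Diffbot's actual response schema or client library.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Post:
    """One comment or forum post. Field names are illustrative assumptions."""
    author: str
    text: str


def find_mentions(posts: Iterable[Post], keyword: str) -> List[Post]:
    """Return the posts whose text contains the keyword, ignoring case."""
    needle = keyword.lower()
    return [post for post in posts if needle in post.text.lower()]


# Hypothetical sample data standing in for an extracted discussion thread.
thread = [
    Post(author="alice", text="The Acme blender broke after a week."),
    Post(author="bob", text="Mine still works fine."),
    Post(author="carol", text="ACME support replaced mine quickly."),
]

mentions = find_mentions(thread, "acme")
for post in mentions:
    print(f"{post.author}: {post.text}")
```

Because each post is a discrete object rather than raw page markup, the same loop could feed a sentiment scorer or an alerting system without any page-specific scraping logic.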

"Our new APIs allow developers to treat forums, comment threads and review collections as virtual databases, accessing their data on-the-fly and making this massive component of the web newly usable," continued Tung. "It will even help developers find those, admittedly rare, useful and constructive YouTube comments."

This new API augments Diffbot’s efforts to structure the data of the entire Web, including its APIs for automatic article, product, image and video extraction; its Analyze API, which immediately determines the “page type” of any unknown link; and its Crawlbot crawling platform for entire-site extraction.

About Diffbot:

Diffbot is a robot that examines the Web using computer vision and natural language processing, and provides developers with robust tools to find, extract and understand the objects from any Web page for use in their applications. Thousands of developers and businesses rely on Diffbot APIs to create consumer-friendly applications that use visual interpretation of the Web to re-imagine search, the mobile web and hundreds of other consumer applications. Customers include Adobe, CBS Interactive, Cisco, eBay, Instapaper, Salesforce, Samsung, and StumbleUpon. Diffbot is based in Palo Alto, CA.

To learn more visit