Our Commitment to Your Privacy

Updated: April 10, 2023

See also: Diffbot’s Terms of Use and Cookie Policy

This privacy policy applies to https://www.diffbot.com (the Site), operated by Diffbot Technologies Corp. (“Diffbot,” “us,” “we” or “our”).

The Site provides the “Services” as defined in the Terms of Use. This Privacy Policy explains how we collect and use the personal data of “Visitors” to our site (people who visit our site), of “Subscribers” (users of our Services) and of “Search Subjects” (people who have published personal data about themselves to publicly available web sites and web services on the Internet which are the subject of our APIs and indexing services).


Diffbot is a service that uses bots, algorithms, computer vision and artificial intelligence (AI) to process the content on the Internet, thereby allowing websites to be broken down into different page types and the pages to be identified. Diffbot’s technology differentiates between a social network profile, a blog post, a website’s front page, a product page, an event page, and more. Diffbot’s machine learning and computer vision algorithms and APIs (application programming interfaces) are used by Subscribers of our services to facilitate the discovery, identification, and extraction of data content from publicly accessible and available websites and services.

More details about Diffbot’s products and services can be found on our website at https://www.diffbot.com.

For individuals residing in the EEA, the UK, or Switzerland, please go here to find out more information.

For residents of the State of California, please go here to find out more about your rights under the California Consumer Privacy Act of 2018 (“CCPA”).

Our Contact information is:

Diffbot Technologies Corp.

333 Ravenswood Ave

Menlo Park, CA 94025

PHONE: 1 (855) 885-4800

EMAIL: privacy@diffbot.com

If you have any questions about this Privacy Policy or how your personal data is used, please contact:

Data Protection Officer

Diffbot Technologies Corp.

333 Ravenswood Ave

Menlo Park, CA 94025

Phone Number: 1-855-885-4800

Email: privacy@diffbot.com



When a Visitor registers to trial our Service, the Visitor becomes a Subscriber of our services for a trial period. We collect the following data in order to provide the Trial services:

First Name

Last Name

Email Address

Company Name

Telephone Number

Information collected from cookies

We also issue each Subscriber a unique API key in order to access the Services during the trial period.

During use of the Services (for Trial Users and Subscribers) we collect and use the following data and associate it with your API key in order to provide you with our services. This includes:

Information collected from cookies


Query history

API calls


When a Subscriber commits to continue use of our Services on an ongoing basis following the trial period by purchasing our Services, we collect the same information as we do for a trial subscription, plus our payments processor collects Credit Card Information to facilitate monthly subscription payments.


We collect non-personal information (i.e. information that on its own cannot be used to personally identify you as an individual) that includes:

anonymous usage data

referring/exit pages and URLs

platform types

IP Address

Our Site or Service may not be available in all areas.


We crawl the public web to collect and use different kinds of data, including data about Search Subjects that is publicly available. We allow our Subscribers to engage with the data supplied by Search Subjects in a strategic, meaningful, and targeted manner, to the extent it is available, and in accordance with the instructions of our Subscribers. This information includes but is not limited to:



Job Title

Publicly available biographical information disclosed on webpages

Education History

Employment History

Work address

Work Telephone Number

Email Address

Public web page URLs and handles

Subject area expertise

We do not collect sensitive personal data about a Search Subject (race, ethnicity, religious or philosophical beliefs, sexual preferences or orientation, political opinions or memberships, trade union memberships, health information, genetic or biometric data) unless the Search Subject published such data to the public domain him/herself (i.e. you have made such information available to the public by writing about it or posting it on your social media profile(s) and/or website).



The data collected from Subscribers (for both Trial and Ongoing usage) is provided by the Subscribers at the time of sign up in order for us to provide the Service. Usage data is collected whenever Subscribers use the Services.


Diffbot may collect information automatically from our Subscribers using web tracking technologies such as cookies, web beacons, pixel tags, clear GIFs, and third party tracking services in order to ensure that web applications and Services operate efficiently and to collect data related to usage of a web application or Service such as, but not limited to, the browser type, language preference, referring site, and the date and time of each visitor request (“Tracking Information”).



We use both session-based and persistent cookies.

Session-based cookies last only while your browser is open and are automatically deleted when you close your browser.

Persistent cookies last until you or your browser delete them or until they expire. They are unique and allow us to collect site analytics and to customize a Subscriber’s experience. If you access our Site through your browser, you can manage your cookie settings.

In order to collect Tracking Information and make your use of the Site and Service more efficient and responsive to your needs, Diffbot and its cookie service providers, detailed in the Cookie Policy, store cookies on your computer. Diffbot also uses cookies that are placed in web pages on the Site(s) and Services to collect information and learn about actions users take when they interact with the Site and Service.

Diffbot does not link Tracking Information to individual user Personal Information; nor does it include the Personal Information with the Tracking Information that Diffbot shares with the web tracking companies that use and process the Tracking Information, except as strictly necessary to provide and improve the Services (including customer support services). Some Tracking Information may include log or other data, such as IP address data, that is unique to you. You may be able to modify your browser settings to alter which web tracking technologies are permitted when you use the Site(s) and Services, but this may affect the performance of the Sites and Services.

If you do not wish to receive cookies, you may deactivate storing cookies on your computer by changing your browser settings accordingly. Please note that the functionality of the Site(s) and Services may be impaired, and the range of functionalities may be severely limited if you deactivate cookies.

We do not use cookies, beacons or pixels on our public website, www.diffbot.com.

Specifics of which cookies we use and where we use them can be found on our Cookie Policy page.


The information we hold about Individuals is obtained from crawling publicly available sources on the Internet. This includes publicly available social media profiles and websites, and information from articles on the Internet that the Search Subject may have written or that were written about the Search Subject. This information is gathered by our technology using automation and machine learning algorithms.

We collect data directly from publicly available sources, including but not limited to:

Articles or blogs in the public domain that you may have written or that may have been written about you

Vlogs you may have created in the public domain

Your own website

Third-party website(s)



Subscriber data is used to provide the Service, process Service payments, and facilitate support queries and requests. A Subscriber’s failure to provide the personal data we need may result in our inability to complete the transaction or provide the Service.


We do not control what information is collected by Subscribers or their purpose for use. To the extent any notices or consents are required, Subscribers are solely responsible for giving such notices or obtaining such consents. We do not use your personal data for activities where we believe that your interests are overridden by any unwarranted adverse impact on you.

Diffbot’s primary role is to discover, categorize, and organize information on the Internet on behalf of, and as directed by, our Subscriber(s). Diffbot subscribers can use our service and query functionality to create their own lists of Search Subjects; add their own data about Search Subjects (which only they can access); and monitor Search Subjects.

We allow our Subscribers access to search results via our online platform(s) and/or API service(s) to enable our subscribers to view information the Subscribers may be interested in.

A Subscriber can submit a query using the Diffbot UI or API to access data collected and indexed from the Internet. Diffbot then returns a list of items or subjects and a custom data set is created (a list of entities, related entities and facts) that matches the Subscriber’s search criteria. A Subscriber may also define the data to extract from a specific web page, a group of pages or domain by accessing a suite of extraction APIs, by extending an existing extraction API, or by creating a new API using custom rules which yields extracted data.

Please see Profiling and Tracking to see how your data is used for profiling and tracking.


Aggregation of Search Subject data is beneficial to you because it will allow the Subscriber to receive targeted information that may be of interest or use for due diligence, job search or other evaluation or research purposes.


For Subscribers, we use the Personal Data of Subscribers (i.e. logins, passwords, API tokens) to validate their right to use our Services.


We will use Personal Data of Search Subjects to validate their identities in connection with the exercise of an EU, CCPA (or other applicable jurisdiction’s) data subject’s rights or in the case of non-EU, non-CCPA data subjects not subject to such legally required rights, to validate their identities prior to processing any removal or update requests.


Customers can opt out of receiving marketing materials via email or mail at any time while receiving the Service. If you receive email, newsletter or marketing communications from us and no longer wish to receive them, please follow the removal instructions in the email or change your account settings.


We do not market to Search Subjects or commercially sell data not otherwise publicly available.



Subscriber tracking is used to improve the Services and Site and includes: length of use; time and date service was used; search queries; API requests; and how the service was accessed (app; desktop; phone).


We and third-party service providers may collect certain tracking information about your personal data for automated aggregation, indexing, and/or categorization purposes. The aggregation/categorization/indexing is limited and collected from the public domain; i.e. collected from information you publish on the internet and make public. For example, we catalog your subject matter expertise, interests, employments, education, and skills, and make this information available to our customers in response to Subscribers’ particular queries they make when using our service. While we do not have the means to contact the millions of Search Subjects indexed through crawling, we will honor applicable data subject rights requests subject to our confirmation of your identity.


We combine the personal data and information from publicly available websites and present this information to our Subscribers via our Service. Our Subscribers are able to search publicly accessible sources using APIs or a search dashboard. This allows our Customers to do due diligence on Individuals with whom they wish to build a business relationship or collect and use data for other legitimate purposes.

We also offer a service that uses APIs to structure hundreds or thousands of web pages into a single, searchable index. In doing so, the Subscriber may collect and create profiles containing Personal Data.


Our Subscribers can understand your subject matter expertise, areas of interest, services offered, or subjects you are interested in thereby allowing our Subscriber(s) to target Search Subjects or other search criteria.

View and Evaluate Your Content

Via our service, a Subscriber can search and categorize your content, which may be given an automatically generated “relevance” score or be categorized/labeled by Diffbot’s technology. The Subscriber can extract data from the aggregated data that matches their query or create graphs based on the indexes they build.

Automated Decision-making

We use automated techniques such as visual layout analysis and classification, computer vision, text analytics, machine learning, and knowledge fusion to identify and classify data, and to extract and understand data on webpages, which can be used to develop algorithms for even more robust automation. Diffbot does not use the output of automated data processing as the sole basis for any decisions regarding a Search Subject.


We will not sell, rent, or share your personal data with third parties outside of Diffbot without your consent, except in the following ways:

Subscribers (Applicable to Search Subjects): Our subscribers are typically businesses or institutions and come from all sectors. We share your data with our subscribers by allowing them access to extract data from the internet through our online services.


Law Enforcement and Internal Operations: Personal Data may be provided where we are required to do so by law, or if we believe in good faith that it is reasonably necessary (i) to respond to claims asserted against Diffbot or to comply with the legal process (for example, discovery requests, subpoenas or warrants); (ii) to enforce or administer our policies and agreements with users; (iii) for fraud prevention, risk assessment, investigation, customer support, product development and debugging purposes; or (iv) to protect the rights, property or safety of Diffbot, its users, employees or members of the general public. We will use commercially reasonable efforts to notify users about law enforcement or court ordered requests for data unless otherwise prohibited by law. However, nothing in this Privacy Policy is intended to limit any legal defenses or objections that you may have to any third party request to compel disclosure of your information.

Business Transfer: Diffbot may sell, transfer or otherwise share some or all of its assets, including your Personal Data, in connection with a merger, acquisition, reorganization or sale of assets or in the event of bankruptcy. Under such circumstances, Diffbot will use commercially reasonable efforts to notify its users if their personal information is to be disclosed or transferred and/or becomes subject to a different privacy policy.

Third Parties: We sometimes contract with other companies and individuals to perform functions or services on our behalf. Our categories of service providers include: software maintenance, data hosting, sending email messages, project management and customer service. We necessarily have to share your Personal Data with such third parties as may be required to perform their functions. We take steps to ensure that these parties take protecting your privacy as seriously as we do, including entering into Data Processing Addendums, EU Model Clauses and/or ensuring they have EU-U.S. and Swiss-US Privacy Shield certification since all of our service providers are in the United States.


Third Parties that collect and share Personal Data with us regarding Search Subjects, Trial Users, or Subscribers:



Plausible Analytics is a privacy-friendly analytics application used to help us understand visitor trends and the effectiveness of marketing outreach. Plausible does not collect personally identifiable information. Because Diffbot installs and runs an open source version of Plausible locally, no data processed by Plausible is ever sent to third party services.

For more details, please see Plausible’s data policy.


Amplitude is a cloud-based product-analytics platform that helps customers build better products. For more information visit their privacy policy at: https://amplitude.com/privacy.

Google Analytics

Google Analytics collects information such as how often users visit our site, what pages they visit when they do so, and what other sites they used prior to coming to this site. We use the information we get from Google Analytics only to improve this site, but in an anonymous form. Google Analytics collects only the IP address assigned to you on the date you visit this site and assigns a user ID code, rather than your name or other identifying information. We do not combine the information collected through the use of Google Analytics with personally identifiable information.

Google uses this information to analyze your use of the website, to generate reports about website activities for website operators and to provide further services related to website and internet use. Google may also share such information with third parties to the extent it is legally required to do so and/or to the extent third parties process data on behalf of Google. Although Google Analytics plants a permanent cookie on your web browser to identify you as a unique user the next time you visit this site, the cookie cannot be used by anyone but Google. Google’s ability to use and share information collected by Google Analytics about your visits to this site is restricted by the Google Analytics Terms of Use and the Google Privacy Policy and Data Processing Amendment. You can prevent Google Analytics from recognizing you on return visits to this site by disabling cookies on your browser. You may block Google Analytics on some browsers with the help of a browser add-on if you do not want us to use this website analysis.

This add-on can be downloaded at: http://tools.google.com/dlpage/gaoptout?hl=en. For more information on Google Analytics and Google’s privacy practices, please review their privacy policy at https://www.google.com/policies/privacy/


Hotjar services enable us to better understand our users’ needs and to optimize our service. Hotjar is a technology service provider that helps us better understand our users’ experience (e.g. how much time they spend on which pages, which links they choose to click, what users do and don’t like, etc.) and this enables us to build and maintain our service with user feedback. Hotjar uses cookies and other technologies to collect data on our users’ behavior and their devices. This includes a device’s IP address (processed during your session and stored in a de-identified form), device screen size, device type (unique device identifiers), browser information, geographic location (country only), and the preferred language used to display our website. Hotjar stores this information on our behalf in a pseudonymized user profile. Hotjar is contractually forbidden to sell any of the data collected on our behalf.

For further details, please see the ‘about Hotjar’ section of Hotjar’s support site.


Segment.com software allows individuals and businesses to unify Customer Data from many sources and services into a single view. Segment.io is a data processor, self-certified under the US-EU Privacy Shield and the Swiss-U.S. Privacy Shield framework to process data in the United States, and data is only shared subject to a Data Protection Addendum. For more information, go to their privacy policy at: https://segment.com/legal/privacy/.



Mailchimp is an online marketing platform that facilitates transactional and marketing email services for businesses. Mailchimp is self-certified under the US-EU Privacy Shield and the Swiss-U.S. Privacy Shield framework to process data in the United States and data is only shared subject to a Data Protection Addendum. For more information please visit their privacy policy at: https://mailchimp.com/legal/privacy/.


MixMax is an online marketing platform. Mixmax provides robust tools and analytics to enhance users’ outbound, electronic communications via Gmail and Google Inbox. Features include: analytics regarding open and click-through rates of recipients, automated calendar scheduling from within an email, and easy-to-use email templates. We use MixMax to facilitate outreach to trial users and subscribers. For more information please refer to: https://www.mixmax.com/legal/privacy-policy


Amazon S3

Amazon S3 is object storage built to store and retrieve any amount of data from anywhere – websites and mobile apps, corporate applications, and data from IoT sensors or devices. For more information visit their privacy policy at: https://aws.amazon.com/privacy/?nc1=f_pr.


ReadMe is a documentation hub and developer platform. For more information please refer to: https://docs.readme.com/main/docs/security-faq#-data-access


Chilipiper provides qualifying, routing, and booking software, used by B2B revenue teams. We use Chilipiper services to schedule demo sessions and book Sales meetings. For more information please refer to: https://www.chilipiper.com/privacy-policy


We use Intercom to store and track usage statistics, support conversations and contact information such as name and email in connection with live support chat conversations. Intercom is used for customer support purposes. In particular, we provide a limited amount of your information (such as sign-up date and some personal information like your email address) to Intercom, Inc. (“Intercom”) and utilize Intercom to collect data for analytics purposes when you visit our website or use our product. As a data processor acting on our behalf, Intercom analyzes your use of our website and/or product and tracks our relationship by way of cookies and similar technologies so that we can improve our service to you. For more information on Intercom’s use of cookies, please visit https://www.intercom.com/terms-and-policies#cookie-policy. We may also use Intercom as a medium for communications, either through email, or through messages within our product(s). As part of our service agreements, Intercom collects publicly available contact and social information related to you, such as your email address, gender, company, job title, photos, website URLs, social network handles and physical addresses, to enhance your user experience. Processing takes place in the United States. Intercom is self-certified under the US-EU Privacy Shield and we have entered into a Data Processing Addendum with them. For more information on the privacy practices of Intercom, please visit their privacy policy. Intercom’s services are governed by Intercom’s terms of use, which can be found at https://www.intercom.com/terms-and-policies#terms.


ProfitWell is a B2B subscription revenue automation service provider. Services include reporting and analytics to optimize pricing, improve adoption rates, and reduce churn. ProfitWell does not sell or share subscriber data. For more information refer to their privacy policy: https://www.profitwell.com/privacy-policy.


Salesforce customer relationship management software (CRM) allows individuals and businesses to unify Prospect and Customer Data for the purposes of marketing and sale of products and services. Salesforce is a data processor, self-certified under the US-EU Privacy Shield and the Swiss-U.S. Privacy Shield framework to process data in the United States, and data is only shared subject to a Data Protection Addendum. For more information, go to their privacy policy at: https://www.salesforce.com/company/privacy/.


Slack is a communication and collaboration platform that collects chat information. Processing takes place in the United States and data is transferred subject to EU-US Privacy Shield Certified, Swiss-US Privacy Shield Certified policies. For more information go to their privacy policy at: https://slack.com/privacy-policy.



Stripe’s software allows individuals and businesses to receive payments over the Internet. Stripe provides the technical, fraud prevention, and banking infrastructure required to operate on-line payment systems. Stripe is a data processor, self-certified under the US-EU Privacy Shield and the Swiss-U.S. Privacy Shield framework to process data in the United States and data is only shared subject to a Data Protection Addendum. For more information, go to their privacy policy at: https://stripe.com/us/privacy.


If you are a California resident or a subject of the European Union (EU) or European Economic Area (EEA) and Switzerland, you are entitled to the full spectrum of the rights under the General Data Protection Regulation (GDPR) and we accommodate any valid request. Because we value privacy and your rights in your data, we also may offer similar choices to data subjects located in the United States or other countries, even though we are not required to by law. You can exercise your data subject rights by emailing our Data Protection Officers at privacy@diffbot.com.

For individuals residing in the EEA, the UK, or Switzerland, please go here to find out more information.

For residents of the State of California, please go here to find out more about your rights under the California Consumer Privacy Act of 2018 (“CCPA”).


Diffbot does not knowingly collect or solicit any information from anyone 18 years and younger. The Site and Service are not directed at nor made to appeal to such persons. Parents or guardians that believe that we hold information about their children aged 18 and under may contact us at privacy@diffbot.com to have their children’s information deleted from our records.


“Do Not Track” or DNT is a feature enabled on some browsers that sends a signal to request that a web application disable its tracking or cross-site user tracking. At present, our Sites do not respond to or alter their practices when a DNT signal is received.


If you are providing your Personal Information to us directly to use our Services, we will transmit your data, including your Personal Data, to the United States in order to fulfill our contractual obligations to you.


We have implemented reasonable administrative, technical and physical security measures to protect Visitor, Subscriber and Search Subject personal information against unauthorized access, destruction or alteration. For example:

SSL encryption (https) everywhere where we deal with personal data except API calls that rely on http without encryption.

Data that is stored by us is kept on secure encrypted services, located in the US.

Restricting staff access to personal data protected by password logins.

Regular staff privacy and security training.

Payments services are tokenized.

However, because no security system can be 100% effective, we cannot completely guarantee the security of any information we store, process or transmit.

We are committed to protecting your personal data. We put in place safeguards including robust and appropriate technologies, processes, and contractual arrangements, so that the data we have about you is protected from unauthorized access and improper use, and we will also not keep your personal data for longer than is necessary.


Diffbot utilizes only PCI-DSS compliant third-party payment processors to ensure the security of your personal information.


We will keep Subscriber Personnel and Search Subject personal data only for as long as is necessary for the purposes set out in this privacy notice and to fulfill our legal obligations, but not longer than 30 days after we become aware that you wish to stop receiving communications or sharing your data and have verified your identity. We will not keep more data than we need.


Diffbot reserves the right to amend this Privacy Policy at any time. If Diffbot makes material changes to its Privacy Policy, we will notify you by (1) changing the Effective Date on our Privacy Policy and (2) providing additional notification either (i) via email or (ii) by other means as we may deem commercially reasonable.


Our Services may involve the processing of Search Subject Personal Data on behalf of our Subscribers. When we do so, we are acting as processors for the controllers of such data. As such, we take steps to ensure that personal data subject to GDPR is processed in accordance with controller instructions and GDPR such as entering into a Data Processing Addendum(s) incorporating EU Standard Contractual Clauses governing the processing, transmission and use of such personal data.

If you wish to exercise your data subject rights to review, rectify, delete or port your personal data please contact the controller to make such request. If you make the request to us, we will work with the controller to process and evaluate such request to confirm whether deletion is required by GDPR.



California residents who have an established business relationship with Diffbot may make a written request to Diffbot about whether Diffbot has disclosed any Personal Information to any third parties for the third parties’ direct marketing purposes during the prior calendar year. To make such a request, please send an email to: privacy@diffbot.com or write us at:

c/o Data Protection Officer

Diffbot Technologies Corp.

333 Ravenswood Ave

Menlo Park, CA 94025

Phone Number: 1 (855) 885-4800

For residents of the State of California, please go here to find out more about your rights under the California Consumer Privacy Act of 2018 (“CCPA”).


We may link to other websites. When you click on one of these links, you are ‘clicking’ to another website. Diffbot does not control the data collection or privacy practices of such third party sites. We encourage you to read the privacy policies of any third party sites, as their collection, use and storage practices and policies may differ from ours.


If you ever have any questions about our online Privacy Policy, please contact us. We respect your rights and privacy, and will be happy to answer any questions or concerns you might have. You may direct any such questions to our Data Protection Officer at privacy@diffbot.com.