u/InternationalLow9740

r/hubspot

We went from 1% to 94% company enrichment coverage in HubSpot without spending $20K on a vendor

We have around 130,000 company records in HubSpot. Our Clearbit enrichment coverage was sitting at 1%, because enrichment requires a populated company domain, and most of our companies didn't have one.

We got vendor quotes in the $15K-$25K range. Before signing anything, we asked whether we already had the data somewhere. Turns out we did.

Companies without a domain still had associated contacts, and the majority of those contacts had valid work email addresses. We just weren't using them.

What we built

A HubSpot company workflow with a custom code action. The workflow enrolls any company where our custom "Extracted Domain" property is blank, and unenrolls as soon as it gets a value. The property is write-once, so if something's already there, the code skips it immediately and never overwrites.
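
As a rough illustration of the write-once guard, here is a minimal Python sketch in the `main(event)` shape that HubSpot's Python custom code actions use. The input field name `extracted_domain` and the `status` output are made up for the example, and the actual extraction step is left as a placeholder.

```python
def main(event):
    """Entry point in the shape of a HubSpot Python custom code action."""
    # "extracted_domain" is a hypothetical input field mapped to the
    # custom "Extracted Domain" property in the workflow action settings.
    current = (event.get("inputFields") or {}).get("extracted_domain")

    if current:
        # Write-once: a value already exists, so return it unchanged.
        return {"outputFields": {"status": "skipped", "extracted_domain": current}}

    # Placeholder: the domain extraction logic would run here.
    return {"outputFields": {"status": "pending", "extracted_domain": ""}}
```

Checking for an existing value first means a company that re-enrolls by accident costs one cheap check and no API calls.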

The logic runs in this order:

  • Pull primary contacts only. We fetch all associated contacts in one pass using HubSpot's v4 associations API, which returns association labels in the same response. We filter to contacts labeled "Contact with Primary Company." Worth noting: HubSpot's own Primary Company contact property is unreliable for this, so the association label filter is the right approach.
  • Batch fetch contact details. We pull properties for qualifying contacts in batches of 100 via the batch read endpoint. Fields we care about: email, deal count, owner, activity count, and a custom Record Type field we use internally.
  • Filter out noise. Before touching a domain, we exclude contacts with no email, contacts flagged as phone integration records (not real leads), generic personal email domains (Gmail, Yahoo, Outlook, etc.), and any internal company domains.
  • Find the winning domain. Among what's left, we find the most frequently occurring domain. When there's a tie, we break it with a weighted score that prioritizes contacts with associated deals first, contacts owned by a real rep second, and general activity third.
  • Write the output. The winning domain goes into the "Extracted Domain" property. A confidence score (the winning domain's share of qualifying contacts) goes into a separate "Domain Confidence" property.
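
The steps above can be sketched in plain Python, with the API calls left out. This is an illustration under assumptions, not the exact code from the workflow: the dictionary keys (`deal_count`, `owner_id`, `activity_count`, `record_type`), the domain lists, the phone-record flag value, and the tie-break weights are all hypothetical, since the post doesn't give them. The association results are assumed to be a list of items with `toObjectId` and `associationTypes` entries carrying a `label`. "Owned by a real rep" is approximated as "has any owner".

```python
from collections import Counter

# Hypothetical values; the post does not list its exact domains, flag, or weights.
PERSONAL_DOMAINS = {"gmail.com", "yahoo.com", "outlook.com", "hotmail.com"}
INTERNAL_DOMAINS = {"ourcompany.example"}
PHONE_RECORD_TYPE = "Phone Integration"
W_DEALS, W_OWNER, W_ACTIVITY = 100, 10, 1


def primary_contact_ids(results, label="Contact with Primary Company"):
    """Step 1: keep contact IDs whose association carries the given label."""
    return [
        r["toObjectId"]
        for r in results
        if any(t.get("label") == label for t in r.get("associationTypes", []))
    ]


def chunked(items, size=100):
    """Step 2: split IDs into batches of at most `size` for batch reads."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def email_domain(contact):
    """Return the lowercased email domain, or None if the email is unusable."""
    email = (contact.get("email") or "").strip().lower()
    if email.count("@") != 1:
        return None
    return email.split("@")[1] or None


def qualifying(contacts):
    """Step 3: drop noise and return (domain, contact) pairs for the rest."""
    kept = []
    for c in contacts:
        domain = email_domain(c)
        if domain is None or c.get("record_type") == PHONE_RECORD_TYPE:
            continue
        if domain in PERSONAL_DOMAINS or domain in INTERNAL_DOMAINS:
            continue
        kept.append((domain, c))
    return kept


def tie_break_score(contact):
    """Deals outrank an owner, which outranks general activity."""
    score = 0
    if int(contact.get("deal_count") or 0) > 0:
        score += W_DEALS
    if contact.get("owner_id"):
        score += W_OWNER
    if int(contact.get("activity_count") or 0) > 0:
        score += W_ACTIVITY
    return score


def pick_domain(contacts):
    """Steps 4-5: return (winning_domain, confidence), or (None, 0.0)."""
    kept = qualifying(contacts)
    if not kept:
        return None, 0.0
    counts = Counter(domain for domain, _ in kept)
    top = max(counts.values())
    tied = sorted(d for d, n in counts.items() if n == top)

    def domain_score(d):
        # Sum the per-contact scores for one tied domain.
        return sum(tie_break_score(c) for dom, c in kept if dom == d)

    winner = max(tied, key=domain_score)  # alphabetical order if still tied
    # Confidence is the winner's share of qualifying contacts.
    return winner, counts[winner] / len(kept)
```

Keeping the selection logic as pure functions like this makes it easy to test against a handful of fake contacts before pointing it at real records.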

A second workflow picks up from there. Once a company has an Extracted Domain, a confidence score above 70%, and no value in Company Domain Name, it copies that domain into the native Company Domain Name property, which triggers Clearbit enrichment automatically (if that is set up). Anything below 70% gets flagged for manual review instead.
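
The routing rule in that second workflow boils down to a small decision function. A minimal sketch, assuming confidence is stored as a fraction between 0 and 1 and that a score of exactly 70% goes to manual review (the post doesn't say which way that edge case falls). The action names are made up for illustration.

```python
CONFIDENCE_THRESHOLD = 0.70  # from the post: promote only above 70%


def next_action(extracted_domain, confidence, company_domain):
    """Decide what the second workflow should do with one company."""
    if not extracted_domain or company_domain:
        # Nothing to promote, or the native domain is already populated.
        return "skip"
    if confidence > CONFIDENCE_THRESHOLD:
        return "copy_to_company_domain"
    return "manual_review"
```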

Results

About 41,000 companies went through the full process. Enrichment coverage on those companies went from 1% to 94%.

What it unlocked

  • Clearbit now has enough data to actually do its job at scale
  • ZoomInfo credits are being used to fill gaps Clearbit can't cover, instead of duplicating work
  • Our lead scoring model finally has enough clean firmographic data to work with
u/InternationalLow9740 — 6 days ago