Merging duplicate HubSpot contacts, without pretending HubSpot works like Salesforce

HubSpot duplicate cleanup has its own constraints. Contacts are mostly keyed by email. Companies lean on company domain. Record ID is the manual backstop for other objects such as deals and tickets. Automatic contact-company association can still leave a messy company layer when several companies share a domain or when API-created companies bypass native domain dedupe.

HubSpot already dedupes some paths natively

HubSpot dedupes contacts by email in manual creation, form, and import flows, and dedupes companies by company domain in some of the same flows. That helps, but it does not cover every import, API write, or association edge case.

Auto-association can still create cleanup work

If multiple companies share a domain, HubSpot auto-associates a contact with only one of them, and you cannot choose which company wins.

Gremlin is exact-key on HubSpot today

The shipped HubSpot CLI path groups duplicates by an exact key column, chooses a primary with a keep strategy, dry-runs by default, and applies merges with resumable state.
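
To make the exact-key idea concrete, here is a minimal Python sketch of grouping records on a key column and picking the oldest record as the primary. It is an illustration only, not Gremlin's implementation: the `plan_merges` function, the record fields (`id`, `email`, `createdate`), and the plan shape are all hypothetical, and trimming and lowercasing the key is an assumption about what "exact" should mean for email.

```python
from collections import defaultdict
from datetime import datetime


def plan_merges(records, key_column="email"):
    """Group records on an exact key and keep the oldest-created as primary.

    Assumption: the key is trimmed and lowercased before comparison.
    """
    groups = defaultdict(list)
    for rec in records:
        key = (rec.get(key_column) or "").strip().lower()
        if key:  # records without a key are never grouped
            groups[key].append(rec)

    plan = []
    for key, members in groups.items():
        if len(members) < 2:
            continue  # a single record is not a duplicate group
        members.sort(key=lambda r: datetime.fromisoformat(r["createdate"]))
        primary, *rest = members
        plan.append({
            "key": key,
            "primary_id": primary["id"],
            "merge_ids": [r["id"] for r in rest],
        })
    return plan


# Hypothetical export rows, for illustration only.
records = [
    {"id": "101", "email": "ana@example.com", "createdate": "2021-03-01T10:00:00"},
    {"id": "202", "email": "Ana@Example.com ", "createdate": "2023-07-15T09:30:00"},
    {"id": "303", "email": "bo@example.com", "createdate": "2022-01-05T08:00:00"},
    {"id": "404", "email": "", "createdate": "2020-01-01T00:00:00"},
]
plan = plan_merges(records)
print(plan)
```

Exact-key grouping only catches records that agree on the key after normalization. A contact with a different email, or no email at all, stays outside every group, which is why the plan is worth reviewing before anything is applied.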

What objects the shipped client can merge

Contacts

HubSpot auto-deduplicates contacts by email, while Gremlin can merge contact duplicates via hubspot merge or hubspot merge-apply-plan.

Companies

HubSpot auto-deduplicates companies by company domain in some entry paths, but API-created companies are not deduplicated by domain. Gremlin can merge companies in the shipped CLI merge path.

Deals

HubSpot itself leans on Record ID or unique properties for deal dedupe; Gremlin merge_objects also supports deals in the shipped CLI.

Tickets

The shipped Gremlin merge client includes tickets, though HubSpot native dedupe for non-contact and non-company objects is mostly Record ID or unique-property based.

HubSpot-native constraints to keep in view

Email is the default contact key

If an import omits Email, HubSpot creates new contact rows instead of deduplicating them.
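
One way to catch this before an import is to split the file into rows that carry an email and rows that do not. The sketch below assumes a CSV with an `Email` column. The `split_by_email` helper and the sample data are hypothetical and only show the check.

```python
import csv
import io


def split_by_email(rows, email_column="Email"):
    """Separate rows that have a non-blank email from rows that do not."""
    with_email, without_email = [], []
    for row in rows:
        if (row.get(email_column) or "").strip():
            with_email.append(row)
        else:
            without_email.append(row)
    return with_email, without_email


# Hypothetical import file, inlined so the example is self-contained.
sample = io.StringIO(
    "First Name,Last Name,Email\n"
    "Ana,Diaz,ana@example.com\n"
    "Bo,Lee,\n"
    "Cy,Ng,  \n"
)
rows = list(csv.DictReader(sample))
ready, needs_review = split_by_email(rows)
print(len(ready), "rows have an email;", len(needs_review), "need review")
```

Rows in `needs_review` are the ones that would become new contacts, so they are worth enriching or matching by hand first.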

Company domain drives company dedupe

HubSpot uses Company domain name during manual, form, and import dedupe, but not for companies created through API or third-party sync apps.

Record ID is the manual backstop

Contacts, companies, deals, tickets, products, and custom objects can all be deduplicated by Record ID on import.

Auto-association can create ambiguity

When automatic company association is on and several companies match one email domain, HubSpot associates each contact with only one of them, and you cannot choose which company wins.
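
A quick way to see how exposed a portal is to this ambiguity is to list the domains that more than one company record claims. This sketch works on an exported list of companies. The `shared_domains` helper and the `id` and `domain` field names are assumptions for illustration.

```python
from collections import defaultdict


def shared_domains(companies):
    """Return {domain: [company ids]} for domains used by more than one company."""
    by_domain = defaultdict(list)
    for company in companies:
        domain = (company.get("domain") or "").strip().lower()
        if domain:
            by_domain[domain].append(company["id"])
    return {d: ids for d, ids in by_domain.items() if len(ids) > 1}


# Hypothetical company export.
companies = [
    {"id": "c1", "domain": "acme.com"},
    {"id": "c2", "domain": "ACME.com"},
    {"id": "c3", "domain": "globex.com"},
    {"id": "c4", "domain": None},
]
ambiguous = shared_domains(companies)
print(ambiguous)
```

Each domain in the result is a place where auto-association had to pick one company out of several, so those are the company groups to review first.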

Gremlin parity is not the same as Salesforce

The shipped HubSpot CLI flow is exact-key planning and apply. I did not find Salesforce-style blocking-first cluster queues, receipt files, or a verify command in the public HubSpot dedupe path.

Build the HubSpot plan

g-gremlin hubspot merge-plan contacts \
  --key-column email \
  --keep oldest-created \
  --out contacts-plan.json

Dry-run before apply

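# Dry run (the default): preview the merges without changing HubSpot
g-gremlin hubspot merge-apply-plan \
  --plan contacts-plan.json \
  --state-file contacts-state.json

# Apply: run the same plan for real once the dry run looks right
g-gremlin hubspot merge-apply-plan \
  --plan contacts-plan.json \
  --state-file contacts-state.json \
  --apply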
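
The dry-run-by-default and resumable-state behavior can be pictured with a small Python sketch. This is not how Gremlin stores its state: the `apply_plan` function, the plan shape, the JSON list of completed pairs, and the injected `merge_fn` are all hypothetical, chosen only to show why a state file makes a re-run safe.

```python
import json
import os
import tempfile


def apply_plan(plan, state_path, merge_fn, apply=False):
    """Walk a merge plan, skipping pairs already recorded in the state file.

    With apply=False (the default) nothing is merged or written.
    """
    done = set()
    if os.path.exists(state_path):
        with open(state_path) as fh:
            done = set(json.load(fh))

    actions = []
    for group in plan:
        for dup_id in group["merge_ids"]:
            pair = f'{group["primary_id"]}<-{dup_id}'
            if pair in done:
                continue  # finished in an earlier run
            if not apply:
                actions.append(("would-merge", pair))
                continue
            merge_fn(group["primary_id"], dup_id)
            done.add(pair)
            # Persist after every merge so an interrupted run can resume.
            with open(state_path, "w") as fh:
                json.dump(sorted(done), fh)
            actions.append(("merged", pair))
    return actions


plan = [{"primary_id": "101", "merge_ids": ["202", "203"]}]
calls = []  # records what the stand-in merge function was asked to do
state_path = os.path.join(tempfile.mkdtemp(), "state.json")


def record(primary_id, dup_id):
    calls.append((primary_id, dup_id))


print(apply_plan(plan, state_path, record))              # dry run
print(apply_plan(plan, state_path, record, apply=True))  # performs the merges
print(apply_plan(plan, state_path, record, apply=True))  # nothing left to do
```

Writing state after each merge rather than at the end is the design choice that matters here: if the run dies halfway, the next run picks up at the first pair that was never recorded.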

What to say honestly about the HubSpot story

The HubSpot mirror should keep the audit-first framing, but it should not pretend the shipped HubSpot flow has full parity with Salesforce dedupe. I did not find Salesforce-style blocking-first cluster review queues, receipt files, or a verify command in the public HubSpot dedupe path.

What does ship is useful: exact-key merge planning, keep strategies, dry-run by default, single-pair merge, and resumable merge-plan apply. That is enough to talk about safe cleanup for contact and company duplicates without overselling the surface.

Important caveat

Dry-run is still the CLI default for HubSpot merge apply.
Do not claim Salesforce-style receipt and verify parity for HubSpot unless the product changes.

FAQ

Does HubSpot dedupe contacts natively?

Yes, in the main entry paths. HubSpot uses email to deduplicate contacts in manual creation, form, and import flows. But if you import contacts without Email, HubSpot creates new rows instead of deduplicating them. That is why contact cleanup still needs deliberate review.

Does HubSpot dedupe companies by domain?

Yes in several native flows, but not everywhere. HubSpot can deduplicate companies by Company domain name in manual, form, and import flows. Companies created through API or third-party sync apps are not deduplicated by domain, which is one reason duplicate company cleanup keeps showing up in real portals.

How does Gremlin handle HubSpot dedupe today?

The public HubSpot CLI workflow is exact-key planning and apply, not the blocking-first cluster-review path that Salesforce dedupe uses. You build a merge plan by grouping on a key column or property, choose a primary with a keep strategy, run merge-apply-plan (a dry run by default), and then execute with --apply if the plan looks right.
