Audit duplicates before you merge them in Salesforce
These are the questions careful operators ask before a cleanup run: what exactly is this tool about to merge, why did those records cluster together, and how do I prove later that the run did what I approved? The answer is not "trust the score." The answer is a reviewable audit loop.
Audit checklist
Do not jump from "we found duplicates" to "run the merge."
Audit means reviewing the planned action, not just counting duplicate rows.
The safer workflow is investigate, cluster, review, dry-run, then apply with a receipt.
Start with the blocking evidence
A cluster is only as understandable as the anchors that created it. That is why the planner writes anchors into the cluster payload and review queue.
email: Exact email anchor used to seed high-confidence candidate components.
phone: Exact phone anchor after digit normalization, typically requiring at least seven digits.
domain_name: Normalized company domain plus normalized name, excluding consumer domains, to catch business-email duplicates without a full scan.
company_name: Normalized company plus normalized name, excluding noisy company values, to catch rows that share employer identity but not a clean email anchor.
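To make the four anchors concrete, here is a minimal Python sketch of blocking-key generation. It is not the planner's actual code. The record field names (id, email, name, phone, company), the consumer-domain and noisy-company lists, lowercasing emails before comparison, and taking the company domain from the email address are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Assumptions: short illustrative lists; a real run would maintain longer ones.
CONSUMER_DOMAINS = {"gmail.com", "yahoo.com", "outlook.com", "hotmail.com"}
NOISY_COMPANIES = {"n/a", "none", "unknown", "self", "self employed"}
MIN_PHONE_DIGITS = 7  # "typically requiring at least seven digits"


def norm_text(value):
    """Lowercase and collapse whitespace; returns '' for missing values."""
    return " ".join((value or "").lower().split())


def blocking_keys(record):
    """Return (anchor_type, key) pairs for one record dict (hypothetical shape)."""
    keys = []
    email = norm_text(record.get("email"))
    name = norm_text(record.get("name"))
    if "@" in email:
        keys.append(("email", email))
        # Assumption: the company domain is read from the email address.
        domain = email.rsplit("@", 1)[1]
        if name and domain not in CONSUMER_DOMAINS:
            keys.append(("domain_name", f"{domain}|{name}"))
    # Phone anchor: keep digits only, then require a minimum length.
    digits = re.sub(r"\D", "", record.get("phone") or "")
    if len(digits) >= MIN_PHONE_DIGITS:
        keys.append(("phone", digits))
    company = norm_text(record.get("company"))
    if company and name and company not in NOISY_COMPANIES:
        keys.append(("company_name", f"{company}|{name}"))
    return keys


def candidate_blocks(records):
    """Group record ids by blocking key; keep only keys shared by 2+ records."""
    blocks = defaultdict(list)
    for rec in records:
        for key in blocking_keys(rec):
            blocks[key].append(rec["id"])
    return {key: ids for key, ids in blocks.items() if len(ids) > 1}
```

Because records are only compared when they share a key, candidate discovery avoids a full pairwise scan, and every resulting block carries the anchor that explains why its records were grouped.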
Then inspect the review queue
The review queue is where the audit becomes operational. These are the fields that actually decide whether a cluster should run.
Cluster key: Stable key used in review queues, apply operations, skips, and receipts.
cluster_type: same_object, lead_to_contact, or cross_object.
recommended_action: merge or lead_to_contact_resolution, based on the object mix inside the cluster.
anchors: The blocking evidence that pulled the records together in the first place.
merge_confidence: Heuristic confidence derived from anchors such as email, phone, domain_name, or company_name.
review_notes: Human-readable cautions such as shared inbox risk, missing exact anchors, large clusters, or mixed Lead/Contact review.
approval_status: approved, hold, rejected, or pending once the operator fills out the queue.
override_master_id: Optional operator override for the survivor record.
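One way to sanity-check a filled-out queue before handing it to the tool is a small triage pass over its rows. The sketch below is illustrative only. The cluster_id column name is hypothetical, and treating every cross_object cluster as a manual case is an assumption, not documented tool behavior.

```python
import csv


def triage(rows):
    """Split review-queue rows into runnable cluster keys and skip reasons."""
    runnable, skipped = [], {}
    for row in rows:
        key = row["cluster_id"]  # hypothetical column name for the cluster key
        status = (row.get("approval_status") or "").strip().lower()
        if status in ("", "pending"):
            skipped[key] = "approval missing"
        elif status != "approved":  # hold or rejected
            skipped[key] = "not approved"
        elif row.get("cluster_type") == "cross_object":
            # Assumption: cross-object clusters always need manual resolution.
            skipped[key] = "manual cross-object resolution"
        else:
            runnable.append(key)
    return runnable, skipped


def triage_file(path):
    """Read a review-queue CSV from disk and triage its rows."""
    with open(path, newline="", encoding="utf-8") as handle:
        return triage(list(csv.DictReader(handle)))
```

Calling triage_file("review_clusters.csv") returns the cluster keys you expect to run plus a reason for each cluster you expect to be skipped, which you can then compare against what the dry-run reports.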
Dry-run command
Use dry-run to inspect operation counts and skipped clusters before live apply.
g-gremlin sfdc merge-apply-plan \
--plan plan.json \
--approval-file review_clusters.csv \
--receipt-file receipt.json
What you should review from dry-run
The operation mix: merge versus convertLead.
The skipped-cluster reasons: approval missing, not approved, invalid override, or manual cross-object resolution.
The receipt payload: planned count, skipped count, and the exact plan digest you are about to trust.
The org-specific caveats: lead status, conversion, and any process that the merge run could wake up.
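If you keep the dry-run receipt, you can later compare it with the receipt from the live apply to confirm that the run used the plan you reviewed. The sketch below assumes both receipts are JSON objects with plan_digest, planned_count, and skipped_count keys. Those key names are hypothetical, so check the actual receipt file before relying on them.

```python
import json

# Hypothetical key names; adjust to whatever the real receipt contains.
RECEIPT_FIELDS = ("plan_digest", "planned_count", "skipped_count")


def compare_receipts(dry_run, live):
    """Return human-readable mismatches between two receipt dicts."""
    problems = []
    for field in RECEIPT_FIELDS:
        if field not in dry_run or field not in live:
            problems.append(f"{field}: missing from a receipt")
        elif dry_run[field] != live[field]:
            problems.append(
                f"{field}: dry-run {dry_run[field]!r} != live {live[field]!r}"
            )
    return problems


def load_receipt(path):
    """Load a receipt JSON file from disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

An empty list means the digest and counts line up. Any entry is a reason to stop and find out what changed between approval and apply.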
Read next
Salesforce dedupe guide
The audit-first pillar: blocking-first planning, cluster review, supervised merge, receipts, and where native duplicate rules fall short.
Why Salesforce duplicate rules do not work
Rule-based matching, duplicate-record-set limits, and why alert-or-block is not the same as reviewable clustering.
Merge duplicate leads in Salesforce safely
How to preflight converted status, dry-run the plan, and keep lead-status side effects in view.
Fix duplicate accounts in Salesforce
A candid page on account duplicate cleanup, what native rules do, and where the public Gremlin workflow stops today.
Salesforce dedupe playbook
The applied loop: export, enterprise-plan, review queue, dry-run, and approved apply.
Salesforce lead-status audit
Lead dedupe often wakes routing and status logic, so check what Lead Status actually controls before live merges.
FAQ
Why audit duplicates before merging?
Because the duplicate itself is not the only risk. The merge can change ownership views, lead conversion history, routing assumptions, or downstream reports. An audit-first pass lets you inspect what the tool plans to do before you mutate the org.
What should I look at in the review queue?
Start with cluster_type, recommended_action, anchors, merge_confidence, review_notes, approval_status, and override_master_id. Those fields tell you why the cluster exists, what the planner wants to do, and whether a human has approved that action.
What should the dry-run tell me?
The dry-run should tell you how many operations would run, what the operation mix looks like, how many clusters are skipped, and whether special cases such as manual cross-object resolution still need attention. If you write a receipt file on dry-run, you also get an artifact you can review before live apply.