
Audit duplicates before you merge them in Salesforce

This is the question careful operators ask before a cleanup run: what exactly is this tool about to merge, why did those records cluster together, and how do I prove later that the run did what I approved? The answer is not "trust the score." The answer is a reviewable audit loop.

Audit checklist

Did preflight confirm the org can execute the planned operation type?
What anchors caused the records to cluster together?
Does the review queue agree with the suggested survivor, or does the operator need override_master_id?
How many clusters will be skipped because approval is missing, the cluster is not approved, or the cluster is still cross-object?
What will the receipt file capture if you run dry-run now?

In plain terms

Do not jump from "we found duplicates" to "run the merge."

Audit means reviewing the planned action, not just counting duplicate rows.

The safer workflow is investigate, cluster, review, dry-run, then apply with a receipt.

Start with the blocking evidence

A cluster is only as understandable as the anchors that created it. That is why the planner writes anchors into the cluster payload and review queue.

email

Exact email anchor used to seed high-confidence candidate components.

phone

Exact phone anchor after digit normalization, typically requiring at least seven digits.

domain_name

Normalized company domain plus normalized name, excluding consumer domains, to catch business-email duplicates without a full scan.

company_name

Normalized company plus normalized name, excluding noisy company values, to catch rows that share employer identity but not a clean email anchor.
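To make the four anchor types concrete, here is a minimal sketch of how such keys could be built. It is not the planner's actual code: the function names, the key format, the consumer-domain list, and the noisy-company list are all assumptions for illustration. Only the ideas (exact email, digit-normalized phone with a seven-digit floor, domain plus name, company plus name) come from the descriptions above.

```python
import re

# Assumptions: short illustrative lists, not the tool's real exclusion lists.
CONSUMER_DOMAINS = {"gmail.com", "yahoo.com", "outlook.com", "hotmail.com"}
NOISY_COMPANIES = {"", "n/a", "none", "unknown", "self"}
MIN_PHONE_DIGITS = 7  # "typically requiring at least seven digits"


def normalize_text(value):
    """Lowercase and collapse runs of whitespace."""
    return " ".join(value.lower().split())


def email_anchor(email):
    """Exact email anchor, after trimming and lowercasing."""
    email = email.strip().lower()
    return f"email:{email}" if "@" in email else None


def phone_anchor(phone):
    """Exact phone anchor on digits only; too-short numbers yield no anchor."""
    digits = re.sub(r"\D", "", phone)
    return f"phone:{digits}" if len(digits) >= MIN_PHONE_DIGITS else None


def domain_name_anchor(email, name):
    """Company domain plus name, skipping consumer mail domains."""
    email = email.strip().lower()
    if "@" not in email:
        return None
    domain = email.rsplit("@", 1)[1]
    name = normalize_text(name)
    if not domain or domain in CONSUMER_DOMAINS or not name:
        return None
    return f"domain_name:{domain}|{name}"


def company_name_anchor(company, name):
    """Company plus name, skipping noisy company values."""
    company = normalize_text(company)
    name = normalize_text(name)
    if company in NOISY_COMPANIES or not name:
        return None
    return f"company_name:{company}|{name}"
```

Two records that produce the same key for any anchor type become candidates for the same cluster. When you audit a cluster, ask which of these keys the records actually share.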

Then inspect the review queue

The review queue is where the audit becomes operational. These are the fields that actually decide whether a cluster should run.

cluster_id

Stable cluster key used in review queues, apply operations, skips, and receipts.

cluster_type

same_object, lead_to_contact, or cross_object.

recommended_action

merge or lead_to_contact_resolution, depending on the object mix inside the cluster.

anchors

The blocking evidence that pulled the records together in the first place.

merge_confidence

Heuristic confidence derived from anchors such as email, phone, domain_name, or company_name.

review_notes

Human-readable cautions such as shared inbox risk, missing exact anchors, large clusters, or mixed Lead/Contact review.

approval_status

approved, hold, rejected, or pending once the operator fills out the queue.

override_master_id

Optional operator override for the survivor record.
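Before you run anything, you can tally how many clusters in the queue look likely to be skipped. The sketch below assumes the review queue is a CSV whose column names match the fields above. The reason labels and the order of the checks are my assumptions, not the tool's documented behavior. Validating override_master_id is left out because it needs the cluster's member IDs, which this sketch does not have.

```python
import csv
import io
from collections import Counter


def skip_reason(row):
    """Return a likely skip reason for one queue row, or None if it looks runnable."""
    status = (row.get("approval_status") or "").strip().lower()
    if not status:
        return "approval_missing"
    if status != "approved":
        return "not_approved"  # hold, rejected, or pending
    if (row.get("cluster_type") or "").strip() == "cross_object":
        return "manual_cross_object_resolution"
    return None


def tally_skips(csv_text):
    """Count likely skip reasons across every row of a review-queue CSV."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        reason = skip_reason(row)
        if reason is not None:
            counts[reason] += 1
    return counts


# Hypothetical sample queue for illustration only.
SAMPLE_QUEUE = """cluster_id,cluster_type,approval_status,override_master_id
c1,same_object,approved,
c2,same_object,,
c3,lead_to_contact,hold,
c4,cross_object,approved,
"""

if __name__ == "__main__":
    for reason, count in sorted(tally_skips(SAMPLE_QUEUE).items()):
        print(f"{reason}: {count}")
```

If the tally surprises you, fix the queue before the dry-run instead of discovering the skips afterward.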

Dry-run command

Use dry-run to inspect operation counts and skipped clusters before live apply.

g-gremlin sfdc merge-apply-plan \
  --plan plan.json \
  --approval-file review_clusters.csv \
  --receipt-file receipt.json

What you should review from dry-run

The operation mix: merge versus convertLead.

The skipped-cluster reasons: approval missing, not approved, invalid override, or manual cross-object resolution.

The receipt payload: planned count, skipped count, and the exact plan digest you are about to trust.

The org-specific caveats: lead status, lead conversion, and any automation that the merge run could trigger.
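One way to make the receipt review mechanical is to compare the receipt you reviewed with any later receipt and flag what changed. The key names below (plan_digest, planned_count, skipped_count) are hypothetical. The real receipt schema is not documented in this guide, so check your own receipt.json and adjust the keys.

```python
import json

# Assumption: hypothetical key names; confirm them against your actual receipt file.
RECEIPT_KEYS = ("plan_digest", "planned_count", "skipped_count")


def receipt_mismatches(reviewed, later, keys=RECEIPT_KEYS):
    """Return the keys whose values differ between two receipt payloads."""
    return [key for key in keys if reviewed.get(key) != later.get(key)]


if __name__ == "__main__":
    # Inline payloads stand in for two receipt files loaded with json.load().
    reviewed = json.loads('{"plan_digest": "abc", "planned_count": 12, "skipped_count": 3}')
    later = json.loads('{"plan_digest": "abd", "planned_count": 12, "skipped_count": 3}')
    print(receipt_mismatches(reviewed, later))
```

A changed digest means the plan is no longer the one you approved, so review it again before live apply.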

FAQ

Why audit duplicates before merging?

Because the duplicate itself is not the only risk. The merge can change ownership views, lead conversion history, routing assumptions, or downstream reports. An audit-first pass lets you inspect what the tool plans to do before you mutate the org.

What should I look at in the review queue?

Start with cluster_type, recommended_action, anchors, merge_confidence, review_notes, approval_status, and override_master_id. Those fields tell you why the cluster exists, what the planner wants to do, and whether a human has approved that action.

What should the dry-run tell me?

The dry-run should tell you how many operations would run, what the operation mix looks like, how many clusters are skipped, and whether special cases such as manual cross-object resolution still need attention. If you write a receipt file on dry-run, you also get an artifact you can review before live apply.
