Get started

Private Equity

Salesforce

DataGroomr

Deduplication of 1.2M Accounts in Salesforce

Context

To build a global database of potential acquisition targets, the client, a private equity firm, developed a high-volume “account sourcing engine” in Salesforce. Sourcing and enrichment tools such as SourceScrub, Grata, 6sense, RocketReach, and Apollo were integrated to ensure no relevant company was missed.

This approach, however, led to data overload. The same entity could appear up to 20 times because of slight differences in naming or domains, or because hierarchy information was missing. Subsidiaries were often logged as standalone parent accounts.

The consequences were significant: rep conflicts, double outreach, conflicting intel, and wasted resources. The duplicates also blocked opportunities to automate business processes. As trust in the CRM eroded, the system was no longer considered reliable.

Impact

The full solution was implemented and deployed to production in under two months, with automated logic for both retroactive deduplication and proactive duplicate prevention.

The result was a clean, trusted CRM foundation that improved rep confidence, reduced redundant outreach, and restored the reliability of downstream reporting and automation.

The goal was to merge duplicate records without losing valuable or current account data, while also resolving ownership conflicts between duplicate accounts assigned to different Account Executives (AEs). The deduplication process followed four key steps:

1. Generation

A unique primary key (PK) was generated based on normalized website URLs to match duplicate accounts reliably.
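As a rough illustration of this step, the sketch below reduces a website URL to a bare lowercase host that can serve as a matching key. It is written in Python for readability and is not the project's production logic; the function name `website_key` and the exact normalization rules (dropping scheme, port, path, and a leading `www.`) are assumptions.

```python
from urllib.parse import urlsplit


def website_key(url: str) -> str:
    """Reduce a website URL to a bare lowercase host for matching (assumed rules)."""
    raw = url.strip().lower()
    if not raw:
        return ""
    # urlsplit only recognizes the host when a scheme separator is present,
    # so prefix scheme-less input such as "acme.com/about" with "//".
    if "://" not in raw:
        raw = "//" + raw
    host = urlsplit(raw).hostname or ""
    # Assumption: only a leading "www." is dropped; other subdomains are kept.
    if host.startswith("www."):
        host = host[4:]
    return host.rstrip(".")
```

Under these assumed rules, `https://www.Acme.com/about` and `acme.com` produce the same key, while a subdomain such as `uk.acme.com` keeps its own key.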

2. Identification & Blocking

Proactive logic was added to flag duplicates during record creation (before save), while still allowing distinct subsidiaries and similar entities to be created separately.
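The before-save check can be pictured along the following lines. This is a minimal Python sketch rather than the actual Salesforce automation; the field names `website_key` and `parent_id`, and the rule that a record linked to the matched account as its parent counts as a legitimate subsidiary, are hypothetical stand-ins for whatever criteria the real implementation used.

```python
from typing import Optional


def find_blocking_duplicate(record: dict, existing_by_key: dict[str, str]) -> Optional[str]:
    """Return the id of the account that should block this insert, or None."""
    key = record.get("website_key", "")
    if not key:
        return None  # nothing to match on, so the record is allowed through
    match_id = existing_by_key.get(key)
    if match_id is None:
        return None  # no existing account shares this key
    # Hypothetical exemption: a record explicitly parented to the matched
    # account is treated as a distinct subsidiary, not a duplicate.
    if record.get("parent_id") == match_id:
        return None
    return match_id
```

A caller would reject or flag the new record whenever the function returns an id, and save it normally when it returns `None`.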

3. Scoring

A custom scoring algorithm evaluated record quality by weighing field completeness, data freshness, and recent AE activity, determining a “winning” record in each cluster.
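A weighted score of this kind might look like the sketch below. The weights, the tracked fields, the one-year decay horizon, and the record field names are all illustrative assumptions; the source does not state the values or formula actually used.

```python
from datetime import date

# All constants below are hypothetical, chosen only to illustrate the idea.
WEIGHTS = {"completeness": 0.4, "freshness": 0.3, "activity": 0.3}
TRACKED_FIELDS = ("name", "website", "industry", "employees")
HORIZON_DAYS = 365


def recency(day, today):
    """1.0 for today, decaying linearly to 0.0 at HORIZON_DAYS; 0.0 if unknown."""
    if day is None:
        return 0.0
    age = (today - day).days
    return min(1.0, max(0.0, 1.0 - age / HORIZON_DAYS))


def score(record, today):
    """Combine field completeness, data freshness, and recent AE activity."""
    filled = sum(1 for f in TRACKED_FIELDS if record.get(f))
    completeness = filled / len(TRACKED_FIELDS)
    return (WEIGHTS["completeness"] * completeness
            + WEIGHTS["freshness"] * recency(record.get("last_modified"), today)
            + WEIGHTS["activity"] * recency(record.get("last_ae_activity"), today))


def pick_winner(cluster, today):
    """Return the highest-scoring record in a cluster of duplicates."""
    return max(cluster, key=lambda r: score(r, today))
```

Passing `today` explicitly keeps the score reproducible, which makes it easier to audit why a given record won its cluster.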

4. Merging

Winning records were enriched with the most accurate values from the group, preserving ownership and consolidating key account details.
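The merge step can be sketched as follows, again in Python and under stated assumptions. Here "most accurate value" is approximated as "the first non-empty value when records are ordered best-first", which is a simplification and not necessarily how accuracy was judged in the project; the field names `id` and `owner_id` are hypothetical.

```python
# Hypothetical fields that always stay with the winning record.
PROTECTED_FIELDS = ("id", "owner_id")


def merge_cluster(ranked):
    """Merge a cluster given best-first; the first record is the winner."""
    winner, *losers = ranked
    merged = dict(winner)  # copy so the input records are not mutated
    for loser in losers:
        for field, value in loser.items():
            if field in PROTECTED_FIELDS:
                continue  # preserve the winner's identity and ownership
            # Fill only gaps: the winner's own non-empty values are never overwritten.
            if merged.get(field) in (None, "") and value not in (None, ""):
                merged[field] = value
    return merged
```

Because only blank fields are filled, the winner's existing values and owner survive the merge, and lower-ranked records contribute only data the winner was missing.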

The Solution

Deduplication Process