This is a very common problem we see with a lot of businesses: duplicate records in Salesforce. Duplicate leads, duplicate contacts, duplicate accounts or companies, or duplicates of other records. I find the best way to deal with duplicates is to identify all your data sources and bring them into a single source, which may be Salesforce. You then maintain a clean database by first de-duplicating all the existing records and then setting up a program to block duplicates going forward. If you can’t block duplicates one at a time as they are entered, for example because you are importing lists of data, you can set up automatic de-duplication scenarios that run in the background.
I have tested and reviewed several de-duplicating apps for Salesforce. My favorite is Demand Tools from CRM Fusion. They also have an awesome dupe blocking program called Dupe Blocker.
De-duplicating Salesforce records manually, or by sorting alphabetically, is usually not good enough. What if a company name is entered with “Co.”, “Company” or “Inc”, or without the “The” in front of the name? What if the full name and the address match but the phone does not? That is why de-duping your Salesforce records with a program that has sophisticated algorithms for finding and merging duplicates is far more efficient and effective.
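To make the company name problem concrete, here is a minimal Python sketch of the kind of normalization a matching step might apply before comparing names. It is only an illustration and not how Demand Tools works internally; the function name `normalize_company_name` and the suffix list are my own assumptions.

```python
import re

# Hypothetical suffix list for illustration; extend it for your own data.
LEGAL_SUFFIXES = {"co", "company", "inc", "incorporated", "corp", "corporation", "llc", "ltd"}

def normalize_company_name(name: str) -> str:
    """Return a simplified key for comparing company names."""
    # Lowercase, then replace punctuation with spaces.
    cleaned = re.sub(r"[^a-z0-9\s]", " ", name.lower())
    words = cleaned.split()
    # Drop a leading "the".
    if words and words[0] == "the":
        words = words[1:]
    # Drop trailing legal suffixes such as "inc" or "co", keeping at least one word.
    while len(words) > 1 and words[-1] in LEGAL_SUFFIXES:
        words.pop()
    return " ".join(words)
```

With this sketch, "The Acme Co." and "Acme Company, Inc" both reduce to the same key, so an exact comparison on the key catches a pair that an alphabetical sort would place far apart.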
When you dedupe your Salesforce data with a program like Demand Tools, you dedupe one object at a time, starting from the top master object. In Salesforce I usually start with Accounts, then work my way down to Contacts, then Leads, then Leads to Accounts, then Leads to Contacts. That is the best order for a standard dedupe. You need to consider the data you’re deduping in Salesforce to build the strategy, and although business data has a lot of similarities, every database is different. There are usually several sources of data, and the business knows its data best, so we like to ask a lot of questions before starting a project like this. The goal is to understand the data and why there are duplicates, so we can figure out the correct method of deduping it. Some data we want to keep as the master data. There is also the logic to consider when merging two duplicate records: which record is the master record, and which record is the most important? It might depend on the source of the data, which record was modified most recently, who owns the record, and so on.
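As a rough illustration of master record logic, the Python sketch below picks a master from a group of duplicates by preferring a trusted data source first and the most recently modified record second. The source names, their ranking, and the `source` and `last_modified` keys are hypothetical; the right rules depend on what the business tells you about its data.

```python
from datetime import datetime

# Hypothetical ranking of data sources; a lower number means more trusted.
SOURCE_PRIORITY = {"ERP": 0, "Web Form": 1, "List Import": 2}

def pick_master(records):
    """Pick the record to keep from a group of duplicate records."""
    def sort_key(rec):
        # Unknown sources rank after every known source.
        source_rank = SOURCE_PRIORITY.get(rec["source"], len(SOURCE_PRIORITY))
        # Negate the timestamp so the most recently modified record sorts first.
        return (source_rank, -rec["last_modified"].timestamp())
    return min(records, key=sort_key)

duplicates = [
    {"id": "A", "source": "List Import", "last_modified": datetime(2024, 5, 1)},
    {"id": "B", "source": "ERP", "last_modified": datetime(2023, 1, 1)},
    {"id": "C", "source": "ERP", "last_modified": datetime(2024, 1, 1)},
]
master = pick_master(duplicates)
```

Here record C becomes the master: it comes from the most trusted source and is newer than B, even though A was modified most recently overall. Swapping the order of the two values in the sort key would put recency first instead.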
For your Salesforce deduping scenarios you want to start with a “Rigid” scenario. This means that the criteria for matching duplicate records are strict. For example, in Accounts you would require that the Account name, the address, the city, the state and the phone all match. With that many fields matching it should be pretty darn safe to say they are duplicate records, and you can pretty much auto merge all of them. Then you go with a “Semi-Rigid” scenario, where perhaps the Account name and the phone match. You might glance through those records to decide whether they need a closer review before auto merging them. Then you do a “Loose” scenario, where maybe just the Account name matches. Usually this requires you to review the potential dupes one at a time to confirm that they really are duplicates. Often this stage leaves some records that a third party who doesn’t know the accounts very well can’t judge either way, and it will take someone intimately familiar with the company data to check and confirm them.
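The three tiers can be sketched in Python roughly as follows. This is a simplified illustration that uses exact, case-insensitive comparison; the field names (`name`, `street`, `city`, `state`, `phone`) are assumptions, and a real tool would apply fuzzier matching than this.

```python
def _same(a, b, field):
    """True when both records have a non-empty, case-insensitively equal value."""
    x = (a.get(field) or "").strip().lower()
    y = (b.get(field) or "").strip().lower()
    return bool(x) and x == y

def match_tier(a, b):
    """Classify a pair of account records as 'rigid', 'semi-rigid', 'loose', or None."""
    if not _same(a, b, "name"):
        return None  # every tier requires at least a name match
    if all(_same(a, b, f) for f in ("street", "city", "state", "phone")):
        return "rigid"  # safe to auto merge
    if _same(a, b, "phone"):
        return "semi-rigid"  # glance through before merging
    return "loose"  # review one at a time

acct1 = {"name": "Acme", "street": "1 Main St", "city": "Austin", "state": "TX", "phone": "555-0100"}
acct2 = {"name": "ACME", "street": "1 Main St", "city": "Austin", "state": "TX", "phone": "555-0100"}
acct3 = {"name": "Acme", "street": "9 Oak Ave", "city": "Dallas", "state": "TX", "phone": "555-0100"}
acct4 = {"name": "Acme", "street": "", "city": "Reno", "state": "NV", "phone": ""}
```

Checking the strictest tier first means each pair lands in the safest category it qualifies for. Empty fields never count as a match, so two records that are both missing a phone number are not treated as having the same phone.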
Another concern when de-duping a lot of records is reaching your API limits. The Demand Tools program works through the Salesforce API, so it consumes API calls. You have a limited number of calls you can make each day, depending on your edition and how many licenses you have. If you have a ton of data and a ton of de-duping to do, you may need to segment your data and work in small batches. Try to think of a way to segment the data that still allows you to find duplicates. For example, you can filter for all Accounts that start with the letters A, B, C or D, dedupe those, then move on. In some cases you may also need to spread the deduping over several days. You may also be able to purchase additional API calls temporarily from Salesforce for this project.
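Here is a small Python sketch of the segmenting idea: group account names by first letter, then pack whole letter groups into daily runs under a record budget. The budget is a stand-in I made up for illustration; how many API calls a given number of records actually consumes depends on the tool and the operation, so treat `max_records_per_day` as an assumption you would calibrate yourself.

```python
from collections import defaultdict

def segment_by_first_letter(names):
    """Group account names by their first character (uppercased)."""
    segments = defaultdict(list)
    for name in names:
        stripped = name.strip()
        key = stripped[0].upper() if stripped else "#"  # "#" collects blank names
        segments[key].append(name)
    return dict(segments)

def plan_days(segments, max_records_per_day):
    """Pack whole segments into daily runs without exceeding the budget.

    A single segment larger than the budget still gets a day of its own.
    """
    days, current, used = [], [], 0
    for key in sorted(segments):
        size = len(segments[key])
        if current and used + size > max_records_per_day:
            days.append(current)
            current, used = [], 0
        current.append(key)
        used += size
    if current:
        days.append(current)
    return days

names = ["Acme", "Apex", "Bolt", "Cargo", "Cedar", "Delta"]
segments = segment_by_first_letter(names)
schedule = plan_days(segments, max_records_per_day=3)
```

With a budget of three records per day, the schedule runs A and B on the first day and C and D on the second. Keeping each letter group whole matters because two duplicates can only be found if they are in the same batch. For the same reason, names that differ in their first word, such as one entered with a leading "The", end up in different segments, so it helps to segment on a cleaned-up version of the name.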
As I said, once your data is clean the best thing to do is to have a process to avoid duplicates in the future. This is where setting up a dupe blocker type program is very helpful. There is a free program called Dupe Catcher which works in the Professional edition of Salesforce. It is a good choice if you have Professional edition, don’t want to spend money, and only need to check records that are created manually in Salesforce for duplicates. Dupe Blocker from CRM Fusion, on the other hand, is a very sophisticated dupe blocker program with more advanced matching algorithms. You can also set up auto merge or auto block rules for records that are created from a Web-to-Lead form, or from any other form or API integration. It doesn’t cost that much and I think it’s worth it if you need it. You can set up all sorts of different scenarios to automatically handle potential duplicates.