Take a Scrub Brush to Your Data!

There is nothing like dirty data to undermine everyone’s confidence in your Salesforce database. Telltale signs:

  • – You find the fundraising and development people with their own individual spreadsheets, hoarding their data like 10-year-olds after the annual Hallowe’en night of Trick or Treating.
  • – Your CEO asks for a specific report and it takes more than an hour for the department to produce it.
  • – The monthly board meeting is coming up and everyone takes the afternoon off to get the reports generated. Even worse, nobody on the board actually READS the reports.

Running an organization without complete and accurate data is like driving through a blinding snowstorm or a downpour of rain. Nobody can truly see where the business is going!

Missing information is the most likely culprit, 78% of nonprofit staff members report. Next, 67% are plagued with general typos, misspellings, incorrect IDs, and wrong dates. And, it’s often difficult to see who’s making the mistakes, so no one is learning from their mistakes.
You may be completely familiar with creating reports to show your positive results. Did we serve more homeless people this year than we did last year? How much did the percentage of women served go up? How much did the annual gala bring in and who were our major donors?
Tackle the dirty data with NEGATIVE reports. Look for the mistakes, missing info, and bad dates and you can figure out how to fix the underlying problem. Look for reports by types of errors—then you can more easily clean them up.
Judy, our executive director, wants to send a Happy Birthday to all the students in our database. She knows that keeping in touch with engaged individuals is what will make our job a success. She wants to see the folks getting birthday greetings next month.
Just eyeballing the data convinces us we have a dirty data issue. Let’s take a scrub brush to the birthday data. Both missing dates and clearly inaccurate dates are the black spots.

Missing or inaccurate data

We set up an Apsona report that looks like this:

  1. We add the Salesforce Contact ID to take us directly back to the Contact record if we need to check up on something.
  2. With Apsona, we can easier see both the birthdate (and quickly find those that are missing) AND just the year, to check for those that are clearly too old to represent clients.
  3. We also have the record owner (usually the person who created the record) and the name of the user who last modified the record.
  4. Note that we can change the filters from the report screen. Find all the records missing data, find all the dates outside of the plausible range. In our case we want to question anybody over 65 or under 18.

Notice all the choices we get when filtering on a date field!

Here are the filters to catch ALL the birthdate errors.

We will probably want to split these to have a bad date report and a no date report. But let’s see what we’ve got first. There are 103 date errors ranging from folks born in 2788 to a several who are supposedly in their 90s. Worse, though, are the empty date fields—2266 of them.
The next task will be to find out which of the Organization staffers need some extra training about adding these birthdates. We created a Visualization on the completed report that Groups by Owner First Name, and shows the number of records in the group. No need to set up a formula to get the record count.

Let’s find out which staff members are entering improbable dates. That is perhaps a different training. First I’ll clone our Birthday Report. In the report list, click the report, then click “Clone.” Then I rename the cloned report to something meaningful.

This time I’ve set the filters to show only the records outside of our normal range of students, and I don’t want to see the empty ones.

The same five users with the highest blank birthdates also have the highest date mistake count. We’ll export these reports and send them to the staff, to correct the inaccurate dates and to reach out to the students to get the missing birthdates.
Set up reports on the accuracy and completeness of every field on every record where the data is critical to your mission.

Duplicates in the System

The best way to deal with duplicates, is not to import them in the first place. “Hit that lookup button to SEARCH FIRST” should be the watchword of every staffer. But staffers do get busy and the occasional dupe makes it in. Apsona will reject importing or updating records if there are duplicates in the system.
Create a dupe-catching report: Add the fields that you think will distinguish one contact (account, opportunity, etc.) from another. For Contact, Full Name (which is a blessing, don’t you think? Not having to put FirstName and a separate LastName), Contact ID, Mailing City, Created Date and Owner. I’m just using First Names there to protect the innocent!
Using Excel—the details of this part are outside the scope of this blog entry—you can find and highlight all the dupes, find the owners who put the records into the system, and provide them with that spreadsheet to merge records, if necessary.

Set up a System to Deal with Data Quality Issues

  1. Identify the data quality issue.
  2. Is training the solution? Do you need staff-wide training or individual training?
  3. Is the database set up to be user friendly? Is staff entering data points only once? Can you automate some entry so staff has fewer fields to fill out?

Monitor Data Quality regularly with your Apsona reports. Don’t wait to start cleaning your data. Some fancy brushwork now can save some tedious scrubbing on your hands-and-knees further down the road!
Soon you’ll be seeing the benefits – saving time correcting errors, increasing overall accuracy, and improving the core confidence all the staffers, from the executive director right down to the data entry intern, have in your Salesforce system. Thanks, Apsona!

Thanks to TaroWorks.org for the basic principles expressed here. Their 2015 webinar on data quality inspired me to ask what Apsona could do to help.