Identifying duplicates in a transaction file - The simplest approach

In this post we share a short video on the steps to identify duplicates in a file.

There are many reasons to identify duplicates in a transaction file. If the file is a dataset of payments made to vendors, finding the duplicates can help your business recoup costs. The duplicates could also be multiple entries of the same patient data, or repeated records in security logs. Whatever the reason, data analytics work benefits from being able to manage duplicates effectively.

Excel has a duplicates function, but it works on the opposite principle to the approach used by Arbutus Analyzer. In Excel, running the duplicates function removes the duplicate transactions from the dataset, with no log of what was removed. In Arbutus Analyzer, the duplicates function identifies the duplicate transactions and reports them for you.
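To make the two principles concrete, here is a minimal Python sketch. It is not how Excel or Arbutus Analyzer is implemented; the `payments` data and both function names are made up for illustration. The first function drops repeated records without saying what it dropped, while the second leaves the data alone and reports which rows repeat.

```python
from collections import defaultdict

# Hypothetical payment records: (vendor, invoice number, amount).
payments = [
    ("Acme", "INV-001", 250.00),
    ("Bolt", "INV-014", 99.50),
    ("Acme", "INV-001", 250.00),
    ("Cord", "INV-027", 1200.00),
]


def remove_duplicates(records):
    """Removal principle: keep the first copy and silently drop the rest."""
    seen = set()
    kept = []
    for record in records:
        if record not in seen:
            seen.add(record)
            kept.append(record)
    return kept


def report_duplicates(records):
    """Reporting principle: list each repeated record with its row numbers."""
    positions = defaultdict(list)
    for row_number, record in enumerate(records, start=1):
        positions[record].append(row_number)
    return {rec: rows for rec, rows in positions.items() if len(rows) > 1}


deduplicated = remove_duplicates(payments)
duplicate_report = report_duplicates(payments)

print(len(deduplicated), "records remain after removal")
for record, row_numbers in duplicate_report.items():
    print(record, "appears on rows", row_numbers)
```

With removal you are left with a shorter list and no trace of the second Acme payment. With reporting you still have all four records, plus the row numbers you need to follow up.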

Let's see how it is done with this short video:


The steps shown in the video are:

  1. Import your file

  2. Select the "Analyze" menu

  3. Select the "Duplicates" menu item

  4. Select the fields you want to test

  5. Click "OK"

  6. Review the results in the "Command Log"

  7. Click on the hyperlinks in the "Command Log" to see the selected transactions

So simple and so fast.
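If you want to try the same kind of test outside Analyzer, the steps above can be roughly sketched in Python using only the standard library. This is an illustration under assumptions, not Analyzer's own logic: the sample data, the column names and the `find_duplicates` helper are all hypothetical.

```python
import csv
import io
from collections import defaultdict

# Hypothetical transaction file contents; in practice, open a real CSV file.
SAMPLE_CSV = """vendor,invoice,amount,paid_on
Acme,INV-001,250.00,2020-01-05
Bolt,INV-014,99.50,2020-01-07
Acme,INV-001,250.00,2020-02-05
Cord,INV-027,1200.00,2020-02-11
"""


def find_duplicates(rows, key_fields):
    """Group rows by the chosen fields; keep groups with more than one row."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[field] for field in key_fields)
        groups[key].append(row)
    return {key: group for key, group in groups.items() if len(group) > 1}


# Import the file.
rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

# Select the fields to test, then run the duplicates check.
duplicates = find_duplicates(rows, ["vendor", "invoice", "amount"])

# Review the results and drill down to the matching transactions.
for key, group in duplicates.items():
    print(key, "appears", len(group), "times")
    for row in group:
        print("    paid on", row["paid_on"])
```

The choice of fields matters. Testing on vendor, invoice and amount flags the two Acme payments, but adding `paid_on` to the key would report nothing, because the payment dates differ.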

© 2020 by Survey Design and Analysis Services. 
