Stratifying Data in Arbutus Analyzer

The ability to group data based on character, numeric or date ranges is highly useful in a range of contexts. Auditors and those interested in audit analytics require this ability, and Arbutus Analyzer has several grouping commands (classify, stratify, age, summarize, and cross tabulate) to help users achieve the desired results.

Arbutus Analyzer’s grouping commands allow users to generate graphs or tables that show how the data is distributed across defined intervals.

The grouping commands are listed below:

  • Classify (use on one character field)

  • Stratify (use on one numeric field)

  • Age (use on one datetime field)

  • Summarize (use on one or more character, numeric, and/or datetime fields)

  • Cross Tabulate (use on one or more character rows and one character column)

In the following example, I stratify the data to look at the distribution of invoices in my data file by invoice price. To do this, I use the Stratify command, which examines the “Price” field and generates a histogram of the results. The steps to perform this stratification are:

  1. Open the data file with which you are working

  2. Select Analyze > Stratify

  3. Select the field you wish to apply the command to. Note: the field must be numeric. For character or datetime fields, please select another of the appropriate grouping commands listed above.

  4. Under the interval options, choose between even intervals and specified (custom) intervals, then enter the number of even intervals or the custom interval boundaries.

  5. Select More to choose between output options (Screen, Data, Graph)

  6. Click OK.
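To make the "specified intervals" option in step 4 more concrete, here is a minimal Python sketch of the general idea: each value is assigned to the interval whose boundaries contain it, and a count and total are kept per interval. This is not Arbutus Analyzer's implementation or scripting syntax. The function name `stratify_custom`, the boundary values, and the sample prices are all hypothetical, and the sketch assumes each interval includes its lower boundary and excludes its upper one.

```python
from bisect import bisect_right


def stratify_custom(values, boundaries):
    """Group numeric values into the intervals [b0, b1), [b1, b2), ...

    `boundaries` must be sorted in ascending order. Values that fall
    outside the first and last boundary are counted separately.
    """
    strata = [
        {"low": low, "high": high, "count": 0, "total": 0.0}
        for low, high in zip(boundaries, boundaries[1:])
    ]
    out_of_range = 0
    for value in values:
        if value < boundaries[0] or value >= boundaries[-1]:
            out_of_range += 1
            continue
        # bisect_right finds the first boundary greater than the value,
        # so the interval index is one position to the left of it.
        index = bisect_right(boundaries, value) - 1
        strata[index]["count"] += 1
        strata[index]["total"] += value
    return strata, out_of_range


# Hypothetical invoice prices and interval boundaries, for illustration only.
prices = [5.0, 12.5, 49.99, 50.0, 120.0, 999.0]
strata, skipped = stratify_custom(prices, [0, 50, 100, 500])
for row in strata:
    print(row["low"], row["high"], row["count"], round(row["total"], 2))
print("outside the specified intervals:", skipped)
```

Custom boundaries are useful when the thresholds carry meaning of their own, such as approval limits, rather than being derived from the range of the data.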

In this example, I applied the command to the Price field of my data set, selected even intervals with a value of 10 (to split the amounts into 10 even intervals), and chose Graph as the output option to generate a histogram.
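For readers who want to see the arithmetic behind even intervals, the following Python sketch splits a numeric field into 10 equal-width intervals and reports the count, total, and percentage shares for each. It is an illustration only, not how Arbutus Analyzer computes its results. The function name `stratify_even` and the sample prices are hypothetical, and the sketch assumes the range runs from the smallest to the largest value in the data, with the largest value placed in the last interval.

```python
def stratify_even(values, intervals=10):
    """Split values into `intervals` equal-width strata between min and max."""
    if not values:
        raise ValueError("values must not be empty")
    low, high = min(values), max(values)
    width = (high - low) / intervals
    rows = [
        {"low": low + i * width, "high": low + (i + 1) * width,
         "count": 0, "total": 0.0}
        for i in range(intervals)
    ]
    for value in values:
        if width == 0:
            index = 0  # every value is identical, so use the first stratum
        else:
            # Clamp so the maximum value lands in the last stratum.
            index = min(int((value - low) / width), intervals - 1)
        rows[index]["count"] += 1
        rows[index]["total"] += value
    grand_total = sum(row["total"] for row in rows)
    for row in rows:
        row["pct_count"] = 100.0 * row["count"] / len(values)
        row["pct_total"] = 100.0 * row["total"] / grand_total if grand_total else 0.0
    return rows


# Hypothetical invoice prices, for illustration only.
prices = [0.01, 4.50, 12.00, 18.75, 33.20, 47.10, 51.00, 260.00, 389.99, 510.00]
for row in stratify_even(prices, intervals=10):
    print(f"{row['low']:8.2f} to {row['high']:8.2f}  "
          f"count={row['count']:2d}  total={row['total']:8.2f}  "
          f"%count={row['pct_count']:5.1f}  %value={row['pct_total']:5.1f}")
```

Comparing the share of records with the share of value in each interval is what makes a stratification useful: an interval can hold most of the invoices while contributing only a small part of the total value, or the reverse.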

The results are shown below.

These results show that the vast majority of invoices in the data file were for payments between 0.01 and 51.00 dollars. They also show that most of the value in the data file comes from payments in three different intervals.


© 2020 by Survey Design and Analysis Services. 
