Comparing Arrays

Comparing items is one of the main tenets of Excel. Recently, Excel released Dynamic Array and Spill Range functionality to most Excel 365 users. Dynamic Arrays are incredibly powerful and could fundamentally change how we use Excel. However, array functionality has been around for a long time. This article will explore how you can use arrays to return differences between data sets.

As part of a recent Consulting project, our client wanted to identify and report which fields had changed between versions of the data. The reporting requirement raised challenges because it meant we couldn’t use conditional formatting.

Instead, we used a series of advanced Excel formulas to generate a list of the fields which changed for each data record. 

=TEXTJOIN([delimiter],TRUE,IF([Range1]<>[Range2],[Output],””)

 

The formula can be broken down into a complex array IF function wrapped in a standard TEXTJOIN function.

Using arrays in the “logical test” portion of the IF function means that the function will produce an array. The IF function evaluates each set of data in turn and returns the results in an array four columns wide. If the entries do not have the same value, they return the field label, otherwise it returns a blank. 

 

The IF function is wrapped in a TEXTJOIN function so that the result is contained within a single cell. It is important for the “IF false” field to be a blank so that the TEXTJOIN function can skip over fields which are the same.

The data does need to be organized in a certain format:

  • All of the comparison and output ranges need to have the same dimensions (x rows by y columns)
  • Columns need to be in the same order

Unfortunately, this formula is not compatible with Dynamic Ranges because the TEXTJOIN function cannot generate a spill range. A similar result can be achieved using the MMULT function, but that is another level of complexity. 

The above example is the most basic implementation of this formula. However, it is possible to augment the formula to make it even more powerful. This could include:

    • Outputs can be more than just one text string. You can create more complex Outputs by using text concatenation.

 

  • The comparison ranges don’t need to be fixed ranges. You can use the INDEX function to apply criteria and return an array.
  • This formula construction is compatible with Excel Tables. The fixed cell references can be replaced with Excel tables structured references.
  • The comparison and output ranges do not need to be contiguous ranges. You can use the INDEX function and a field Array to return a new array which is a subset of the original array.

=TEXTJOIN([delimiter],TRUE,IF([INDEX(Range1],{Fields})<>INDEX([Range2],{Fields}),
INDEX([Output],{Fields},””)

 

If you want to learn more about how to implement this and other advanced Excel topics into your models, please check out our Resources page.

Written by Brendan Stephenson, Director at The Marquee Group.