# Use Automated Cross Correlations in Excel to Find Leading Indicators—Part 1

 by Charley Kyd, MBAMicrosoft Excel MVP, 2005-2014 The Father of Spreadsheet Dashboard Reports

It all seems so simple...

To improve your forecasts of sales or other measures, you simply need to find leading indicators...measures that are highly correlated with your key measures, but with a time lag. And then you use those leading indicators as the basis of your forecast.

But when you try to make it all work, that simple idea can become a huge challenge.

Suppose, for example, that the blue Data1 line in this chart shows what you spend in advertising. And suppose that the red Data2 line shows the amount of your sales.

At first glance, it looks like you REALLY need to change your ad strategy, because...

If you want to be more precise in your analysis, you could use Excel's CORREL function to learn that Data1 and Data2 have a correlation coefficient of -.50. That is, as the chart illustrates, your advertising and sales values are negatively correlated to a significant degree.

However, that's not the end of the story.

It could be that what you spend on advertising is a leading indicator of what your sales will be several months later. This figure shows the analysis from that perspective...

Here, the table shows the correlations associated with eleven different time shifts. The version with the highest correlation has a shift of +2 months. That is, the correlation has been calculated with sales (Data2) shifted two months ahead of ad spending (Data1).

That is, ad spending could be a good leading indicator of sales performance two months later.

However, that interpretation isn't the only one you could make, as this figure illustrates:

Here, we see that sales performance could be a leading indicator of ad spending three months later. This might be because when sales rise or fall, the Marketing Department decides to spend more or less on advertising.

Both interpretations are supported by high correlations. But the reality is that you must properly interpret what analyses like this tell you.

Even so, the analysis definitely can give you additional facts on which to base your decisions. So, with that warning, let's set up the analysis.

### Cross Correlation Workbook

My workbook contains two relevant worksheets: Data and Report.

This figure shows the Data worksheet.

The Date, Data1, and Data2 columns contain the values shown.

The DateText column contains formulas that return text to be displayed in the chart. Here's the first formula, for the cell shown:

E5:   =TEXT(B5,"mmm")&CHAR(13)&"'"&TEXT(B5,"yy")

The CHAR(13) section of this formula returns the carriage-return character. Excel doesn't show this character in column E. But when the text is displayed in the chart, this character causes the year text to wrap to a second line below the month text.

The NumRows cell returns the number of rows in the table. It does so because it uses the COUNT function, which counts only numbers. We'll reference this value in several dynamic range names in this workbook.

Here's the formula for that cell:

C1:   =COUNT(B:B)

To name this cell, select the range B1:C1 then press Ctrl+Shift+F3 or choose Formulas, Defined Names, Create From Selection to launch the Create Names dialog. Make sure that only Left Column is checked; then choose OK.

Also, in the Report worksheet, set up the two cells shown here, then use the Create Names dialog to assign the name Shift to cell B1. (When I set up a cell with a setting, as with the Shift cell, I often give it a yellow fill.)

Set Up the Dynamic Range Names

The key to automating the cross-correlation calculations is to set up dynamic range names that expand when more data is entered, or that shift the data in response to the Shift value.

The NumRows cell makes it easy to set up dynamic range names that expand to include additional rows of data that might be added below row 25 in the data figure above.

To define the first name below, first copy the formula for the Date name to your clipboard. Then  choose Formulas, Defined Names, Define Name (or press Ctrl+Alt+F3) to launch the New Name dialog. In the New Name dialog, enter Date as the name; paste the copied formula into the Refers to edit box; and then choose OK.

Repeat this process for each of the remaining names.

 Date Data1 Data2 DateText =OFFSET(Data!\$B\$4,1,0,NumRows,1) =OFFSET(Data!\$C\$4,1,0,NumRows,1) =OFFSET(Data!\$D\$4,1,0,NumRows,1) =OFFSET(Data!\$E\$4,1,0,NumRows,1)

You now need to set up dynamic range names that shift the data they reference, in ways that vary depending on the sign of the Shift value. Use the New Name dialog to define each of these names.

 s.Data1N s.Data1P s.Data2N s.Data2P s.DateText1N s.DateText1P s.DateText2N s.DateText2P =OFFSET(Data1,-Shift,0,NumRows+Shift,1) =OFFSET(Data1,0,0,NumRows-Shift,1) =OFFSET(Data2,0,0,NumRows+Shift,1) =OFFSET(Data2,Shift,0,NumRows-Shift,1) =OFFSET(DateText,-Shift,0,NumRows+Shift,1) =OFFSET(DateText,0,0,NumRows-Shift,1) =OFFSET(DateText,0,0,NumRows+Shift,1) =OFFSET(DateText,Shift,0,NumRows-Shift,1)

In these range names, the "s." indicates that the names are shifting your data; the "N" indicates that the name is used when the Shift value is negative; and the "P" indicates that the name is used when the Shift value is positive.

The final dynamic range names are ones we chart and use in our calculations. Because an Excel bug has problems charting range names that begin with "c" (which we would like to use for "chart"), we begin them with "g" (for "graph").

 g.Data1 g.Data2 g.DateText1 g.DateText2 =IF(Shift<0,s.Data1N,s.Data1P) =IF(Shift<0,s.Data2N,s.Data2P) =IF(Shift<0,s.DateText1N,s.DateText1P) =IF(Shift<0,s.DateText2N,s.DateText2P)

Now, with the dynamic names defined, you can set up a data table to calculate the cross correlations.

### Set Up the Excel Data Table

This figure shows the full report area. The Data Table in column J and K calculate the cross-correlation values.

To set up the Data Table, first enter the shift values shown in the range J7:J17. Then enter this formula in the cell shown:

K6:   =CORREL(g.Data1,g.Data2)

This formula returns the correlation coefficient for the two dynamic ranges shown. These ranges, of course, will shift in time...depending on the value in the Shift cell.

Next, select the range J6:K17 and then choose Data, Data Tools, What If Analysis, Data Table to launch the Data Table dialog. In the Column Input Cell of that dialog, enter...

=Shift

...which is the name of the cell that contains the current value for the number of months that the data is shifted.

Then, after you choose OK, Excel puts each selected value of column J into the Shift cell, calculates the formula in cell K6, and writes the result of the formula in the adjacent cell in column K.

Finally, to assign range names to the Data Table for easy reference, enter the text shown at the bottom of the table; select the range J7:K18; press Ctrl+Shift+F3; in the dialog, make sure that only Bottom Row is checked; and then choose OK.

### Set Up Shift's Data Validation List

The Shift cell uses Excel's Data Validation List feature to allow only the Shift values listed in the Data Table. To set this up, first select cell B1.

Now choose Data, Data Tools, Data Validation, Data Validation. In the Settings tab, choose List in the Allow section. And for the Source, enter...
=ShiftVal
...which is the range name you assigned to the list of shift values in the Data Table.

You can take the next steps with this in three ways. First, in part 2 of this article, you can learn how to create the charts and complete the cross correlation report.

Second, you can . By doing so, you'll also be notified when I post Part 2 of this article.

Tags: #excel, #chart, #cross correlation, #time-shift correlation, #reporting

Click to see testimonials from readers

Click to see who uses Excel dashboards.