ANalysis Of VAriance (ANOVA)

A Brief Review

Earlier, we covered the nature of data and cross tabulations with a \(\chi\)² test for independence. Nominal and ordinal data are referred to as nonmetric data. When data is “nonmetric,” it means we cannot perform arithmetic operations such as addition, subtraction, multiplication, and division with it. And, if we can’t do arithmetic operations, we cannot calculate mean, standard deviation, skewness, and kurtosis. But, as we saw, we can still apply statistics to nonmetric data. We just need to use nonparametric analytical tools such as cross tabulations with a \(\chi\)² test for independence for nominal data and Spearman’s Rank Rho for ordinal data. (Alas! We do not yet have content for the latter, but you still have the internet, so, with luck, that is not a complete showstopper for you.)

Metric data, on the other hand, is data on which we can perform arithmetic operations such as addition, subtraction, multiplication, and division. And, hence, we can calculate mean, standard deviation, skewness, and kurtosis. To analyze metric data, we use “parametric analytical tools” such as ANOVA, correlation, and regression.

Enter ANOVA

Recall that, with cross tabulations and a \(\chi\)² test for independence, we are testing for an association between two variables, and we rely on the frequencies in the cells to do that.

If we want to measure the differences in those levels, then we need to use ANOVA. To apply ANOVA, we need two items:

At least one independent variable that is nominal in nature and has at least two levels, and
Only one dependent variable that must be interval or ratio in nature.

If we have one independent variable, then we apply a 1-way ANOVA. If we have two (or more) independent variables, then we can apply an n-way ANOVA. If we have collected observations of the dependent variable several times, then we apply the repeated measures ANOVA.

Consider the question: “Does the number of visits depend on device type?” This turns out to call for a 1-way ANOVA.

Note that we are no longer merely asking if there is a relationship between these two variables; rather, we are looking for dependence. Let’s check that we have what we need to apply ANOVA:

At least one independent variable – The device type serves as the independent variable; it’s nominal in nature because:
Order does not matter (Laptop/Desktop, Tablet, Phone vs. Phone, Tablet, Laptop/Desktop)
There is no “distance” between observations
There is no “true zero”
Only one dependent variable – the number of visits serves as the dependent variable, and it is ratio in nature:
Order does matter
There is distance between observations
It includes a “true zero”

The levels for our independent variable, in this example, include “laptop/desktop,” “mobile,” and “tablet.” Hence, the independent variable, device type, includes three levels.

A Quick Aside: Up to this point, in our R journey, we’ve treated “factors” as something of a nuisance that, if not converted to character vectors, can cause odd and frustrating results. Read up more on that here. But, hopefully as you get into the content here and beyond, you will start to see how factors – and how many levels a factor has – start to become important. Now, back to our 1-way ANOVA…
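As a small illustration of why levels matter, here is a minimal sketch in base R. The device values are made up for the example; the point is only that factor(), levels(), and nlevels() tell us how many groups an ANOVA would be comparing.

```r
# Hypothetical device-type observations (illustrative values only)
device <- c("Mobile", "Tablet", "Desktop/Laptop",
            "Mobile", "Tablet", "Desktop/Laptop")

# Convert the character vector to a factor
device_factor <- factor(device)

levels(device_factor)   # the distinct levels, sorted alphabetically by default
nlevels(device_factor)  # how many levels (groups) the factor has
```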

The hypothesis (“null hypothesis” or \(H_0\)), in this example, would be that the difference in means (or averages) between levels equals zero. That is: there is no difference between levels.

In the cross tabulations with a \(\chi\)² test for independence, the hypothesis was, “there is no relationship between the two variables.”

The ANOVA hypothesis is similar to the \(\chi\)² test for independence hypothesis because both are about “no relationships.” The ANOVA hypothesis is dissimilar to the \(\chi\)² test for independence hypothesis, though, because the ANOVA includes an arithmetic value (i.e., the differences between the means are equal to zero).

1-Way ANOVA

A one way ANOVA with one independent variable and only two levels will generate the same results as an independent t-test (which, mysteriously, we’re not going to define here; perhaps, some day, we will come back to that). Instead of looking at the t-test though, we will look at the F ratio.
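The equivalence can be checked with a quick sketch in base R. The numbers below are illustrative only, and the t-test is run with var.equal = TRUE (the pooled-variance version), which is an assumption needed for the match. Under that assumption, the squared t statistic should equal the ANOVA F ratio, and the two p-values should agree.

```r
# Illustrative values only: goal completions for two device types
completions <- c(5, 3, 4, 5, 4, 4, 2, 3, 1, 2)
device <- factor(rep(c("Desktop/Laptop", "Tablet"), each = 5))

# Independent t-test with pooled (equal) variances
t_result <- t.test(completions ~ device, var.equal = TRUE)

# 1-way ANOVA on the same data
anova_table <- summary(aov(completions ~ device))[[1]]

t_squared <- unname(t_result$statistic)^2
f_value <- anova_table[["F value"]][1]

c(t_squared = t_squared, F = f_value)  # the two values should match
```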

Recall that we started with an expected value. We then took an observation. The distance between the expected value and the observed value is called the error. The relevant and necessary question becomes: “Is this error attributable to laptop/desktop, mobile, and tablet (i.e., within the groups) or to device type (i.e., between the groups)?”

That is, “Is this error within the groups or between the groups?”

\(MS_w\) or \(MS_e\) is the mean square error within the groups. In this example, \(MS_w\) or \(MS_e\) refers to error within laptop/desktop, mobile, and tablet.

\(MS_b\) is the mean square error between the groups. In this example, \(MS_b\) refers to error in the device type.

So, hold onto your socks for a minute while we drop some formulas.

Ultimately, we want to get an “F ratio,” which, similarly to how we compared the actual \(\chi\)² to the critical \(\chi\)², we will compare to a table of F distribution values with a select \(\alpha\) level.

“Simply” put, the F ratio is: \[F = \frac{MS_b}{MS_e}\]

Of course, we have to actually figure out what \(MS_b\) and \(MS_w\) are, which requires breaking things down a bit:

\[SS_b + SS_e = Total~Error\]

\[MS_e~(or~MS_w~) = \frac{SS_e~(or~SS_w)}{N-K}\]

Where:

\(SS_e\) (or \(SS_w\)) is the Sum of Squares error within
\(N\) is the total number of observations
\(K\) is the number of groups or levels.

\(N - K\) is the degrees of freedom (or “d.f.”) for the within effect.

And:

\[MS_b = \frac{SS_b}{K-1}\] where \(K-1\) is the d.f. for the “between effect” and \(SS_b\) is the Sum of Squares error between.

The relationship between \(SS\) and \(MS\) is the d.f.: each \(MS\) is its \(SS\) divided by the matching d.f.

\(N - K\) allows us to control for how many groups we are using.
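The formulas above can be sketched as one small R function. The name f_ratio and its arguments are hypothetical, chosen just for this illustration; the sketch takes \(N\) as the total number of observations (as in the worked example further down) and \(K\) as the number of groups. The input values are made up.

```r
# Sketch of the F ratio built from sums of squares.
# ss_b: sum of squares between groups
# ss_e: sum of squares within groups (error)
# n:    total number of observations
# k:    number of groups (levels)
f_ratio <- function(ss_b, ss_e, n, k) {
  ms_b <- ss_b / (k - 1)  # mean square between, d.f. = K - 1
  ms_e <- ss_e / (n - k)  # mean square within,  d.f. = N - K
  ms_b / ms_e
}

# Illustrative inputs only: (20 / 2) / (30 / 10)
f_ratio(ss_b = 20, ss_e = 30, n = 13, k = 3)
```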

Still following?

Once we have our observed F ratio, we can look up the critical value for the F ratio in a table. Note that you will need the d.f. for both \(MS_b\) and \(MS_e\) (and, once again, you will have to choose your \(\alpha\) level). (You can also use the “quantile function” – qf() in base R – to get the critical value.)
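For the qf() route, a minimal sketch looks like the following. The degrees of freedom are illustrative (three groups, fifteen observations). Because qf() works with lower-tail probabilities by default, the upper-tail cutoff for a given \(\alpha\) is the \(1 - \alpha\) quantile.

```r
alpha <- 0.05
df_between <- 2   # K - 1, illustrative: 3 groups
df_within <- 12   # N - K, illustrative: 15 observations in 3 groups

# Upper-tail critical value: the (1 - alpha) quantile of the F distribution
f_critical <- qf(1 - alpha, df1 = df_between, df2 = df_within)
f_critical

# Equivalent call that asks for the upper tail directly
qf(alpha, df1 = df_between, df2 = df_within, lower.tail = FALSE)
```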

A 1-way ANOVA is appropriate if we want to segment on one variable such as device type or last touch channel.

The Rejection Region

The rejection region is important. Nobody likes rejection…! Except, in statistics, we actually do. A key point, that we’ll want to keep reiterating:

The null hypothesis (\(H_0\)) is, essentially: “there’s nothing to see here.”
When we reject \(H_0\), that means there actually is some sort of relationship.

As such, \(\alpha\) is how we lay out how confident we want to be in this double-negative world (rejecting the hypothesis that there is nothing going on). I like 5% or .05 for a two tail test, which means the rejection area will be .05/2 in each tail, to the left and right of the mean.

I like a two tail test because I do not have to think about negative values. Drawing a -2.5 is just as likely as drawing a 2.5. (The F ratio is never negative, though, so for the F test the entire rejection region sits in the upper tail.)

If we reject the \(H_0\) because the observed value exceeds the critical or cutoff value, then we say the difference is statistically significant. That means the two distributions are not equal.

For a two tail test with an \(\alpha\) of .05 and a large sample, we reject if the absolute t-value exceeds 1.96.

Here, 1.96 is referred to as the critical or cutoff value.
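To see where 1.96 comes from, here is a short sketch using the base R quantile functions qnorm() and qt(). The degrees of freedom (12 and 1000) are arbitrary choices for illustration. The cutoff is larger for small samples and approaches the normal value as the d.f. grow.

```r
alpha <- 0.05

# Two-tail cutoff from the standard normal distribution
z_critical <- qnorm(1 - alpha / 2)

# Two-tail cutoffs from the t distribution: small vs. large d.f.
t_critical_small <- qt(1 - alpha / 2, df = 12)
t_critical_large <- qt(1 - alpha / 2, df = 1000)

round(c(z = z_critical,
        t_df12 = t_critical_small,
        t_df1000 = t_critical_large), 3)
```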

Example of 1-way ANOVA

Let’s consider three forms of channels, including:

Desktop/Laptop
Mobile
Tablet

Also, let’s consider the number of goal completions during a five-day period. Table 1 provides a display of the relationship.

Table 1: Number of Goal Completions by Channel

Desktop/Laptop	Mobile	Tablet
5	4	4
3	5	2
4	4	3
5	3	1
4	3	2

1. Calculate the Mean for Each Channel

The averages for each channel are:

Desktop/Laptop (mean 1) = 4.2
Mobile (mean 2) = 3.8
Tablet (mean 3) = 2.4

2. Calculate the Mean for All Three Channels Combined

This is called the “grand mean.” The sum for the three channels is 52 and the mean is 3.47.
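Steps 1 and 2 can be reproduced in base R with the values from Table 1. The variable names here are just illustrative choices.

```r
# Data from Table 1
completions <- c(5, 3, 4, 5, 4,   # Desktop/Laptop
                 4, 5, 4, 3, 3,   # Mobile
                 4, 2, 3, 1, 2)   # Tablet
device <- factor(rep(c("Desktop/Laptop", "Mobile", "Tablet"), each = 5))

group_means <- tapply(completions, device, mean)  # mean per device type
grand_mean <- mean(completions)                   # mean of all 15 observations

group_means
sum(completions)
round(grand_mean, 2)
```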

3. Determine the Sum of Squares Error (\(SS_e\))

We calculate \(SS_e\) by squaring the difference between each observed value and the expected (mean) value for its group, and then summing those squares:

Table 2: Squared Error Values

Desktop/Laptop	Mobile	Tablet
\((5-4.2)^2=0.64\)	\((4-3.8)^2=0.04\)	\((4-2.4)^2=2.56\)
\((3-4.2)^2=1.44\)	\((5-3.8)^2=1.44\)	\((2-2.4)^2=0.16\)
\((4-4.2)^2=0.04\)	\((4-3.8)^2=0.04\)	\((3-2.4)^2=0.36\)
\((5-4.2)^2=0.64\)	\((3-3.8)^2=0.64\)	\((1-2.4)^2=1.96\)
\((4-4.2)^2=0.04\)	\((3-3.8)^2=0.64\)	\((2-2.4)^2=0.16\)

The sum of the squared errors for each group is:

Desktop/Laptop (group 1) = 2.8
Mobile (group 2) = 2.8
Tablet (group 3) = 5.2

That gets us to the sum of squares errors (\(SS_e\)):

\[SS_e = 2.8+2.8+5.2 = 10.8\]
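The same sum can be sketched in base R. ave() returns each observation's own group mean, lined up with the data, which makes the squared errors a one-liner. Variable names are illustrative.

```r
# Data from Table 1
completions <- c(5, 3, 4, 5, 4,   # Desktop/Laptop
                 4, 5, 4, 3, 3,   # Mobile
                 4, 2, 3, 1, 2)   # Tablet
device <- factor(rep(c("Desktop/Laptop", "Mobile", "Tablet"), each = 5))

# Each observation's group mean, aligned with the observations
group_mean_per_obs <- ave(completions, device)
squared_errors <- (completions - group_mean_per_obs)^2

group_ss <- tapply(squared_errors, device, sum)  # per-group sums of squares
ss_e <- sum(squared_errors)                      # sum of squares error (within)

group_ss
ss_e
```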

4. Determine the Sum of Squares Between Groups (\(SS_b\))

We determine \(SS_b\) by subtracting the group mean from the grand mean and squaring the difference before multiplying by the number of observations in each group. Finally, we sum the products.

Table 3: Squared group error values

	A: # of Observations	B: Grand Mean	C: Group Mean	D: (B-C)²	E: A*D
Group 1 (Desktop/Laptop)	5	3.47	4.2	0.53	2.65
Group 2 (Mobile)	5	3.47	3.8	0.11	0.55
Group 3 (Tablet)	5	3.47	2.4	1.14	5.70

By summing the three values from column E in Table 3, we get the sum of squares between groups (\(SS_b\)):

\[SS_b = 2.65 + 0.55 + 5.70 = 8.90\]
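Here is a sketch of the same calculation in base R, carried out at full precision. Because the hand calculation rounds the grand mean and the squared differences to two decimals, R should land slightly higher (about 8.93 rather than 8.90). Variable names are illustrative.

```r
# Data from Table 1
completions <- c(5, 3, 4, 5, 4,   # Desktop/Laptop
                 4, 5, 4, 3, 3,   # Mobile
                 4, 2, 3, 1, 2)   # Tablet
device <- factor(rep(c("Desktop/Laptop", "Mobile", "Tablet"), each = 5))

group_means <- tapply(completions, device, mean)    # column C
group_sizes <- tapply(completions, device, length)  # column A
grand_mean <- mean(completions)                     # column B

# Column E summed: observations per group times squared (grand - group) mean
ss_b <- sum(group_sizes * (grand_mean - group_means)^2)
round(ss_b, 2)
```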

5. Calculate the Mean Square Error (\(MS_e\))

We calculate \(MS_e\) by dividing the \(SS_e\) by the degrees of freedom. Recall, the degrees of freedom is the number of observations (\(N\)) less the number of groups or levels (\(K\)). In this example, the degrees of freedom is

\[df = N - K = 15 -3 = 12\] So:

\[MS_e = \frac{SS_e}{df} = \frac{10.8}{12} = 0.9\]

6. Determine the Mean Square Between Groups (\(MS_b\))

We determine \(MS_b\) by dividing the sum of squares between groups (\(SS_b\)) by the degrees of freedom. Here, degrees of freedom is determined by subtracting 1 from the number of groups or levels (\(K\)):

\[df = 3 - 1 = 2\]

So:

\[MS_b = \frac{SS_b}{df} = \frac{8.9}{2} = 4.45\]

7. Determine the Actual F-value

(We’re close!) We determine the actual F value by dividing \(MS_b\) by \(MS_e\):

\[F = \frac{MS_b}{MS_e} = \frac{4.45}{0.9} = 4.94\]
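Steps 5 through 7 are plain arithmetic, so a sketch in R is short. This version starts from the hand-calculated sums of squares used in the text (8.9 and 10.8) so that the result lines up with the 4.94 above.

```r
ss_b <- 8.9    # sum of squares between groups, as rounded in step 4
ss_e <- 10.8   # sum of squares error, from step 3
n <- 15        # total observations
k <- 3         # groups (device types)

ms_e <- ss_e / (n - k)   # step 5: mean square error
ms_b <- ss_b / (k - 1)   # step 6: mean square between groups
f_value <- ms_b / ms_e   # step 7: actual F value

round(c(MS_e = ms_e, MS_b = ms_b, F = f_value), 2)
```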

8. Look Up the Critical F-value

We look up (or ask R to look up, using the base R qf() function) the critical F value. The numerator includes 2 degrees of freedom while the denominator includes 12 degrees of freedom. With a rejection region (\(\alpha\)) of .05, we can look up that the critical F value is 3.88.

9. Compare the Actual and Critical F-values

Finally, we compare the actual F value (4.94) to the critical F value (3.88). Because the actual value exceeds the critical value, we reject the null hypothesis (\(H_0\)) that the means of the groups are equal. Therefore, we conclude that at least one of the levels or groups is different. In this example, we could, therefore, state that the number of goal completions does differ by channel.
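As a cross-check, the whole example can be handed to aov() in base R. This is a sketch rather than the only way to do it. Since R does not round the intermediate steps, it should report an F value of roughly 4.96 instead of the hand-calculated 4.94, and the conclusion is the same.

```r
# Data from Table 1
completions <- c(5, 3, 4, 5, 4,   # Desktop/Laptop
                 4, 5, 4, 3, 3,   # Mobile
                 4, 2, 3, 1, 2)   # Tablet
device <- factor(rep(c("Desktop/Laptop", "Mobile", "Tablet"), each = 5))

fit <- aov(completions ~ device)
anova_table <- summary(fit)[[1]]   # the ANOVA table as a data frame

f_actual <- anova_table[["F value"]][1]
p_value <- anova_table[["Pr(>F)"]][1]
f_critical <- qf(0.95, df1 = 2, df2 = 12)

round(c(F = f_actual, critical = f_critical, p = p_value), 3)
f_actual > f_critical   # TRUE means we reject H0 at alpha = .05
```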

N-way ANOVA

If we only have one independent variable – as in the example above – then we can use a 1-way ANOVA (um…duh…right?). But, if we want to segment or otherwise consider two (or more) independent variables, then we need the N-way ANOVA. For example, we might want to consider the effect of device type and last touch channel on the number of goals completed.

The N-way ANOVA allows the web analyst to consider two issues:

The interaction of two variables. That is, if/how device type and last touch channel work together – referred to as the “interaction effect.” If the interaction is statistically significant, then the web analyst should not interpret device type and last touch channel separately when using ANOVA. To interpret the interaction effect, we argue that the way the number of goals completed changes with device type depends on the last touch channel (and vice versa). If the interaction effect is not statistically significant (i.e., fail to reject \(H_0\)), then the web analyst should interpret the two independent variables (e.g., device type and last touch channel) separately, which is pretty much two one-way ANOVAs.
Regardless of whether the interaction effect appears statistically significant, the digital analyst should calculate eta squared (\(\eta^2\)). \(\eta^2\) is calculated by dividing \(SS_x\) or \(SS_b\) by \(SS_y\), the total sum of squares for the dependent variable.

0 means \(SS_b\) or \(SS_x\) has no effect on \(SS_y\).
1 means \(SS_b\) or \(SS_x\) explains all variance of \(SS_y\).

The \(\eta^2\) value always falls between 0 and 1. It provides a method to determine the effect of \(SS_b\) or \(SS_x\) on \(SS_y\).
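As an illustration of the \(\eta^2\) calculation, here is a sketch that reuses the 1-way goal completions data from Table 1 (not the two-way data that follows). It pulls the sums of squares out of the aov() table and divides the between-groups value by the total.

```r
# Data from Table 1 of the 1-way example
completions <- c(5, 3, 4, 5, 4,   # Desktop/Laptop
                 4, 5, 4, 3, 3,   # Mobile
                 4, 2, 3, 1, 2)   # Tablet
device <- factor(rep(c("Desktop/Laptop", "Mobile", "Tablet"), each = 5))

anova_table <- summary(aov(completions ~ device))[[1]]
sums_of_squares <- anova_table[["Sum Sq"]]  # between first, then residual

ss_between <- sums_of_squares[1]    # SS_b
ss_total <- sum(sums_of_squares)    # SS_y = SS_b + SS_e
eta_squared <- ss_between / ss_total

round(eta_squared, 2)
```

Read this way, device type would account for a bit under half of the variance in goal completions in that small example.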

Let’s consider the following table with last touch channel on the column and device type on the row. The cells contain the number of goals completed (Table 2).

Table 2: Mean Number of Goals Completed by Device Type and Last Touch Channel

	Organic Search	Paid Search	Email	Display	Overall Averages by Device Type
Laptop/Desktop	325	215	225	325	272.50
Tablet	275	165	175	160	193.75
Phone	245	135	145	110	158.75
Overall Averages by Last Touch Channel	281.67	171.67	181.67	198.33

For this example, let’s assume the interaction effect of device type and last touch channel appears statistically significant.

Main Effect Interpretation

Device type and, separately, last touch channel appear to have a main effect on number of goals completed. We note that laptop/desktop users are more likely to complete more goals compared to phone users. Similarly, customers who rely on organic search to reach our website are more likely to complete more goals compared to other points of last contact.

Interaction Effect Interpretation

The difference for laptop/desktop users compared to tablet and phone users remains constant across the last touch channels until display is considered. At display, the difference between device types changes. Hence, we can state that the relationship between device type and number of goals completed interacts with the last touch channel.
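The pattern described above can be made visible with a few lines of base R. This is only a descriptive sketch on the cell means from Table 2; it does not test whether the interaction is statistically significant. The object names are illustrative.

```r
# Cell means from Table 2 (rows: device type, columns: last touch channel)
goals <- matrix(
  c(325, 215, 225, 325,
    275, 165, 175, 160,
    245, 135, 145, 110),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    device = c("Laptop/Desktop", "Tablet", "Phone"),
    channel = c("Organic Search", "Paid Search", "Email", "Display")
  )
)

# Gap between laptop/desktop and each other device type, by channel
laptop_minus_tablet <- goals["Laptop/Desktop", ] - goals["Tablet", ]
laptop_minus_phone <- goals["Laptop/Desktop", ] - goals["Phone", ]

rbind(laptop_minus_tablet, laptop_minus_phone)
```

The gaps hold steady for the first three channels and then jump at display, which is the interaction.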

The interaction in this example is referred to as Ordinal Interaction because the order of the levels matters. Table 3 provides an example of no interaction.

Table 3: Mean Number of Goals Completed by Device Type and Last Touch Channel

	Organic Search	Paid Search	Email	Display	Overall Averages by Device Type
Laptop/Desktop	325	215	225	235	250
Tablet	275	165	175	185	200
Phone	245	135	145	155	170
Overall Averages by Last Touch Channel	281.67	171.67	181.67	191.67

Table 3 shows no interaction. As we move down device type and across last touch channel, we see a monotonic transformation. That is, the pattern remains constant.

Table 4 highlights an example of disordinal interaction with non-crossover.

Table 4: Mean Number of Goals Completed by Device Type and Last Touch Channel

	Organic Search	Paid Search	Email	Display	Overall Averages by Device Type
Laptop/Desktop	325	215	225	235	250
Tablet	275	185	170	210	210
Phone	250	170	120	160	175
Overall Averages by Last Touch Channel	283.33	190.00	171.67	201.67

As we move down device type and across last touch channel, we see the differences change. However, the cells never cross. That is, the number of goals completed for laptop/desktop never falls below tablet or phone regardless of last touch channel.

Finally, Table 5 provides an example of disordinal interaction with crossover.

Table 5: Mean Number of Goals Completed by Device Type and Last Touch Channel

	Organic Search	Paid Search	Email	Display	Overall Averages by Device Type
Laptop/Desktop	325	215	200	235	243.75
Tablet	275	185	230	175	216.25
Phone	250	170	120	190	182.50
Overall Averages by Last Touch Channel	283.33	190.00	183.33	200.00

As we move down device type and across last touch channel, we see the differences change and the cells cross. That is, the number of goals completed for laptop/desktop falls below tablet at email and phone appears greater than tablet at display.
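One way to tell the crossover and non-crossover cases apart is to check whether the device types keep the same rank order in every channel. The helper below, same_rank_order(), is a hypothetical name invented for this sketch. It works on cell means only and assumes there are no ties within a column.

```r
# TRUE when the rows (device types) are ranked identically in every column
# (last touch channel), i.e. the cell means never cross.
same_rank_order <- function(cell_means) {
  orders <- apply(cell_means, 2, order)  # row ordering within each column
  all(orders == orders[, 1])             # compare every column to the first
}

# Cell means from Table 4 (rows: Laptop/Desktop, Tablet, Phone)
table4 <- matrix(c(325, 215, 225, 235,
                   275, 185, 170, 210,
                   250, 170, 120, 160), nrow = 3, byrow = TRUE)

# Cell means from Table 5 (same row and column layout)
table5 <- matrix(c(325, 215, 200, 235,
                   275, 185, 230, 175,
                   250, 170, 120, 190), nrow = 3, byrow = TRUE)

same_rank_order(table4)  # no crossover expected
same_rank_order(table5)  # crossover expected at email and display
```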

The types of interaction discussed in Tables 2, 3, 4, and 5 give a completeness to the analysis. The interaction effect serves as the singular reason to perform this analysis.