Two-Way ANOVA (Analysis of Variance) in Microsoft Excel

For those times when you need to analyze two factors simultaneously, a one-way analysis of variance (ANOVA) is not enough. That’s when you need to perform a two-way ANOVA. Doing the analysis manually can take quite some time, however. This article shows you how to do it with the help of Microsoft Excel.

This is a tutorial extension of Using MS Excel for One-way Analysis of Variance. In this tutorial, you will learn how to perform two-way analysis of variance in MS Excel.

This is a more complicated analysis as compared to one-way ANOVA because it will enable you to investigate two factors simultaneously in a single experiment.

In the article linked to above (for the one way ANOVA), only one factor is investigated to determine whether it has an effect on the response (e.g. electrical resistance). A two-way factor analysis of variance is of vital importance in engineering, information technology and the academic/research field. Instead of doing two independent one-way ANOVAs, more information and efficiency is gained from doing a single experiment that investigates the effects of the two factors.

A two-way ANOVA is the simplest form of experimental design covered in more advanced topics of inferential statistics called Design of Experiments (DOE).

{mospagebreak title=Best practices for two-way analysis of variance}

There are at least six things that you need to know and keep in mind before you can conduct a two-way ANOVA. First, has the data that was gathered been replicated? This is not necessary, but highly recommended. Say for example that you are studying the effects of two factors (material type and treating laboratory) on the thickness of a coating. Then you might gather samples three times in each laboratory for a specific material, because replication increases the accuracy and precision of the experimental results.

Second, the data must be quantitative. A two-way ANOVA is the statistical/quantitative analysis of means, so make sure that the data gathered are purely numerical. Using decimals can add precision and accuracy. Up to three places is good for a tight comparison — for example: 1.323, 1.456, 2.343. A “YES” or “NO” answer for gathered data cannot be used in a two-way ANOVA. However, there are devices that precisely provide this level of accuracy (example: analog- based devices).

Third, bear in mind that it is not good enough to have measurable and quantifiable responses; you also need to ensure that the data gathered is accurate and precise. Confirming the accuracy, consistency or precision of a measuring device is covered in a separate statistical topic called MSA (Measurement Systems Analysis), which is not discussed in this article.

Fourth, you should use a two-way ANOVA when there are two factors involved, and you are interested in finding out what factor is significantly affecting the response.

Fifth, by using a two-way ANOVA, it is possible to confirm whether the interaction of two factors affects the response. This cannot be done using a one-way analysis of variance.

Finally, to get the most out of this article, you should have MS Excel installed in your computer with the Data Analysis Toolpak enabled.

{mospagebreak title=Case Study Example: Evaluation of New Coating}

To illustrate this analysis using MS Excel, we will present a real example which is solved by a traditional method (using manual computation). We’ll discuss this two-way ANOVA example of the evaluation of a new coating.

Problem Statement:

An evaluation of a new coating applied to three different materials was conducted at two different laboratories. Each laboratory tested three samples from each of the treated materials. The results are given in the next table:

It would be interesting to know which factors influence the coating’s thickness.

However, the entire solution presented on that page is manually computed. You are going to solve the above problem by using MS Excel and adopting the scientific method:

Step 1: Formulate the null hypothesis of the study. In a two-way ANOVA there are three null hypotheses which you need to formulate.

First null hypothesis: There is no significant effect from the “material” type factor on the response. This is like doing a one-way analysis of variance on the Materials (column) factor. Rejecting this null hypothesis means that the “material” type factor is significant.

Second null hypothesis: There is no significant effect from the “Laboratory” factor on the response. Rejecting the null hypothesis means that this factor is indeed significant.

Third null hypothesis: The interaction between Materials and Laboratory do not affect the level of the response; if this is rejected, it would come out as significant.

Note: For the three hypotheses, a 95% confidence level is used. This means you can be sure that 95% of the time the result is actually correct. And it has a 5% risk of arriving at a false conclusion. In a p-value analysis, if the p value is less than 0.05, you reject the null hypothesis and conclude that the factor is significantly affecting the response. Otherwise, if it is above 0.05, you accept the null hypothesis and the factor has no effect after all.

Step 2: Encode the following data into an Excel worksheet, which should look like the screen shot below:

Excel is very strict with how the data must be arranged on the worksheet. If it is not entered in the way Excel suggests, it won’t produce analysis results; or, if it gives results, they may not be entirely correct or accurate.

Step 3: Launch MS Excel and go to Tools -> Data Analysis -> ANOVA: Two-factor with Replication, and then click “OK.” If you cannot see this in your MS Excel, then the analysis toolpak add-in is not installed.

Step 4: Select the area in the input range (see screen shot below). Since there are three replications/repetitions involved, input “3” in the rows per sample. Set the “alpha” to industry standard, which is 0.05 (for a 95% confidence level, “alpha” is 1-confidence level = 1- 0.95 = 0.05).

You can customize the output options. In the screen shot below, the completed analysis will be shown in the same worksheet. However, you can choose to place the results in a new worksheet or even in a new workbook.

Once you press OK, the analysis will automatically be generated by Excel. There is no need to compute those values manually.

{mospagebreak title=Interpreting the results}

Successful experiments require correct interpretation of results. By default, Excel will provide the two sets of results, the descriptive stats summary and the ANOVA table. The ANOVA table is the most important result.

This is where the results can be a little confusing. You cannot see the name of the factors in the table; instead, what you see under “source of variation” is as follows:

a. Sample

b. Columns

c. Interaction

The “Sample” stands for the row factor, which is the “Laboratory;” the “Columns” represents the “Material” type factor; and "Interaction" is the combination of two factors, “Laboratory x Material.”

Since ANOVA is a comparison of means using analysis of variances, the most important column in the ANOVA table is the P-value. Based on the above ANOVA table screen shot, it says that the two factors (Laboratory and Material) are indeed a significant factor affecting the response (coating thickness) because the P-value is below 0.05. The null hypothesis of each of those two factors will be rejected because of this result.

The interaction (Laboratory x Material) is above the 0.05, and the null hypothesis will be accepted; it means that their interaction is not a significant factor contributing to the response “coating thickness.”

By using the example stated in this tutorial, any analyst, engineer, researcher or scientist can analyze more complicated experiments by following the fundamentals of analysis and interpretation of a two-way ANOVA using MS Excel.

You can download the sample workbook used in this tutorial.

[gp-comments width="770" linklove="off" ]