Doing Statistical Analysis with MS Excel

MS Excel can be used for doing statistical analysis. That much is common knowledge, yet few people seem to know how to do it. If the very thought of using MS Excel in this way seems challenging to you, keep reading. This article will walk you through performing two kinds of important statistical analysis tasks with this software.

Contributed by
Rating: 4 stars4 stars4 stars4 stars4 stars / 9
June 15, 2009
Rate this Article:
MEH MEH++


SEARCH ASP FREE
TOOLS YOU CAN USE

advertisement

Doing statistical analysis using MS Excel seems to be difficult for a beginner and even an intermediate user of MS Excel. The primary reason is that it is not taught in most MS Excel classes or even in the in-house seminars. Also, doing statistical analysis requires you to learn the associated Excel functions in order to perform the specific statistical analysis task.

In this article I will illustrate how to perform the most important statistical analysis using MS Excel. Below are the important statistical analysis tasks which scientists, engineers and even those in the financial sectors should be performing:

  • Making a histogram

  • Performing a correlation plot and regression analysis

Some of the MS Office packages do not include the magical Data Analysis tool pack. This makes it even harder for an analyst to do some basic or advanced statistical analysis. Data Analysis is an add-on to MS Excel, but some installations might not include it. If yours doesn't, obviously, attempting to use the Data Analysis tool pack will not work. 

Fortunately, the examples illustrated in this article do not depend on the Data Analysis tool pack. You will not need it to get them to work; any analysis can be done without the tool pack. This tutorial has been tested to work in MS Excel 2002.

Let us start with making a histogram.

How to make a histogram using MS Excel

A histogram is a plot that shows the frequency of data sets. This is an important tool used to analyze the distribution of data. By graphically analyzing the histogram, you can tell where the mean may be located. You can even tell the variation of the samples, or if the data are skewed to the right or left instead of following the normal curve (the mean at the center of the distribution).

To do this using MS Excel, say for example that you have the following data:

37.2, 37.8, 37.1, 37.8, 36.2, 36.5, 36.6, 36.3, 36.5, 36.6, 36.2, 36.9, 37.1, 37.6, 37.3, 36.1, 36.5, 36.3, 35.8

Step 1. First, open your MS Excel, and then position/copy the data above vertically, starting on cell A1 with data = 37.2 and continuing to cell A20 with data = 35.8

Step 2. Compute the minimum and maximum values of the data set. To do this, in cell A21, paste this formula: =MIN(A1:A20), then on cell A22, paste this formula: =MAX(A1:A20)

The minimum and maximum values of the data set are needed to make a data bin for the histogram.

Step 3. Compute the bin statistics. The bin statistics are the values required to make the graph.

Increment value = (Max value - Min value) / Number of samples

Starting value = (Min value - Increment value)

Ending value = (Max value + Increment value)

In this example, you can use cell A23 for the increment value, cell A24 for the starting value and finally cell A25 for the maximum value.

The starting value should be less than the minimum value of the data, so that the minimum value can be seen in the plot. The same logic applies to ending value. In this example, the following are the computed values:

Minimum value = 35.8

Maximum value = 37.8

Increment value = 0.1

Starting value = 35.7

Ending value = 37.9

Making a histogram continued

Step 4. Make the bin data table using the bin statistics. Since the actual data table has been placed in Cell A1 to Cell A20, the bin table should be placed starting on cell B1 to B23. See below:

35.7

35.8

35.9

36

36.1

36.2

36.3

36.4

36.5

36.6

36.7

36.8

36.9

37

37.2

37.3

37.4

37.5

37.6

37.7

37.8

37.9

The bin table should be arranged like the one above.

Step 5. If you are done with the bin table, you can compute the FREQUENCY. This will count the number of data occurrences within the specific bin data range. In our example, use cell C1 to place this frequency formula:

=frequency(A1:A20,B1:B23)

Then select (selecting the cells will highlight the cells) starting with cell C1 and going down to cell 23. Then follow the steps below:

  • Press F2. 
  • Press Ctrl - Shift (together). 
  • Press Enter.

The procedure above will produce the frequency data table in column C. Look at the screen shot below; your analysis in your Excel spreadsheet should be similar (particularly in column C where the frequency data table is computed).

We've almost finished; we still have to plot the frequency vs. the actual data bin table.

Step 6. To generate the histogram plot, follow the simple steps below:

  • Go to Insert -> Chart -> Column (this is the chart type), and then click NEXT.   
  • Click the "SERIES" tab, and remove Series 1 and Series 2, because they are redundant data in the histogram.   
  • Click on the "Category (X-axis)," and then select B1 to B23, which is the bin range data.   
  • Finally, for the name, you can change it from "Series3" to "histogram."

Now you have the histogram plot generated.

How to make a correlation plot and regression analysis

A correlation plot/regression analysis is another important data analysis tool that can determine the degree of relationship between two variables. For example, say you have the following data:

We would like analyze the relationship between Data1 and Data2.

Below are the steps you need to take to make the correlation plot:

Step 1: Paste/Construct the data table above, starting in cell A1 to cell C11.

Step 2: Click "Insert" -> "Chart" -> "XY Scatter," and then click Next.

Step 3: In the Data range, select cells B2 to C11, and then click "Finish."

The scatter plot will now be generated. To find out the model of the regression analysis plot, click on any dot in the scatter plot, right click on it, and then click "Add Trend line."

Step 4: Finally, select any regression type of your choice. For example, we will select the "linear" type regression. To display the model/equation in the graph, click the "Options" tab, and then click "Display Equation on Chart" and also "Display R-squared value on chart."

To interpret the analysis plot, R squared (in statistics this is called coefficient of determination) of 0.989 means that Data 1 and Data 2 are 98.9% related (this percentage indicates how well they fit each other). 

Conclusions

MS Excel is a highly powerful spreadsheet that offers statistical analysis tools to make histograms and perform correlation/regression analysis. The above tutorial shows that it can be done without installing the Data Analysis tool pack.

The data used in this article was purely for example purposes; you can have the data on your own and make your own statistical analysis. In upcoming articles I will discuss another set of statistical analysis such as comparing the means of two data sets as well the variation.

If you are in the engineering/manufacturing sector, or a beginner in the field of statistics, you will find this tutorial very helpful.

blog comments powered by Disqus
BRAINDUMP ARTICLES

- Microsoft Windows 8 Committed to Cloud Compu...
- Independent Developers Favor Windows Phone 7
- Dell Introduces VMware-based Cloud
- Microsoft and Skype Agree to Acquisition Deal
- Transfer Contacts in Microsoft Outlook
- Zune`s Next Steps
- Safari Books Online Review
- Does Microsoft Get Touch Screens Now?
- Microsoft`s Record Quarterly Earnings Not En...
- Basic Operations and Registers in Assembly
- Assembly Coding within Visual C/C++ IDE
- New Microsoft Office Coming with a Twist
- Microsoft`s FUSE Labs Unveils Spindex Social...
- HP Slate with Windows 7: Dead or Alive?
- Windows Phone 7 Mobile OS to Rival Android a...

ASP Web Hosting ASP.Net Web Hosting Windows Web Hosting
ASP Free Forums 
 RSS  Tutorials RSS
 RSS  Forums RSS
 RSS  All Feeds
Site Map 
Request Media Kit
Write For Us Get Paid 
Weekly Newsletter
 
Developer Updates  
Free Website Content 
Privacy Policy 
Support 


© 2003-2012 by Developer Shed. All rights reserved. DS Cluster 2 - Follow our Sitemap
Most Popular Topics
All ASP.Net Tutorials