Improving Construction of Statistical Process Control Charts
In the first article, you learned how to manually construct statistical control charts to monitor and detect problems pertaining to website traffic. This procedure has certain disadvantages. In this article, we'll tackle those problems by using macros.
Contributed by Codex-M Rating: / 5 December 15, 2009
The manual procedure has three major disadvantages.
First, it cannot be updated in real time. Therefore, to update the data, any webmaster or analyst needs to log in to Google Analytics first in order to fetch results.
Second, there is a higher chance of computational error since it is done manually. Computational errors are both risky and costly to any process.
Third, it takes a lot of time to construct control charts. This is because of the lengthy processes involved that require exporting of data from Google Analytics to Excel/CSV format, as well as the time-consuming computations that must be performed.
However, the main advantage is security. The shortcut method involves running macros on your MS Excel. There are some companies which do not allow macros to be run on their Windows-based machines. Since you are using the manual method, you can be sure that there are no macros executed which have been created by other parties.
Disclaimer: the macro developed to shorten the process of statistical process control chart creation has been thoroughly checked by us to be virus-free. Even so, it is your responsibility to double check the scripts and ensure that the file has been scanned before running macros. We are not liable for any negative consequences that may result from running macros.
The traditional method of working with Google Analytics in an MS Excel environment is to export the result first and then work with it offline (like the manual method discussed in the first part).
The good news is that it is possible to fetch real time data from your Google Analytics account directly to an MS Excel worksheet without going through the hassle of exporting the CSV, formatting to Excel, cleaning the data and computing the control limits.
The official Google Analytics blog has discussed API methods that can be used to accomplish our objective; one of these involves using VBA Macros.
VBA macros are written in VBA script. They are the native programming language that can be used by a developer to create complex applications using MS Excel. One of these very important functions relevant to the creation of statistical process control chart involves creating API request/communication between MS Excel and Google Analytics.
The down side in a VBA macro environment is the possibility of getting infected with a virus; such an environment can also be used to inject malicious macros created by programmers to infect your computer. Nevertheless, it is still an excellent MS Excel feature.
A developer by the name of Mikael Thuneberg has created an MS Excel macro application (at the link) to communicate with Google Analytics automatically.
This tool enables one to plot charts; however, it is NOT an SPC chart. A lot of work still needs to be done to transform this sheet into a control chart.
After downloading, follow the rest of the procedures below.
Step 1: Check the downloaded filename. It should be:
Google Analytics- Statistical Process Control Chart.xls
Step 2: Open the file. If there is a macro warning message, click “Enable macros.” Macros are required to run this application.
Step 3: Enter your Google Analytics “email address” in the login.
Step 4: Enter your Google Analytics “password” and then press the enter key. The password will not be shown, but if your login is correct, it will show “Authenticated” below it. See the screen shot below:
Step 5: All of your website profiles are present under “Profiles.” Select a specific website profile for which you wish to plot a control chart. You can select by using the drop down menu.
Step 6: Now, as discussed in the first tutorial, you need to create the control limits first, which are based on one month of historical data. So in this step, select the start date and end date. Using the example in the first tutorial (manual method), the dates will be:
Start date: 10/7/2009
End date: 11/7/2009
Note: Dates should be formatted as: Month/Day/Year
These concern the one month historical period covering the current period, which is November 7, 2009 to December 7, 2009.
Step 7: Click “Fetch Data.” The VBA Macro in MS Excel automatically communicates with the Google Analytics API to fetch your data back to MS Excel. You should see the data shown in the “graph” worksheet. These are the one-month historical data.
Step 8: Since historical data is now available, click “Compute Control Limits.” Excel automatically computes the SPC Control limits.
The screen shot above shows the computed control limits after clicking the “Compute Control Limits” button.
Step 9: Now you need to fetch the current data. In this example, I will monitor the traffic starting the month of December 2009 (using October to November 2009 historical data). To do this, change the start and end date to reflect the current month.
Start date: 12/1/2009
End date: 12/9/2009
Finally, click “Fetch Data” to get the December 2009 Google Analytics traffic data to Excel automatically.
Step 10: To see the statistical process control chart (showing the UCL, LCL, Average and the December 2009 traffic data), go to the “SPC Control Chart” worksheet.
Step 11: The process of building your control charts based on historical records is now complete. You are now ready to monitor it for “out of control situations.”
Included in that sheet is vital information for out of control situations as well as information on what should be done in case a problem occurs.
Below is the December 2009 control chart for a sample website (ignore the drop lines on the 10th day, since the scope of data collection is from December 1, 2009 to December 9, 2009).
You noticed that an out of control situation occurs: More than eight consecutive points below the mean. Based on the rules for out of control situations, the first step is to identify the root cause of the event, and then implement corrective actions.
It has been found that the site is no longer featured on the home page of two high traffic sites, which have been around for the past three months. Since historical data is computed at the time of a high rate of referral traffic, we now start to see a significant drop in website traffic beginning in December 2009, when the site is no longer featured.
Below is a sample screen shot of stable website traffic (SPC control chart):
The traffic usually swings above and below the center line due to “normal” variations.