Already 2010 is feeling like the year of optimization. Everywhere I look, I’m seeing conversations about A|B and MVT testing, optimizing conversion flows, and understanding statistical significance.
When I first started running A|B tests, everything I did was on faith. I had good intentions and I measured all the key indicators, but I had no idea how to tackle the question of “yeah but, is it statistically significant?” Then I began to crawl as I experimented with online calculators, and eventually I moved on to building out my own formulas in Excel. Still, I had little confidence in myself, let alone the test results.
Eventually I began to experiment with testing tools like Google Web Optimizer, Amadesa, and Omniture Test & Target. This seemed to make life so much simpler, as all the questions I was being asked were answered right in the testing application. Is it significant? Amadesa says they are 98% confident in the results. What is the lift we are seeing? Google Web Optimizer says it’s 8.5%, and as a bonus it gives the confidence interval.
While I think it is extremely valuable to have your testing and optimization platform provide the key statistical measures that relate to your test, I think it is just as important to understand the math behind the reports. After all, you can’t call yourself a “car guy” or a “car girl” if you drive on the gauges alone and don’t understand how the underlying systems work.
Let’s walk through an example campaign to understand how Omniture Test & Target calculates the statistics behind the results.
For our campaign, let’s assume the following facts:
- Our campaign has two treatments, a control and one alternative.
- The control has had 4,008 visitors
- The alternative has had 4,003 visitors
- The control has had 377 conversions
- The alternative has had 355 conversions
#1 – Conversion Rate

Conversion rate equals the number of conversions divided by the number of starts. In this example we are using visitors, but this can be visits, impressions, unique starts, etc., depending on how you measure site conversion.

Conversion Rate (control) = 377 / 4,008 = 9.41%
Conversion Rate (alternative) = 355 / 4,003 = 8.87%
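For anyone who would rather check the arithmetic in code than in Excel, here is a minimal Python sketch of the conversion rate step. The function name `conversion_rate` is my own and is not part of Test & Target.

```python
def conversion_rate(conversions: int, visitors: int) -> float:
    """Return conversions divided by visitors, as a proportion (0.0941, not 9.41)."""
    if visitors <= 0:
        raise ValueError("visitors must be positive")
    return conversions / visitors


# Figures from the example campaign above.
control_rate = conversion_rate(377, 4008)
alternative_rate = conversion_rate(355, 4003)

print(f"Conversion Rate (control) = {control_rate:.2%}")
print(f"Conversion Rate (alternative) = {alternative_rate:.2%}")
```

Keeping the rate as a proportion rather than a percentage matters in the later steps, because the variance formula only works on values between 0 and 1.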
#2 – Standard Deviation
Standard Deviation measures the spread, or dispersion, of a set of data around the “average” (mean). Because conversion is binomial (a visitor either converts or does not convert), the binomial formula for variance is used: the conversion rate, expressed as a proportion, multiplied by one minus the conversion rate:

Variance (control) = 0.0941(1 – 0.0941) = 0.09
Variance (alternative) = 0.0887(1 – 0.0887) = 0.08
To calculate Standard Deviation from the variance, we take the square root of the variance (using the unrounded variances, 0.0852 and 0.0808):

Standard Deviation (control) = SQRT(0.0852) = 0.29
Standard Deviation (alternative) = SQRT(0.0808) = 0.28
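The variance and Standard Deviation steps can be sketched the same way. This is an illustration under the assumption that the rate is passed as a proportion; the function names are mine.

```python
import math


def binomial_variance(rate: float) -> float:
    """Variance of a single convert / does-not-convert outcome: p * (1 - p)."""
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be a proportion between 0 and 1")
    return rate * (1.0 - rate)


def standard_deviation(rate: float) -> float:
    """Standard Deviation is the square root of the variance."""
    return math.sqrt(binomial_variance(rate))


control_rate = 377 / 4008
alternative_rate = 355 / 4003

variance_control = binomial_variance(control_rate)
variance_alternative = binomial_variance(alternative_rate)
sd_control = standard_deviation(control_rate)
sd_alternative = standard_deviation(alternative_rate)

print(f"Variance: {variance_control:.4f} / {variance_alternative:.4f}")
print(f"Standard Deviation: {sd_control:.2f} / {sd_alternative:.2f}")
```

Note that the code carries full precision from step to step and only rounds when printing, which is why the square roots come out to 0.29 and 0.28.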
#3 – Standard Error
The Standard Error is the estimated Standard Deviation of the error; the “noise” in the result. The Standard Error is calculated in order to calculate the Signal-to-Noise ratio.
To calculate the Standard Error for the Control:

Standard Error (control) = SQRT(0.09 / 4008) = 0.005
To calculate the Standard Error of the difference between the control and the alternative, which combines the variance of both treatments:

Standard Error (alternative vs. control) = SQRT((0.09 / 4008) + (0.08 / 4003)) = 0.006
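Here is a small sketch of both Standard Error calculations. The second one pools the variance of the two treatments, so I have named it as the error of the difference; both function names are my own labels.

```python
import math


def standard_error(rate: float, visitors: int) -> float:
    """Standard Error of one treatment: sqrt(p * (1 - p) / n)."""
    return math.sqrt(rate * (1.0 - rate) / visitors)


def standard_error_of_difference(rate_a: float, visitors_a: int,
                                 rate_b: float, visitors_b: int) -> float:
    """Standard Error of the gap between two treatments (variances add)."""
    return math.sqrt(rate_a * (1.0 - rate_a) / visitors_a
                     + rate_b * (1.0 - rate_b) / visitors_b)


control_rate = 377 / 4008
alternative_rate = 355 / 4003

se_control = standard_error(control_rate, 4008)
se_difference = standard_error_of_difference(control_rate, 4008,
                                             alternative_rate, 4003)

print(f"Standard Error (control) = {se_control:.4f}")
print(f"Standard Error (difference) = {se_difference:.4f}")
```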
#4 – Signal-to-Noise Ratio
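To calculate the Signal-to-Noise ratio, divide the difference between the two conversion rates (expressed as proportions) by the combined Standard Error from step #3 (0.0064 before rounding to 0.006):

Signal-to-Noise = (0.0941 – 0.0887) / 0.0064 = 0.84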
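Because the ratio is sensitive to rounding in the earlier steps, it helps to compute it from the raw counts in one pass. A minimal sketch, with a function name of my own choosing:

```python
import math


def signal_to_noise(conv_control: int, n_control: int,
                    conv_alt: int, n_alt: int) -> float:
    """Difference in conversion rates divided by the Standard Error of that difference."""
    rate_control = conv_control / n_control
    rate_alt = conv_alt / n_alt
    noise = math.sqrt(rate_control * (1.0 - rate_control) / n_control
                      + rate_alt * (1.0 - rate_alt) / n_alt)
    return (rate_control - rate_alt) / noise


ratio = signal_to_noise(377, 4008, 355, 4003)
print(f"Signal-to-Noise = {ratio:.2f}")
```

Working from unrounded values, the ratio lands between 0.83 and 0.84, in line with the 0.84 used in the next step.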
OK… stay with me… we are almost there.
#5 – Finally We Arrive At Confidence
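We will use the Signal-to-Noise ratio to calculate confidence with the Student’s T-Test. Excel’s TDIST function takes the ratio, the degrees of freedom (total visitors minus 2), and the number of tails (2):

Confidence = 1 – TDIST(0.84, (4003 + 4008 – 2), 2) = 0.60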

As reported by Test & Target, we are 60% confident in the current results.
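If you don’t have Excel handy, the confidence figure can be approximated in Python. One caveat: this sketch swaps the Student’s t distribution for the standard normal distribution, which is my substitution and not what TDIST does internally. With 8,009 degrees of freedom the two are practically identical, but with small samples they would differ.

```python
from statistics import NormalDist


def confidence_from_ratio(signal_to_noise: float) -> float:
    """Approximate 1 - TDIST(x, df, 2) using the normal curve (assumes a large sample)."""
    # Two-tailed probability of seeing a ratio at least this extreme by chance.
    two_tailed_p = 2.0 * (1.0 - NormalDist().cdf(abs(signal_to_noise)))
    return 1.0 - two_tailed_p


confidence = confidence_from_ratio(0.84)
print(f"Confidence = {confidence:.0%}")
```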
Extra Credit: Confidence Intervals

The Confidence Interval shows how much your test results can vary and still be within a predetermined confidence level. Standard confidence levels are 90%, 95%, 99%, and 99.5%. Omniture Test & Target uses the 95% confidence level.
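To calculate the Confidence Interval, first find its half-width (the margin of error) using the alternative’s Standard Deviation (0.2843 before rounding):

Confidence Interval = 1.96(0.2843 / SQRT(4003)) = 0.0088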
1.96 is a constant in this formula. It is equal to z*, which is taken from a Standard Normal Critical Values table for the 95% Confidence Level. This table can be found in any introductory-level statistics book.
Now that we have determined our Confidence Interval (0.0088 before rounding, or 0.88 percentage points), we can calculate the ± range of our test results:
High Bound = 8.87% + 0.88% = 9.75%
Low Bound = 8.87% – 0.88% = 7.99%
This gives us the Confidence Interval as reported in Test & Target of 7.99% to 9.75%, meaning that, given the current volume, we are 95% confident that our conversion rate will fall between 7.99% and 9.75%.
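The bounds can be reproduced with a short sketch. The function name and the default `z` argument are my own; 1.96 is the 95% critical value described above.

```python
import math


def confidence_interval(conversions: int, visitors: int, z: float = 1.96):
    """Return (low, high) bounds: rate plus or minus z * SD / sqrt(n)."""
    rate = conversions / visitors
    standard_deviation = math.sqrt(rate * (1.0 - rate))
    margin = z * standard_deviation / math.sqrt(visitors)
    return rate - margin, rate + margin


low, high = confidence_interval(355, 4003)
print(f"Low Bound = {low:.2%}")
print(f"High Bound = {high:.2%}")
```

Passing a different `z` (for example 1.645 for a 90% level) would narrow the range, at the cost of less confidence that the true rate sits inside it.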
The formulas in this post have been provided by Omniture consulting. The screenshots have been taken from Omniture Test & Target and have been modified for the purpose of this example.
Guest Blogger: Dan Roden
Web Analytic Strategy: Moving Forward with First Downs
Having been in the WA space for seven years now, both on the vendor side with Omniture and now on the client side managing a web analytics group, I have seen a broad spectrum of business requirements, implementations, reporting needs and analysis. I remember working with a major retailer whose massive pre-implementation documentation outlined seemingly every link on their site, a credit card and financial services company that had a wide range of requirements for every facet of its business, a leading media company that was forging ahead as an early adopter of video tracking (long before the OMTR video tracking offering), and so forth. Though all of them had the common goal to augment their businesses with web behavioral knowledge, each had different ideas of what needed to be done first. Some argued that they should focus on one portion of their business as a pilot; others wanted everything ready to go live all at once. Many discussions (some of them louder than others) were had about which method of deployment and measurement would be more beneficial, and valid arguments were made for either side. However, the greatest hindrance to success was thinking that the entire project (from business requirement collection >> implementation >> report distribution >> analysis) could be swallowed whole…one big gulp and then belch out results.
For all who tried to eat the proverbial elephant in one bite, it was a complex and frustrating venture. Implementations had dizzying logic to account for complicated scenarios, and it was nearly impossible to validate completeness once in QA or in production. Each business pushed for their key reporting to be ready first, analysts were interpreting data incorrectly since they did not understand the context in which the data was collected, and so on. This of course led to great dissatisfaction for many, and when the smoke cleared, all that was left was a massive, tangled ball filled with duct tape fixes and shortsighted solutions. It took months upon months to go back and straighten bent nails, re-hang sticking doors and touch up painted walls in their house of analytics that was built in haste. Worse yet, the business had grown weary of the data that was being produced or the way it was interpreted. Even basic reporting was in doubt as many questioned the completeness of the implementation.
I understand that businesses live in a world where time is seemingly running at two-times the normal pace and I also understand the political nature behind priorities for projects. I realize that the business world is not a perfect world and therefore write the remainder of this article with the intended purpose of showing the value of advancing your analytic ball, ten yards at a time (apologies to those who don’t know the game of football very well).
In football, fans love to see exciting plays: the Hail Mary, Flea-Flicker, Double Reverse and kick returns for touchdowns. There is nothing exciting about a four-yard run off-tackle. A slant pass to the slot receiver for seven yards doesn’t raise people out of their seats. But from a strategy standpoint, there is nothing more frustrating for defensive players and coaches than playing against an offense that can consistently move the ball down the field…one first down after another. In your web analytic practice, how well are you advancing the ball? How many projects do you start and actually complete fully? When your projects are complete, are they everything you envisioned before you started? What is the project lifecycle from business requirement to implementation to reporting to analysis?
So many companies that I have worked with don’t want to hear about four-yard plays; they want every play to be a long pass. They want to know “everything about everything”; they want to know how many people with blue eyes were standing on one foot when they opted out of a purchase flow. What do you do about that? When I get a request for “tracking” that comes with no requirements, or the all-encompassing “I need to know everything about everything” request, I know that zero thought has been put into the reporting of the project and therefore I must step in and manage the game.
When I look at everything on my Work Stack, it would be very easy for me to get discouraged. Every play call from the sideline seems to be a post pattern 30 yards down field. However, I have no problem making a few audibles at the line of scrimmage. Here are three things that have helped me move our analytic practice consistently forward that may be of service to you:
• Implementation completion: This is your foundation! How confident are you in your implementation, from pageNames to merchandising eVars, on a scale of 1-10? If you are not at a 9-10, you are not in a good place. Take some time to review your implementation and document the hell out of it. When analyzing your data, any doubt about how it was produced is a showstopper. I highly suggest you leverage an automated service to crawl your site(s) to provide you reporting on which pages have outdated code, variables and logic. The best solution for the money that I have seen is from ObservePoint.
• Ambiguity has no place in requirements: Never accept the phrase “need tracking” as the sole requirement from the business. At the end of the day, when the business comes back and asks pointed questions (which they will), you will be responsible to point them to the answers. I suggest that you kindly reply with some “base metrics” that will be available with your current implementation but put the onus back on the business to define specific reporting/analysis requirements. Nothing will derail your data reputation faster than being able to only answer 30% of the questions that are sure to come. Additionally, every project will seemingly drag on and on, as you have to repeatedly update your implementation to answer the requirements that trickle in over time.
• Help the business understand the value of the four-yard run on first down: When you get a project of sizable proportion, with requirements so complex that you don’t even know if it’s possible, take the time to break the request down into controlled, manageable parts. The goal is still to score a touchdown, but explain the value in doing it with a series of high percentage plays. For example, make sure the base metrics are implemented correctly first, then move to interaction tracking, events and eVar expiration. Once those are confirmed to be implemented and reporting correctly, move to correlations/sub-relations, classifications and automated report delivery. Architect a design solution that shows the business milestones and what will be available at each. Let the business see that even though the progress is methodical, it will lead them to exactly what they are hoping for: a touchdown!