Statistically Comparing Two Alternatives
David J. Lilja, Ph.D., P.E.
Course Overview
Due to measurement noise, it can often be difficult to determine whether a change made to a system had any real impact. Measurement noise also makes it difficult to compare the performance of two different systems. In this course, you will extend your understanding of confidence intervals to determine whether there is a statistically significant difference in the performance of two alternatives. You will learn about both before-and-after comparisons and comparisons using unpaired measurements.
This course includes a multiple-choice quiz at the end, which is designed to enhance your understanding of the course materials.
Learning Objective
After completing this 2-hour course, you will be able to:
- Use confidence intervals to determine whether there is a statistically significant difference between two alternatives
- Apply the before-and-after comparison technique to paired measurements
- Compare two systems using unpaired measurements, including computing the standard deviation of the difference of the means and the effective number of degrees of freedom
- Determine whether there is a statistically significant difference between two proportions
Reading Assignment
The reading assignment for this course is Chapter 5.1 of Measuring Computer Performance: A Practitioner's Guide, David J. Lilja, Cambridge University Press, 2000.
If you don't have this book, you can purchase the required chapter in PDF format online at eBooks.com for a modest cost. The price for this course listed on this website does not include the cost of purchasing the chapter through eBooks.com. However, the course price has been reduced to compensate for the cost of purchasing the required chapter. If you plan to take all 6 courses (E132 to E137) based on this book, you may want to consider purchasing a hard copy of the book or the entire book in PDF format through eBooks.com.
Key Terms
Study Notes
Errors (noise) in a set of measurements introduce uncertainty into the results you can obtain from those measurements. If you are trying to compare the performance of two different computer systems, for instance, the noise in your measurements may obscure whether or not there is a real difference between the systems.
In this course, you will learn how confidence intervals can be used to determine if there is a statistically significant difference between two systems in spite of the noise in your measurements.
To make this type of comparison, you typically would take several measurements of the execution time of the same benchmark program running on two different systems. You then find a confidence interval for the difference of the means of the two sets of measurements. If this confidence interval includes 0, you must conclude that there is not a statistically significant difference between the two systems at the given confidence level. This approach is quite general and can be applied even when you do not have the same number of measurements on each system. There are a few details you need to be careful about when applying this technique, such as how to compute the standard deviation of the difference of the means and how to determine the effective number of degrees of freedom.
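As an illustration of this unpaired comparison, the short Python sketch below computes a confidence interval for the difference of two means using the usual formulas: the standard deviation of the difference of the means is sqrt(s1^2/n1 + s2^2/n2), and the effective number of degrees of freedom comes from the Welch-Satterthwaite approximation. The function name, the example execution times, and the use of SciPy are illustrative assumptions, not part of the course materials.

import math
from scipy import stats

def unpaired_diff_ci(a, b, confidence=0.90):
    """Confidence interval for the difference of means of two
    independent (unpaired) sets of measurements."""
    n1, n2 = len(a), len(b)
    m1 = sum(a) / n1
    m2 = sum(b) / n2
    # Sample variances (n - 1 in the denominator).
    v1 = sum((x - m1) ** 2 for x in a) / (n1 - 1)
    v2 = sum((x - m2) ** 2 for x in b) / (n2 - 1)
    # Standard deviation of the difference of the means.
    s_diff = math.sqrt(v1 / n1 + v2 / n2)
    # Effective number of degrees of freedom (Welch-Satterthwaite approximation).
    ndf = (v1 / n1 + v2 / n2) ** 2 / (
        (v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1)
    )
    t_crit = stats.t.ppf(1 - (1 - confidence) / 2, ndf)
    diff = m1 - m2
    return diff - t_crit * s_diff, diff + t_crit * s_diff

# Hypothetical execution times (seconds) of the same benchmark on two systems.
sys1 = [5.2, 5.7, 5.3, 5.5, 5.8, 5.4]
sys2 = [5.0, 5.1, 4.9, 5.2, 5.0]
lo, hi = unpaired_diff_ci(sys1, sys2, confidence=0.90)
print(f"90% CI for the difference of means: [{lo:.3f}, {hi:.3f}]")
# If this interval includes 0, the difference between the two systems is
# not statistically significant at the 90% confidence level.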
If you know that there is some direct correspondence between individual measurements of the two different systems, you can apply the before-and-after comparison technique. This technique is most appropriate when you are trying to determine whether a change you have made to a system has any statistically significant impact. For instance, you may be trying to determine whether adding more memory to a system produces a statistically significant change in performance. This technique is very similar to the one described above, except that you compute a confidence interval for the mean of the differences of each pair of measurements. This technique is quite straightforward once you understand the basics of confidence intervals. It does require that the same number of measurements be made on each before-and-after configuration of the system, though.
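A similar sketch for the before-and-after (paired) case computes a confidence interval for the mean of the paired differences, using n - 1 degrees of freedom. Again, the function name and the before/after execution times are hypothetical, and SciPy is used only for the t-distribution.

import math
from scipy import stats

def paired_diff_ci(before, after, confidence=0.90):
    """Confidence interval for the mean of the differences of
    paired (before-and-after) measurements."""
    assert len(before) == len(after), "paired data must have equal lengths"
    n = len(before)
    d = [x_before - x_after for x_before, x_after in zip(before, after)]
    d_mean = sum(d) / n
    d_var = sum((x - d_mean) ** 2 for x in d) / (n - 1)
    # Standard error of the mean difference; n - 1 degrees of freedom.
    se = math.sqrt(d_var / n)
    t_crit = stats.t.ppf(1 - (1 - confidence) / 2, n - 1)
    return d_mean - t_crit * se, d_mean + t_crit * se

# Hypothetical execution times (seconds) before and after adding memory.
before = [85.0, 83.2, 94.1, 90.2, 88.5]
after = [86.5, 79.9, 90.3, 85.4, 83.1]
lo, hi = paired_diff_ci(before, after, confidence=0.90)
print(f"90% CI for the mean of the differences: [{lo:.3f}, {hi:.3f}]")
# An interval that includes 0 means the change had no statistically
# significant impact at this confidence level.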
Finally, a similar type of confidence interval approach can be used to determine if there is a statistically significant difference between two proportions.
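For completeness, the sketch below builds the corresponding confidence interval for the difference of two proportions, using the normal approximation that is appropriate when the number of samples is large. The function name and the counts shown are hypothetical.

import math
from scipy import stats

def proportion_diff_ci(m1, n1, m2, n2, confidence=0.90):
    """Confidence interval for the difference of two proportions
    (m successes out of n samples on each system)."""
    p1, p2 = m1 / n1, m2 / n2
    # Standard deviation of the difference of the proportions.
    s = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    # Large sample counts justify using the standard normal distribution.
    z = stats.norm.ppf(1 - (1 - confidence) / 2)
    diff = p1 - p2
    return diff - z * s, diff + z * s

# Hypothetical counts of "successes" out of total samples on each system.
lo, hi = proportion_diff_ci(m1=842, n1=1300, m2=785, n2=1250, confidence=0.90)
print(f"90% CI for the difference of proportions: [{lo:.3f}, {hi:.3f}]")
# If the interval includes 0, the two proportions are not statistically
# significantly different at this confidence level.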
Quiz
Once you finish studying the above course content, you need to take a quiz to obtain the PDH credits.