Ensuring Test Reliability
According to classical test theory, reliability rests on the notion that an observed test score is composed of two parts: a true score and error. A true score is the expected score on a test over an infinite number of testing instances; it is a theoretical quantity that can never be known for certain. Errors are inaccuracies that make actual (observed) test scores differ from true scores.
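The decomposition can be written compactly. The symbols below are illustrative choices rather than notation from the readings: X is the observed score, T the true score, and E the error. The sketch assumes that error is uncorrelated with the true score, which is the usual classical test theory assumption.

```latex
X = T + E, \qquad
\sigma_X^2 = \sigma_T^2 + \sigma_E^2, \qquad
\rho_{XX'} = \frac{\sigma_T^2}{\sigma_X^2} = \frac{\sigma_T^2}{\sigma_T^2 + \sigma_E^2}
```

Under these assumptions, reliability is the share of observed-score variance that is due to true scores, so it approaches 1 as error variance shrinks.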
There are several ways to measure a test’s reliability. Test-retest reliability is the correlation between scores from an original test administration and a retest. The interval between the two administrations should be shorter than the time it takes for true scores to change. Test-retest reliability captures error due to time.
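As a rough illustration, a test-retest coefficient can be computed as the Pearson correlation between the two sets of scores. The applicant scores and the two-week interval below are hypothetical, and `pearson_r` is a helper written for this sketch, not a function from any particular testing package.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical scores for six applicants who took the same test twice,
# two weeks apart (made-up data for illustration only).
first_administration = [72, 85, 90, 64, 78, 88]
retest = [70, 88, 91, 66, 75, 90]

test_retest_reliability = pearson_r(first_administration, retest)
print(round(test_retest_reliability, 2))
```

A coefficient near 1 suggests that applicants kept roughly the same rank order across the two administrations; a low value suggests that scores shifted over time or that the test measures inconsistently.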
Alternate-form reliability looks at the correlation between two different versions of a test. Split-half reliability is similar to alternate-form reliability: a single test is split into two halves (usually odd and even items), and scores on the two halves are correlated. Cronbach’s coefficient alpha is also similar, essentially providing the average of all possible split-half reliabilities. Alternate-form, split-half, and Cronbach’s coefficient alpha all look at error due to content sampling.
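The split-half and coefficient alpha calculations can be sketched in a few lines. The item responses below are hypothetical (five test takers, four items scored 1 to 5), and the function names are chosen for this example. Alpha is computed with the standard variance formula, k / (k - 1) times one minus the ratio of summed item variances to total-score variance.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def variance(values):
    """Sample variance (n - 1 in the denominator)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


def split_half_r(item_scores):
    """Correlate odd-item and even-item half scores across test takers."""
    odd_totals = [sum(row[0::2]) for row in item_scores]   # items 1, 3, ...
    even_totals = [sum(row[1::2]) for row in item_scores]  # items 2, 4, ...
    return pearson_r(odd_totals, even_totals)


def cronbach_alpha(item_scores):
    """Coefficient alpha from item variances and total-score variance."""
    k = len(item_scores[0])
    item_vars = [variance([row[i] for row in item_scores]) for i in range(k)]
    total_var = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: one row per test taker, one column per item.
responses = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [5, 5, 4, 4],
    [1, 2, 2, 1],
    [3, 3, 4, 3],
]

split_half = split_half_r(responses)
alpha = cronbach_alpha(responses)
print(round(split_half, 2), round(alpha, 2))
```

Both statistics come from a single administration, which is why they speak to error from content sampling rather than error due to time.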
For this Discussion, select three test reliability methods that could be used in employment test development. Consider advantages and disadvantages of each method.
With these thoughts in mind:
Post by Day 4 a description of the three test reliability methods you chose. Explain advantages and disadvantages of using each method in employment test development. Be specific and provide examples. Support your response using the Learning Resources and the current literature.
Be sure to support your postings and responses with specific references to the Learning Resources.
Read a selection of your colleagues’ postings.
References (if needed)
READINGS
- American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (1999). Standards for educational and psychological testing. Washington, DC: American Educational Research Association.
- Chapter 2, “Reliability and Errors of Measurement”
- Cortina, J. M. (1993). What is coefficient alpha? An examination of theory and applications. Journal of Applied Psychology, 78(1), 98–104.
Retrieved from the Walden Library databases.
- Loevinger, J. (1954). The attenuation paradox in test theory. Psychological Bulletin, 51(5), 493–504.
Retrieved from the Walden Library databases.
- Schmitt, N. (1996). Uses and abuses of coefficient alpha. Psychological Assessment, 8(4), 350–353.
Retrieved from the Walden Library databases.
- Smith, G. T., McCarthy, D. M., & Anderson, K. G. (2000). On the sins of short form development. Psychological Assessment, 12(1), 102–111.
Retrieved from the Walden Library databases.
- Wainer, H. (1986). Can a test be too reliable? Journal of Educational Measurement, 23(2), 171–173.
Wainer, H., Can a Test Be Too Reliable? In Journal of Educational Measurement. Copyright 1986 Blackwell Publishing Journals. Used with permission from the National Council on Measurement in Education via the Copyright Clearance Center.
Optional Resources
- Cattell, R. B. (1986). Dodging the third error source: Psychological interpretation and use of given scores. In R. B. Cattell & R. C. Johnson (Eds.), Functional psychological testing: Principles and instruments. New York, NY: Brunner/Mazel.
- Cronbach, L. J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika, 16(3), 297–334.
Retrieved from the Walden Library databases.
- Loevinger, J., Gleser, G. C., & DuBois, P. H. (1953). Maximizing the discriminating power of a multiple score test. Psychometrika, 18(4), 309–331.
- Sireci, S. G., Thissen, D., & Wainer, H. (1991). On the reliability of testlet-based tests. Journal of Educational Measurement, 28(3), 237–247.
Retrieved from the Walden Library databases.