What is Reliability?
Reliability refers to the consistency of measurement, that is, how consistent evaluation results are from one measurement to another.
Reliability is concerned with the extent to which an evaluation test is consistent in measuring what it is intended to measure.
If the measurements are not consistent over different occasions or over different samples of the same performance domain, the evaluator can have little confidence in the results.
A test score is reliable when there is good reason to believe that it is stable and trustworthy.
These characteristics depend on the extent to which the score is free from chance error. The same test, administered repeatedly to the same group of individuals, is expected to yield the same pattern of scores.
Methods of Reliability
Types of Reliability Tests
The four methods of estimating reliability are:
- Test-Retest Method
- Equivalent Forms Method
- Split Half Method
- Kuder Richardson Method
Test-Retest Method
In this method, the same tool or instrument is administered to the same sample on two different occasions.
The resulting test scores are correlated and the correlation coefficient provides a measure of stability over a given period of time.
If the results are highly stable, respondents who score high on one administration of the test will also score high on the other, and the remaining respondents will tend to keep the same relative positions on both administrations.
The time interval between tests is an important factor to keep in mind when interpreting measures of stability.
- If the time interval is short (say 1-2 days), the consistency of results will be inflated because respondents will remember some of their answers from the first test.
- If the time interval is quite long (say 1 year), the results will be influenced by the actual changes in the respondent over that period of time.
Therefore, the best time interval between test administrations depends mainly on how the results will be used.
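The stability coefficient described above can be illustrated with a short Python sketch. It assumes the correlation in question is the Pearson product-moment correlation, which is the usual choice but is not named in the text. The function name pearson_r and the two score lists are hypothetical and exist only for illustration.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two score lists of equal length (at least 2)")
    mx, my = mean(x), mean(y)
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when scores do not vary")
    return sxy / sqrt(sxx * syy)


# Hypothetical scores of the same five respondents on two occasions.
first_administration = [12, 15, 9, 20, 17]
second_administration = [13, 14, 10, 19, 18]

stability = pearson_r(first_administration, second_administration)
print(f"test-retest coefficient: {stability:.2f}")
```

A coefficient close to 1 means that respondents kept nearly the same relative positions on both occasions. The same calculation applies to the equivalent forms method, with the two lists holding scores on the two forms instead of the two occasions.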
Equivalent Forms Method
This method uses two versions of an instrument given to the same sample of respondents.
The two forms of the instrument are administered to the same group of respondents in close succession, and the resulting scores are correlated.
The correlation coefficient provides a measure of equivalence. It indicates the degree to which both forms of the test are measuring the same aspects of behavior.
The equivalent forms method reflects short-term constancy of respondents' performance and the extent to which the test represents an adequate sample of the characteristics being measured.
Split Half Method
Reliability can also be estimated from a single administration of a single form of a test. The test is administered to a group of respondents in the usual manner and is then divided into halves for scoring purposes.
To split the test into halves that are most equivalent, the usual procedure is to score the even-numbered and the odd-numbered items separately.
This produces two scores for each respondent, which, when correlated, provide a measure of internal consistency.
A reliability coefficient is determined by correlating the scores of two half-tests.
The split half method is similar to the equivalent forms method in that it indicates the extent to which the sample of test items is a dependable sample of the content being measured.
A high correlation between the scores on the two halves of a test denotes the equivalence of the two halves and, consequently, the adequacy of the sampling.
The advantage of this method is that all data for calculating the reliability coefficient can be collected in one sitting, thereby avoiding variations due to two sessions.
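The odd-even procedure can be sketched in Python as follows. The function names and the item scores are hypothetical. The sketch also applies the Spearman-Brown correction, which is not mentioned in the text above but is commonly used because the correlation between two half-tests describes a test only half as long as the one actually given.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def split_half(item_scores):
    """Return (half-test correlation, Spearman-Brown corrected estimate).

    item_scores holds one list per respondent, with one score per item.
    """
    # Items 1, 3, 5, ... sit at list indices 0, 2, 4, ...
    odd_totals = [sum(row[0::2]) for row in item_scores]
    even_totals = [sum(row[1::2]) for row in item_scores]
    r_half = pearson_r(odd_totals, even_totals)
    # Spearman-Brown: project the half-test correlation to full length.
    r_full = 2 * r_half / (1 + r_half)
    return r_half, r_full


# Hypothetical right/wrong (1/0) scores: five respondents, six items.
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 1],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]

r_half, r_full = split_half(responses)
print(f"half-test correlation: {r_half:.2f}")
print(f"Spearman-Brown estimate: {r_full:.2f}")
```

With a positive half-test correlation, the corrected estimate is always the larger of the two figures, so it matters which one is being reported.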
Kuder Richardson Method
Another method of estimating the reliability of test scores from a single administration of a single form of a test is by means of formulas developed by Kuder and Richardson.
These formulas provide a measure of internal consistency, as with the split-half method, but do not require splitting the test into halves for scoring purposes.
Kuder-Richardson estimates of reliability provide information about the degree to which the items in the test measure similar characteristics.
For a test with relatively homogeneous content, the reliability estimate generally will be similar to that provided by the split half method.
In fact, the Kuder-Richardson estimate can be thought of as an average of all of the possible split-half coefficients for the group tested.
This is an advantage for tests with relatively homogeneous content, since the estimate does not depend on the way in which the items are assigned to the two half-tests, as it does in the split-half method.
However, for tests designed to measure more heterogeneous learning outcomes, the Kuder-Richardson estimate will be smaller than the split-half estimate, and the latter method is to be preferred.
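As an illustration, the sketch below computes Kuder-Richardson formula 20 (KR-20), the best known of the Kuder-Richardson formulas. The text above does not say which formula it has in mind, so KR-20 is an assumption here. For k items scored right or wrong, KR-20 = (k / (k - 1)) * (1 - sum(p * q) / variance of total scores), where p is the proportion of respondents answering an item correctly and q = 1 - p. The function name and the data are hypothetical, and the population variance of the total scores is used so that it is on the same footing as the p * q item variances.

```python
from statistics import pvariance


def kr20(item_scores):
    """KR-20 for right/wrong (1/0) items, one list of scores per respondent."""
    n_respondents = len(item_scores)
    n_items = len(item_scores[0])
    # p = proportion answering each item correctly; p * (1 - p) is its variance.
    sum_pq = 0.0
    for i in range(n_items):
        p = sum(row[i] for row in item_scores) / n_respondents
        sum_pq += p * (1 - p)
    # Population variance of each respondent's total score.
    total_variance = pvariance([sum(row) for row in item_scores])
    if total_variance == 0:
        raise ValueError("KR-20 is undefined when all total scores are equal")
    return (n_items / (n_items - 1)) * (1 - sum_pq / total_variance)


# Hypothetical right/wrong (1/0) scores: five respondents, six items.
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 1],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]

kr20_estimate = kr20(responses)
print(f"KR-20 estimate: {kr20_estimate:.2f}")
```

No splitting into halves is involved: the estimate uses every item's difficulty and the spread of the total scores, so it does not change with the order in which the items appear.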