The Uncertainty: Validity and Reliability

Uncertainty is something we have to deal with in every scientific measurement. Science provides methods for doing things effectively and efficiently. This sounds great. However, scientific methods are always being improved and can never be perfect.

The uncertainty principle in physics states that it is impossible to measure or calculate both the position and the momentum of a quantum particle with absolute precision. There is always some degree of uncertainty: the more precisely the position of the particle is determined, the less is known about its momentum, and vice versa. This uncertainty belongs to the world of quantum mechanics, which deals with the smallest scales of nature, such as atoms and subatomic particles.

Measurement uncertainty in our physical world, the world we see around us, means that a measured value can never be exactly the true value of the object being measured. There is always some difference between the measurement and the true value. We use accuracy to describe how close a measurement is to the true value.

A length measurement demonstrates this. Suppose we use three rulers with different scales to measure the same blue line, as shown in the figure below. The length of the blue line is the true value we want to determine. Ruler A has a smallest division of 1 dm. The end of the blue line falls between 0 and 1 dm. Since Ruler A has no marks between 0 and 1 dm, we have to estimate from the known scale, say around 0.6 dm. With Ruler B, whose smallest division is 1 cm, we can see that the blue line ends between 6 cm and 7 cm: it is longer than 6 cm but shorter than 7 cm. However, since there are no marks between 6 cm and 7 cm, we still need to estimate, say around 6.2 cm. With Ruler C, whose smallest division is 1 mm, the measurement improves because of the finer divisions. We can read that the blue line ends between 62 mm and 63 mm, but we still need to estimate the last digit, say around 62.5 mm.

This is the situation with any measurement made with a scaled device. A finer scale gives a more accurate measurement, i.e. one closer to the true value. However, a scale can only be divided to some extent, so we always have to estimate the last digit. The measurement is obtained by comparison with a standard scale. It is close to the true value, but it is never the true value. The difference between the measured result and the true value is called error. To get more accurate results, we need to make sure the devices or instruments we use have standardized scales.

To check the measured results, we can repeat the measurement. If repeated measurements give similar results, we say the measurement is precise, i.e. it can be reproduced. The standard deviation is used to evaluate precision. The uncertainty can be big or small, but we cannot get rid of it in our measurements.

(A length measurement)
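To make the idea of precision concrete, here is a minimal sketch in Python. The six readings are hypothetical values invented for illustration (they are not from the figure); they imitate repeating the Ruler C measurement, where the estimated last digit changes from trial to trial.

```python
from statistics import mean, stdev

# Hypothetical repeated readings of the blue line with Ruler C, in millimetres.
# The last digit of each reading is an estimate, so it varies between trials.
readings_mm = [62.5, 62.4, 62.6, 62.5, 62.3, 62.7]

best_estimate = mean(readings_mm)  # the value we would report
spread = stdev(readings_mm)        # sample standard deviation, a measure of precision

print(f"best estimate: {best_estimate:.2f} mm")
print(f"standard deviation: {spread:.2f} mm")
```

A small standard deviation says the readings agree with each other. It does not say they are close to the true value, which stays unknown.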

Measurement in the metric system is straightforward. Measuring more complex constructs, such as confidence, efficacy, and determination, is more complicated. Reliability and validity are used to evaluate the methods that measure constructs. Reliability is like precision: it deals with how consistent a measurement is, i.e. whether the measurement can be reproduced under the same conditions. Validity is like accuracy: whether the measurement result really represents what it is supposed to measure. Uncertainty persists in all measurements. Reliability (random error) and validity (systematic error) are two facets of measurement uncertainty. We want to know how reliable and how valid our procedure or method is, so both need to be evaluated for any measurement. In statistical terms, how far the mean of repeated measurements lies from the true or reference value reflects validity, and the standard deviation reflects reliability.
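The difference between the two facets can be sketched with a small Python example. Everything in it is an assumption made for illustration: the reference value, the two sets of readings, and the helper name `summarize` are hypothetical. In practice the true value of a construct is not known, and an accepted reference would have to stand in for it.

```python
from statistics import mean, stdev

def summarize(readings, reference):
    """Return (systematic offset, random spread) of readings against a reference."""
    offset = mean(readings) - reference  # systematic error: the validity facet
    spread = stdev(readings)             # random error: the reliability facet
    return offset, spread

reference = 70.0  # hypothetical accepted reference value

# Hypothetical readings from two instruments measuring the same thing.
consistent_but_off = [72.0, 72.1, 71.9, 72.0, 72.0]   # reliable, but not valid
on_target_but_noisy = [67.0, 73.0, 70.5, 68.5, 71.0]  # valid on average, but not reliable

offset_a, spread_a = summarize(consistent_but_off, reference)
offset_b, spread_b = summarize(on_target_but_noisy, reference)

print(f"instrument A: offset {offset_a:+.2f}, spread {spread_a:.2f}")
print(f"instrument B: offset {offset_b:+.2f}, spread {spread_b:.2f}")
```

Instrument A repeats itself well but is consistently off the reference. Instrument B is centered on the reference but scatters widely. A good method needs both a small offset and a small spread.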

The methods of reliability evaluation:

  • Test-retest reliability: the consistency in measurements of the same construct administered to the same sample at different time points (the same test over time).
  • Inter-rater reliability: the consistency in measurements conducted by two or more independent raters (the same test conducted by different people).
  • Internal consistency reliability: the consistency in measurements of different items of the same construct (the individual items of a test).
  • Parallel forms reliability: the consistency in measurements of two equivalent versions of a test.           
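Two of the methods above are often put into numbers as follows. This is a sketch, not a prescribed procedure: the post does not name specific statistics, so the choice of the Pearson correlation for test-retest reliability and Cronbach's alpha for internal consistency is an assumption, and all scores are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))

def cronbach_alpha(rows):
    """Cronbach's alpha; each row holds one respondent's scores on every item."""
    k = len(rows[0])                                   # number of items
    item_vars = [variance(col) for col in zip(*rows)]  # variance of each item
    total_var = variance([sum(row) for row in rows])   # variance of total scores
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical scores of five people on the same test at two time points.
test_scores = [10, 12, 14, 16, 18]
retest_scores = [11, 12, 15, 15, 19]
r = pearson_r(test_scores, retest_scores)

# Hypothetical answers of five respondents to three items of one construct.
item_scores = [
    [3, 4, 3],
    [4, 5, 4],
    [2, 2, 3],
    [5, 5, 5],
    [1, 2, 2],
]
alpha = cronbach_alpha(item_scores)

print(f"test-retest correlation: {r:.2f}")
print(f"Cronbach's alpha: {alpha:.2f}")
```

Values close to 1 indicate consistent measurements. What counts as high enough depends on the field and the purpose of the test.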

The methods of validity evaluation:

  • Construct validity: whether a test actually measures the construct it claims to measure.
  • Content validity: whether a test is representative of all aspects of the construct.
  • Face validity (subjective): whether the content of a test appears to be suited to the aims of the measurement.
  • Criterion validity (predictive): whether the results of a test can predict a concrete outcome.
  • Internal validity: whether a cause-and-effect relationship established in a test cannot be explained by other factors.
  • External validity (generalizability): whether the results of a test can be applied to other situations, groups or events.

(Note: constructs are concepts or topics in research. They are used as tools to help explain the components of a theory when trying to understand human behavior.)

Construct measurements:

 


