Examining the validity of different assessment modes in measuring competence in performing human services.
Darmawan, I Gusti Ngurah
Keeves, John Philip
This article addresses an important problem facing educators: assessing students' levels of competence in learned tasks. Data from 165 students in Massachusetts and Minnesota in the United States are used to examine the validity of five assessment modes (multiple-choice test, scenario, portfolio, self-assessment and supervisor rating) in measuring competence in the performance of 12 human service skills. The data are examined using two analytical theories, item response theory (IRT) and generalizability theory (GT); a prior examination using classical test theory (CTT) proved largely unprofitable. Under the IRT approach with Rasch scaling procedures, the results show that scores obtained using the five assessment modes can be measured on a single underlying scale, but the model fits the data better if five scales (corresponding to the five assessment modes) are employed. In addition, under Rasch scaling procedures, the correlations between the scores of the assessment modes range from small to very strong (0.11 to 0.80). However, under the GT approach with hierarchical linear modelling (HLM) analytical procedures, the correlations between scores from the five assessment modes are consistently strong to very strong (0.53 to 0.95). It is argued that the correlations obtained with the GT approach provide a better picture of the relationships between the assessment modes than those obtained under the IRT approach, because the former are computed taking into consideration the operational design of the study. Results from both the IRT and GT approaches show that the mean scores given by supervisors are considerably higher than the mean scores from the other four assessment modes, indicating that supervisors tend to be more generous in rating the skills of their students. [Author abstract]
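For readers unfamiliar with the Rasch scaling referred to in the abstract: the Rasch model places persons and items on a single logit scale, with the probability of success depending only on the difference between person ability and item difficulty. The sketch below is illustrative background only, not code or data from the article; the function and variable names are our own.

```python
import math

def rasch_probability(theta: float, b: float) -> float:
    """Dichotomous Rasch model: probability that a person with
    ability theta (in logits) succeeds on an item of difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

# When ability equals item difficulty, the success probability is 0.5;
# higher ability relative to difficulty yields a higher probability.
p_equal = rasch_probability(0.0, 0.0)   # 0.5
p_abler = rasch_probability(1.5, 0.0)   # > 0.5
```

Because both persons and items sit on the same scale, scores from different assessment modes can, in principle, be compared on one underlying dimension, which is what the single-scale versus five-scale comparison in the abstract tests.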