FAQ - I have heard that multiple-choice questions (MCQs) are biased in favour of males. What is the evidence for this?

Answer

The largest controlled study looking at the topic of MCQs and gender bias was conducted by the Educational Testing Service (Cole, 1997) over four years with 400 different tests and more than 1500 data sets involving millions of students ranging in age from Grade 4 (9 years old) to graduate school, including those taking the MCAT (Medical College Admission Test). A clear finding was that asking students to produce the answer rather than select the answer (constructed response versus multiple choice format) did not produce different gender effects when the same question was asked in different formats. (Other studies with the same results include DeMars, 1998; Hamilton and Snow, 1998; Ryan and Fan, 1996.) However, among the constructed-response questions, those requiring written responses favoured females and those requiring the production of a figure or the interpretation of graphical information favoured males. This kind of content effect may be of more importance than the question format.

Other studies also indicate that it is primarily the content and not the format that shows a gender effect. For instance, a study (Ryan and Fan, 1996) of 6000 14-year-old students on an international mathematics test showed that males scored better on algebra and geometry, but females performed better on arithmetic, though all questions were presented in MCQ format. Similarly, the ETS study found females performing better on computation and males on conceptual questions (though the differences were extremely small in real terms).

A study (Hamilton and Snow, 1998) of tests in biology and astronomy taken by 1100 17-year-olds showed no format effect when comparing constructed-response questions and MCQs, but males scored better on questions involving spatial or visual content. Interestingly, a short lesson on spatial-mechanical reasoning eliminated the effect. This could indicate a difference related to experience rather than an inherent gender trait. This is also suggested by a study by Byrnes et al. (1997) in which maths items that showed a large gender difference in American students showed no such difference in Chinese students.

Some studies have found gender differences with multiple-choice questions favouring males, the most thorough of which is Breland (1991). However, this study looked at the Advanced Placement Tests in History in the US, a test that is taken by students in advanced courses in the subject who are, in general, high ability students. This presents another possible complicating factor.

A consistent finding in the ETS study is that scores in representative samples of the general population invariably show a larger spread for males than for females: more high scorers and more low scorers. If stable differences are indeed seen in vet, medical or dental students, this could be a factor, as these populations are likely to be self-selecting for higher-ability students. If the range of scores is greater for males in general, then any high-ability sample is likely to magnify a male performance advantage because of the higher proportion of high scorers relative to the female sample. This does not necessarily indicate a biased test, but reflects a restricted sample from the statistical distribution of scores in the general population. (See Cole, 1994, for a more thorough discussion.)
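The selection argument above can be illustrated with a small numerical sketch. It assumes two normally distributed score populations with the same mean and slightly different spreads; the means, standard deviations and cut-offs are hypothetical values chosen for illustration and are not taken from the ETS study or any other source cited here.

```python
from statistics import NormalDist

# Hypothetical score distributions: identical means, slightly different spreads.
# These values are illustrative assumptions, not figures from any cited study.
narrow = NormalDist(mu=100.0, sigma=15.0)  # group with the smaller spread
wide = NormalDist(mu=100.0, sigma=16.5)    # group with the larger spread


def share_above(dist, cutoff):
    """Fraction of a population scoring above the cutoff."""
    return 1.0 - dist.cdf(cutoff)


def tail_ratio(cutoff):
    """Share of the wide group above the cutoff divided by that of the narrow group."""
    return share_above(wide, cutoff) / share_above(narrow, cutoff)


# The more selective the cutoff, the more the wider group is over-represented,
# even though the two groups have exactly the same average score.
for cutoff in (100, 115, 130, 145):
    print(f"cutoff {cutoff}: ratio of shares above cutoff = {tail_ratio(cutoff):.2f}")
```

Under these assumptions the ratio is 1 at the common mean and rises as the cut-off becomes more selective, which is the sense in which a high-ability sample can show a group difference that is absent in the population averages.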

Conclusions: Current evidence at most suggests a weak systematic gender bias for MCQs, though there are potential complicating factors. For instance, there is some indication that high-ability students, who include most medical, dental and veterinary students, show greater gender effects, with males scoring slightly higher on MCQs and females on constructed-response items. There is also evidence that 'well-known' gender differences, e.g. in maths and science, are significantly smaller than 30 years ago (Cole). Again, changes in educational experiences may be decreasing previously observed differences. Also, it is critical that questions, in any format, are well-written, reliable and actually measure what is intended. (See NBME handbook by Case and Swanson for very useful guidelines on writing many types of questions.)

Caveat: Many of the studies cited here were conducted on high school students in the USA. The oldest students were of approximately the same age as first year students in UK universities - 17-18 years old - and so the results are at least somewhat likely to be applicable. So far, I have found very few (0) published studies of gender differences in test items involving UK medical, dental or vet students. A more complete review of this topic is underway. Please let me know (jean@ltsn-01.ac.uk) of studies that are not included here, particularly of medical, dental or vet students, ideally in the UK.

References:

Breland, H. M. (1991). A Study of Gender and Performance on Advanced Placement History Examinations. College Board Report No. 91-4. 44p.

Byrnes, J.P., Hong, L. and Xing, S. (1997). Gender differences on the math subtest of the Scholastic Aptitude Test may be culture-specific, Educational Studies in Mathematics, v34, n1, pp. 49-66.

Case, S. M. and Swanson, D. B. (2001). Constructing written test questions for the basic and clinical sciences: Third Edition. National Board of Medical Examiners. Available at: http://www.nbme.org/nbme/itemwriting.htm

Cole, N. (1997). The ETS Gender Study: How Females and Males Perform in Educational Settings. ETS Technical Report, Available at: ftp://etsis1.ets.org/pub/res/gender.pdf

DeMars, C.E. (1998). Gender differences in mathematics and science on a high school proficiency exam: the role of response format, Applied Measurement in Education, 11(3), pp. 279-99.

Hamilton, L. S. and Snow, R. E. (1998). Exploring differential item functioning on science achievement tests, CRESST Report 483.

Ryan, K.E. and Fan, M. (1996). Examining gender DIF on a multiple-choice test of mathematics: a confirmatory approach. Educational Measurement: Issues and Practice, 15(4), pp. 15-20.

 

Disclaimer: This FAQ was originally written by Dr Jean McKendree and amended by Christopher Smith. It does not reflect an official endorsement by the HEA or any other organisation. Any questions or queries should be sent to enquiries@medev.ac.uk

Last updated: 04 July 2011

 
 
MEDEV, School of Medical Sciences Education Development,
Faculty of Medical Sciences, Newcastle University, NE2 4HH
