Quality of evidence revealing subtle gender biases in science is in the eye of the beholder
Edited by Susan T. Fiske, Princeton University, Princeton, NJ, and approved September 16, 2015 (received for review May 31, 2015)

Significance
Ever-growing empirical evidence documents a gender bias against women and their research—and favoring men—in science, technology, engineering, and mathematics (STEM) fields. Our research examined how receptive the scientific and public communities are to experimental evidence demonstrating this gender bias, which may contribute to women’s underrepresentation within STEM. Results from our three experiments, using general-public and university faculty samples, demonstrated that men evaluate the quality of research unveiling this bias as less meritorious than do women. These findings may inform and fuel self-correction efforts within STEM to reduce gender bias, bolster objectivity and diversity in STEM workforces, and enhance discovery, education, and achievement.
Abstract
Scientists are trained to evaluate and interpret evidence without bias or subjectivity. Thus, growing evidence revealing a gender bias against women—or favoring men—within science, technology, engineering, and mathematics (STEM) settings is provocative and raises questions about the extent to which gender bias may contribute to women’s underrepresentation within STEM fields. To the extent that research illustrating gender bias in STEM is viewed as convincing, the culture of science can begin to address the bias. However, are men and women equally receptive to this type of experimental evidence? This question was tested with three randomized, double-blind experiments—two involving samples from the general public (n = 205 and 303, respectively) and one involving a sample of university STEM and non-STEM faculty (n = 205). In all experiments, participants read an actual journal abstract reporting gender bias in a STEM context (or an altered abstract reporting no gender bias in experiment 3) and evaluated the overall quality of the research. Results across experiments showed that men evaluate the gender-bias research less favorably than women, and, of concern, this gender difference was especially prominent among STEM faculty (experiment 2). These results suggest a relative reluctance among men, especially faculty men within STEM, to accept evidence of gender biases in STEM. This finding is problematic because broadening the participation of underrepresented people in STEM, including women, necessarily requires a widespread willingness (particularly by those in the majority) to acknowledge that bias exists before transformation is possible.
Objectivity is a fundamental value in the practice of science and is required to optimally assess one’s own research findings, others’ findings, and the merits of others’ abilities and ideas (1). For example, when scientists evaluate data collected on a potentially controversial topic (such as climate change), they strive to set aside their own belief systems and instead focus solely on the strength of the data and conclusions warranted. Similarly, when scientists evaluate a resume for a laboratory-manager position or assess the importance of a conference submission, the gender of the applicant or author should be immaterial. If they are truly objective, scientists should focus only on the relevant criteria of applicant qualifications or research merit.
However, despite rigorous training in the objective evaluation of information and resultant values (2), people working and learning within the science, technology, engineering, and mathematics (STEM) community are still prone to the same subtle biases that subvert objectivity and distort accurate perceptions of scientific evidence by the general public (3, 4). We focus here on the robust gender biases documented repeatedly within the psychological literature (5⇓–7). Some within the STEM community have turned to these methods and ideas as an explanation for the consistent underrepresentation of women in STEM fields (8, 9) and the undervaluation of these women and their work. Specifically, many scientists have systematically documented and reported (including in PNAS) a gender bias against women—or favoring men—in STEM contexts (10⇓⇓⇓⇓⇓⇓–17), including hiring decisions for a laboratory-manager position (10) and selection for a mathematical task (11), evaluations of conference abstracts (12), research citations (13), symposia-speaker invitations (14), postdoctoral employment (15), and tenure decisions (16). For example, Moss-Racusin et al. (10) conducted an experiment in which university science professors received the same application for a laboratory-manager position, associated through random assignment with either a male or a female name. The results demonstrated that the science professors—regardless of their gender—evaluated the applicant more favorably when the application bore a man’s name rather than a woman’s name. These findings mirror past results in which men and women psychology faculty participants evaluated an application from a faculty candidate with a woman’s name less favorably than the identical application with a man’s name (17). As another example, Knobloch-Westerwick et al. (12) found that graduate students evaluate science-related conference abstracts more positively when attributed to a male rather than a female author, particularly in male-gender-typed science fields. These biases are frequently unintentional (18⇓–20), exhibited even by individuals who greatly value fairness and view themselves as objective (21). Indeed, gender biases often result from unconscious processes (22, 23) or manifest so subtly that they escape notice (24).
However unintentional or subtle, systematic gender bias favoring male scientists and their work could significantly hinder scientific progress and communication (12). In fact, the evidence for a gender bias in STEM suggests that our scientific community is not living up to its potential, because homogeneous workforces (including the academic workplace) can deplete the creativity, discovery, and satisfaction of workers, faculty, and students (25⇓–27). STEM fields are fairly homogeneously male; at 4-y US colleges, for example, an average of 71% of STEM faculty are men (28). For these reasons, there is a growing call for broadening the participation of women (and other underrepresented groups) in STEM fields. For instance, the National Science Foundation (NSF) promoted inclusiveness as a core value in its 2014–18 strategic plan, continues to fund ADVANCE Institutional Transformation grants to broaden the participation of women faculty in STEM, and has created a directorate charged with broadening the participation of all underrepresented people within STEM. Similarly, the National Institutes of Health called for reducing subtle biases and broadening participation in STEM fields (29) and issued at least three large new requests for proposals to help accomplish this goal (30). Indeed, there are growing numbers of research studies, calls to action, strategic plans, and even resources to systematically document, understand, and hopefully ameliorate gender biases within STEM to create a thriving, diverse, and equitable scientific community (31⇓⇓–34). However, are people generally (e.g., taxpayers, voters, government officials, etc.) and STEM practitioners in particular “buying” the mounting evidence of these gender biases within the STEM community? Currently, to our knowledge, there is no experimental research examining how receptive or biased various individuals within the STEM and public communities are to research demonstrating gender bias that undermines women’s participation within STEM. Thus, to address this question, our experimental research investigates potentially biased evaluations, among the general public and STEM practitioners, of evidence demonstrating gender biases against women/favoring men within STEM fields.
Of course, to ameliorate gender bias within STEM fields, it is not sufficient to simply herald findings demonstrating that STEM practitioners exhibit these biases. Indeed, there may well be another layer of bias such that men evaluate findings such as those reported by Moss-Racusin et al. (10) and Knobloch-Westerwick et al. (12) less favorably than women. In fact, a recent (nonexperimental) analysis of naturally occurring online comments written by readers of popular press articles covering the research of Moss-Racusin et al. (10) suggests that men were more likely than women to demonstrate negative reactions to experimental evidence of gender bias (35). Further, several lines of theorizing suggest that men may evaluate such research as less meritorious than would women (24, 36⇓⇓⇓⇓⇓–42). Among these theories, Social Identity Theory (36⇓–38) and related perspectives (39) posit that people are motivated to perceive their group favorably and defend that perception against threat, and that people within privileged groups often seek to retain and justify their privileged status (39). Men clearly hold an advantageous position within the sciences, because they represent the vast majority of STEM university faculty at all ranks, earn higher salaries controlling for rank and related factors (43), and on average receive more federal grant funding to support their research than their comparable women colleagues (44, 45). Indeed, growing evidence reveals an often invisible advantage for men, stemming in part from inequities against women in STEM, and research exposing those inequities can threaten that advantage (10, 12, 46, 47). That is, men might find the results reported by Moss-Racusin et al. (10) threatening, because remedying the gender bias in STEM fields could translate into favoring women over men, especially if one takes a zero-sum perspective (47). Therefore, relative to women, men may devalue such evidence in an unintentional, implicit effort (18⇓–20) to retain their status as the majority group in STEM fields. However, some men might perceive research that exposes gender bias in STEM as more threatening than other men do. According to Social Identity Theory, individuals perceive greater threat toward their group (and defend against it) when they are highly committed to that group (37, 38). Thus, men within STEM fields (e.g., physics professors) may feel more threatened by the research of Moss-Racusin et al. (10) than men within non-STEM fields (e.g., English professors), assuming the former are more committed to STEM fields and men’s status therein. Overall, then, men relative to women are likely to devalue research demonstrating bias against women in STEM, but this difference may be prominent among individuals within (and committed to) STEM fields, and weaker to nonexistent among individuals within non-STEM fields.
Beyond Social Identity Theory, other frameworks could predict a difference between men’s and women’s evaluations of research demonstrating bias against women in STEM, and, in fact, this difference might result from multiple factors. For instance, the predicted gender difference may also result from a confirmation bias, such that people favorably evaluate information that is consistent with their beliefs but unfavorably evaluate information that is inconsistent with their beliefs (48). A classic empirical example of confirmation bias showed that peer reviewers were less favorable toward an essentially identical research manuscript when it was doctored to report results inconsistent with the reviewers’ preferred theoretical viewpoint, but more favorable when it was doctored to report results consistent with that viewpoint (49). Add to this the compelling evidence that women faculty are more likely to view gender bias as a problem within their current academic work context (40), and it is possible that women may evaluate research demonstrating a gender bias (belief consistent) more favorably than men, but evaluate research demonstrating no gender bias (belief inconsistent) less favorably than men.
Current Research
We report three experiments designed to provide, to our knowledge, the first test for gender differences in the evaluation of scientific evidence demonstrating that individuals are biased against women within STEM contexts. In each experiment, men and women participants read, via an online survey instrument, an actual article abstract from a peer-reviewed scientific journal, accompanied by the date and title of the publication (see Materials and Methods for more details). Participants then evaluated their agreement with the authors’ interpretation of the results, the importance of the research, how well written the abstract was, and their overall favorability toward the abstract. These ratings were highly associated with one another and were averaged to create a measure of participants’ overall evaluation of the abstract (for further details, see SI Materials and Methods, Dependent Variables). Globally, we predicted that male relative to female participants would evaluate the abstract less favorably when the abstract reported a gender bias against women in STEM (hypothesis A; experiments 1–3), and that this difference would be more prominent among participants in STEM (vs. non-STEM) fields, to whom a gender bias in STEM is most germane (hypothesis B; experiment 2). Further, we predicted that this gender difference would manifest for abstracts that reported a gender bias in STEM, but would reverse for abstracts that reported no gender bias in STEM (hypothesis C; experiment 3).
All experiments included two or more factors (some included for exploratory purposes in experiments 1 and 2; see SI Materials and Methods for more details), and thus we tested all hypotheses using between-groups factorial analyses of variance. Further, we calculated Cohen’s d for each experiment to provide an index of the strength of the predicted difference between men and women participants and to account for the unequal sample sizes between the genders. By convention (50), effect sizes are interpreted as small (d = 0.20), medium (d = 0.50), or large (d = 0.80).
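For readers who wish to verify such calculations, the snippet below is a minimal sketch in Python (the authors do not specify their software) of Cohen’s d computed with a pooled standard deviation, the standard formulation when group sizes are unequal, as they are here.

```python
import numpy as np

def cohens_d(group1, group2):
    """Cohen's d using a pooled standard deviation, which
    accommodates unequal group sizes. Sign depends on argument
    order; the paper reports magnitudes."""
    g1 = np.asarray(group1, dtype=float)
    g2 = np.asarray(group2, dtype=float)
    n1, n2 = len(g1), len(g2)
    # Weight each group's variance by its degrees of freedom.
    pooled_var = ((n1 - 1) * g1.var(ddof=1) + (n2 - 1) * g2.var(ddof=1)) / (n1 + n2 - 2)
    return (g2.mean() - g1.mean()) / np.sqrt(pooled_var)

# Conventional benchmarks (Cohen, 1988): 0.20 small, 0.50 medium, 0.80 large.
```

The same quantity can be reconstructed from the reported means, standard deviations, and group sizes alone, which is useful for checking published effect sizes.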
The first two experiments tested for participant-gender differences in the evaluation of the actual abstract written by Moss-Racusin et al. (10). As discussed above, Moss-Racusin et al. (10) produced experimental evidence that STEM faculty of both genders demonstrate a significant bias against an identical applicant with a female vs. male name. Although this gender bias was empirically demonstrated with a national sample, we predicted that men would be less receptive to these (and related) findings, and women more receptive. Our first experiment involved a general sample of US adults (n = 205) recruited online through Amazon’s Mechanical Turk. Our second experiment involved a sample of professors (n = 205) from all STEM and non-STEM departments at a research-intensive university, allowing us to test whether the predicted gender difference in abstract evaluations is larger among individuals within STEM fields of study. A third experiment replicated the first two with a different abstract and is discussed in more detail below.
SI Materials and Methods
Participants and Recruitment for Experiments 1 and 3.
Participation was elicited from workers on Amazon’s Mechanical Turk online job site, who could view our employment opportunity (titled “What do REAL people think about science research results?”) listed alongside other opportunities.
A total of 205 individuals opted to participate in experiment 1, which was active in March 2014, and provided usable data. Originally, 218 individuals participated in the experiment, but 9 were excluded from data analysis because they failed one or more attention-check items (e.g., “If you are reading, respond ‘very much’ to this question”; “If you are reading, respond ‘not at all’ to this question”), 2 because they reported being under 18 y of age, and 2 because they did not specify their gender. Ultimately, 146 men and 59 women from the United States who were 18 y of age or older (M = 30.13; range = 18–66) were retained for analysis. Of this general sample, 68.12% reported their race as “white,” and 51 individuals reported they were currently college students.
A total of 303 individuals opted to participate in experiment 3, which was active in November 2014, and provided usable data. Originally, 321 individuals participated in the experiment, but 12 were excluded from data analysis because they failed one or more attention-check items, 2 because they reported being under 18 y of age or did not specify an age, 1 because they did not specify their gender, and 7 because they reported they had read the abstract before (some participants met multiple exclusion criteria). Ultimately, 162 men and 141 women from the United States who were 18 y of age or older (M = 34.22; range = 18–79) were retained for analysis. Of this general sample, 73.93% reported their race as “white,” and 55 individuals reported they were currently college students.
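As an illustration only, the following sketch shows how exclusion rules like those above might be applied programmatically; the file and column names are hypothetical, since the authors do not describe their data-processing pipeline.

```python
import pandas as pd

# Hypothetical file and column names; the paper does not describe its data layout.
df = pd.read_csv("experiment_raw.csv")

passed_checks = (
    (df["check_very_much"] == "very much")      # attention-check item 1
    & (df["check_not_at_all"] == "not at all")  # attention-check item 2
)
# Age >= 18, age specified, and gender specified.
eligible = (df["age"] >= 18) & df["age"].notna() & df["gender"].notna()

retained = df[passed_checks & eligible]
print(f"Retained {len(retained)} of {len(df)} participants")
```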
Participants and Recruitment for Experiment 2.
Participation was initially elicited on November 4, 2013, from all 506 tenure-track faculty at a research-intensive university via an email from their university provost encouraging participation in a larger faculty climate survey. That same day, our research team emailed all tenure-track faculty a message that explained the nature and importance of the survey, contained an informed consent form for faculty to read, explained the compensation faculty would receive for their participation, and contained a link to the survey and experiment, which was hosted on surveymonkey.com. This email included a unique identification code for each person, which preserved respondents’ anonymity and confidentiality but allowed us to trace the faculty’s home department. In this way, we could determine whether faculty resided in STEM departments, including the social and behavioral sciences (i.e., Agricultural Economics and Economics, Animal and Range Sciences, Cell Biology and Neuroscience, Center for Biofilm Engineering, Chemical and Biological Engineering, Chemistry, Civil Engineering, Computer Science, Earth Science, Ecology, Electrical Engineering, Industrial and Management Engineering, Institute on Ecosystems, Immunology and Infectious Diseases, Land Resources and Environmental Science, Mathematical Sciences, Mechanical and Industrial Engineering, Microbiology, Native American Studies, Physics, Plant Sciences, Political Science, Psychology, and Sociology and Anthropology), or in non-STEM departments (i.e., Agricultural Education, Art, Nursing, Education, English, Film and Photography, Health and Human Development, History and Philosophy, Honors Program, Liberal Studies, Modern Languages and Literature, Science Education, Music, and University Studies). All faculty who had not participated as of November 18 received a reminder email, which also contained a link to the survey and experiment and their unique identification code. The survey was closed on the evening of November 22.
Ultimately, 286 faculty participated in the unrelated survey, and 205 (40.5% of faculty) further elected to participate in our experiment at the end of the survey. Of these, 111 (54%) were men and 94 were women. Further, as specified above, 116 faculty were categorized as residing in STEM departments and 89 as residing in non-STEM departments. A comparable proportion of faculty from STEM (116/289 or 40.1%) and non-STEM (89/217 or 41.0%) departments completed the experiment. Participants indicated their race as white/Caucasian (86.3%), Asian (2%), Hispanic/Latino (1%), Native American (0.5%), or mixed (0.5%), or they opted not to report these data (9.8%). Further, participants reported their faculty rank as assistant (43.9%), associate (27.8%), or full (26.3%), or they did not specify (2%). Participants’ ages ranged from 27 to 73 y (M = 47.35), and they had worked in their current position between 0 and 35 y (M = 10.51). The demographics of our sample closely match the population of professors at this university (which is 64% male and 90.9% white/Caucasian), although assistant professors were somewhat overrepresented in our sample relative to the university population (assistant, associate, and full ranks comprise 29.3%, 32.1%, and 38.6% of professors, respectively). With the possible exception of rank, we can reasonably infer that there were no systematic biases influencing individuals’ decisions to participate in the experiment. That is, the results from this sample likely generalize to the population of faculty under investigation.
Procedure for Experiment 1.
For experiment 1, once participants clicked on the title for our experiment on Amazon’s Mechanical Turk, they encountered the following short paragraph: “In the scientific world, peer experts judge the quality of research and decide whether or not to publish it, fund it, or discard it. But what do everyday people think about these articles that get published? We are conducting an academic survey about people's opinions about different types of research that was published back in the last few years. You will be asked to read a very brief research summary and then answer a few questions about your judgments as non-experts about this research. There is no right or wrong answer and we realize you don’t have all the information or background. But just like in the scientific world, many judgments are made on whether something is quality science or not after just reading a short abstract summary. So to create that experience for you, we ask that you just provide your overall reaction as best you can even with the limited information. You will also be asked to provide demographic information about yourself. Select the link below to complete the survey.” Participants were also reminded that they would receive $0.25 in exchange for submitting the job “hit.” Participants then accepted the hit and opened up the survey in a separate tab or window. After consenting to participate, participants were given a summary of the experiment that they read before accepting the hit and then were asked, “Please read the following abstract from a 2012 published research study then provide your opinion with the items below.” Next, participants viewed the abstract written by Moss-Racusin et al. (10), the first author’s name and affiliation, and keywords, as described in the main text, and participants then provided their opinions about the abstract using scale ratings (SI Materials and Methods, Dependent Variables). Once they began the survey, participants learned that they could skip over any questions or task that they wished, ensuring that our procedures were not coercive. Participants then completed demographic information, were debriefed regarding the purpose of the experiment, and were compensated $0.25 for their time.
Procedure for Experiment 2.
For experiment 2, once participants followed the link to the survey website, they first read information about the faculty climate survey and the types of tasks and questions they would encounter. Participants were also reminded that they would receive a $5 coupon from a local coffee shop for completing the survey and would be entered into a raffle to win 1 of 50 gift certificates from the campus bookstore (worth $50). Once they advanced to the survey, participants further learned that they could skip over any question or task they wished. This option resulted in several participants providing only partial data for the experiment (addressed in SI Additional Analyses, Experiment 2). The faculty climate survey took ∼15 min to complete and primarily contained questions about the university work environment, which were independent from the reported experiment.
Just after the survey, participants were asked to “Please read the following abstract from a 2012 published research study then provide your opinion with the items below.” They then viewed the same abstract and associated information as in experiment 1 and evaluated that abstract using the same scale ratings. Finally, participants entered their unique code and could print off a coupon in compensation for their participation.
Procedure for Experiment 3.
The procedures for experiment 3 were identical to experiment 1, with a few minor, but important, differences. First, participants were randomly assigned to read either the original version of the Knobloch-Westerwick et al. (12) abstract, which reported a gender bias (e.g., “Publications from male authors were associated with greater scientific quality, in particular if the topic was male-typed”) or a version slightly altered to report no gender differences (e.g., “Publications from male and female authors were associated with comparable scientific quality, even if the topic was male-typed”). Second, unlike in experiments 1 and 2, the abstract was not accompanied by the author’s name or affiliation. Otherwise, the procedures and dependent measures for this experiment were identical to those used in the previous experiments. At the end of the experiment, participants completed demographic information, were debriefed regarding the purpose of the experiment, and were compensated $0.25 for their time.
Dependent Variables.
After reading the abstract, participants in all experiments reported their evaluation of the abstract and research using measures adapted from those commonly used to gauge attitude change and evaluations of persuasive materials (59, 60). Specifically, on scales from 1 (not at all) to 6 (very much), participants responded to the following four questions or statements: “To what extent do you agree with the interpretation of the research results?” “To what extent are the findings of this research important?” “To what extent was the abstract well written?” and “Overall, my evaluation of this abstract is favorable.” These four responses demonstrated high internal consistency in all experiments (Cronbach’s α = 0.84, 0.89, and 0.78 in experiments 1, 2, and 3, respectively) and were therefore averaged to measure participants’ perceived quality of the research.
For experiment 2 only, participants completed a faculty climate survey before the experiment, which included items assessing the extent to which faculty felt that they had been personally discriminated against due to their gender. Specifically, on scales from 1 (strongly disagree) to 7 (strongly agree), participants responded to the following three statements: “I have personally been a victim of gender discrimination,” “I consider myself a person who has been deprived of opportunities because of my gender,” and “Prejudice against my gender group has not affected me personally” (the latter of which was reverse-scored). These three responses demonstrated high internal consistency (Cronbach’s α = 0.87) and were therefore averaged to measure participants’ personal experience of gender discrimination.
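As an illustration of the scale construction described above, here is a minimal Python sketch (hypothetical; the authors do not report their analysis software) of computing Cronbach’s alpha and the averaged composite, including reverse-scoring for a 1–7 item.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha. items: 2D array, rows = participants,
    columns = scale items."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    sum_item_var = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - sum_item_var / total_var)

# Reverse-score a 1-7 item (e.g., "Prejudice ... has not affected me
# personally"): reversed = 8 - raw. The 1-6 evaluation items need no reversal.
ratings = np.array([[5, 6, 5, 6], [3, 3, 4, 3], [4, 5, 5, 4]])  # toy data
if cronbach_alpha(ratings) > 0.7:        # a common acceptability threshold
    composite = ratings.mean(axis=1)     # averaged overall evaluation
```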
Results
Experiments 1 and 2.
Results from our experiment 1 supported hypothesis A, revealing a main effect of participant gender [F(1, 197) = 9.85, P = 0.002, partial η² = 0.048], such that men (M = 4.25, SD = 0.91, n = 146) evaluated the research less favorably than women (M = 4.66, SD = 0.93, n = 59) in a general sample. Further, this effect was of moderate size (d = 0.45).
Results from our experiment 2 also supported hypothesis A, revealing a main effect of participant gender [F(1, 174) = 6.08, P = 0.015, partial η² = 0.034], such that male faculty evaluated the research less favorably (M = 4.21, SD = 1.05) than female faculty (M = 4.65, SD = 1.19; d = 0.397 [similar to experiment 1]). Thus, overall, experiments 1 and 2 provide converging evidence from multiple participant populations that men are less receptive than women—and by the same token, that women are more receptive than men—to experimental evidence of gender bias in STEM. Importantly, results from experiment 2 further reveal that this effect was qualified by a significant interaction between participant gender and field of study [F(1, 174) = 5.19, P = 0.024, partial η² = 0.03]. This interaction supported hypothesis B, because simple-effect tests confirmed that male faculty evaluated the research less favorably (M = 4.02, SD = 0.988, n = 66) than female faculty (M = 4.80, SD = 1.14, n = 38) in STEM fields [F(1, 174) = 11.94, P < 0.001], whereas male (M = 4.55, SD = 1.09, n = 37) and female (M = 4.54, SD = 1.23, n = 49) faculty reported comparable evaluations in non-STEM fields (F < 1). Further, the effect size for the observed gender difference was large within STEM departments (d = 0.74). Looking at this interaction another way, simple-effect tests demonstrated that men evaluated the research more negatively if they were in STEM than non-STEM departments [F(1, 174) = 4.19, P = 0.042], whereas the opposite trend was not statistically significant among female faculty [F(1, 174) = 1.45, P = 0.23]. Thus, it seems that men in STEM displayed harsher judgments of Moss-Racusin et al.’s (10) research, not that women in STEM exhibited more positive evaluations of it. The analysis revealed one other significant interaction that did not involve faculty gender (for further details, see SI Additional Analyses, Experiment 2). No other main effects or interactions reached significance (all other F < 2.07; P > 0.15). Finally, additional measures collected within a faculty survey (SI Materials and Methods, Dependent Variables) and analyses thereof provide suggestive evidence for a threat mechanism behind the effects (for the analyses and discussion, see SI Additional Analyses, Experiment 2).
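To make the analysis concrete, the following is a minimal sketch, assuming a long-format data file with hypothetical column names, of the gender × field factorial ANOVA and a follow-up simple-effects test. Note that the paper tests simple effects against the full-model error term (df = 174), whereas subsetting, as below, uses the subset’s own error term and is only an approximation.

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# Hypothetical long-format data: one row per faculty participant,
# with columns "evaluation" (1-6 composite), "gender", and "field".
df = pd.read_csv("experiment2.csv")

# 2 (gender) x 2 (STEM vs. non-STEM) between-groups factorial ANOVA.
model = ols("evaluation ~ C(gender) * C(field)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))

# Approximate simple effect of gender within STEM (the paper instead uses
# the full-model error term, so the F values will differ slightly).
stem = df[df["field"] == "STEM"]
print(sm.stats.anova_lm(ols("evaluation ~ C(gender)", data=stem).fit(), typ=2))
```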
Experiment 3.
We predicted that, compared with women, men would be prone to more negative evaluations of research that demonstrates a gender bias against women (and favors men) in STEM, not just the specific research reported by Moss-Racusin et al. (10). Further, we predicted that, compared with men, women would be prone to more negative evaluations of research that demonstrates no gender bias against women in STEM. Thus, the gender effect seen in experiments 1 and 2 should replicate for a different abstract that also reports a gender bias, but reverse for an abstract that demonstrates no gender bias. Testing these predictions, we randomly assigned new participants to read either the original abstract published by Knobloch-Westerwick et al. (12), which reported a gender bias against women’s (relative to men’s) scientific conference submissions, or a version slightly altered to report no gender bias. These participants were recruited online through Amazon’s Mechanical Turk (n = 303). Results indicated only a significant interaction between participant gender and abstract version [F(1, 299) = 4.00, P = 0.046, partial η² = 0.013] (all other F < 1). Although no simple-effect tests were significant (all F < 2.69, P > 0.10), together, these results support the overall pattern predicted by hypothesis C, such that men evaluated the original (gender-bias exists) abstract less favorably (M = 3.65, SD = 1.03, n = 78) than did women (M = 3.86, SD = 1.05, n = 74; d = 0.20), whereas men evaluated the modified (no gender-bias exists) abstract more favorably (M = 3.83, SD = 0.92, n = 84) than did women (M = 3.59, SD = 0.86, n = 67; d = 0.27).
SI Additional Analyses
Experiment 1.
For the primary measure, author gender and affiliation alone did not influence evaluations, and neither did any two-way interactions among factors (all P > 0.3). However, the analysis revealed a nonpredicted and significant interaction among participant gender, author gender, and author affiliation [F(1, 197) = 18.13; P < 0.001]. Consistent with the theme of this work, we describe this interaction in terms of gender differences at each combination of author gender and affiliation. When the abstract author was supposedly a man from Iowa State University, male participants rated the abstract as being of higher quality (M = 4.57, SD = 0.787) than did women (M = 4.26, SD = 0.893), whereas when the abstract author was supposedly a woman from Iowa State University, female participants rated the abstract as being of higher quality (M = 5.03, SD = 0.713) than did men (M = 3.89, SD = 1.13). Thus, when the author was supposedly affiliated with Iowa State University, all participants seemed to demonstrate a gender bias in favor of their own gender; women gave higher ratings to a female author, and men gave higher ratings to a male author. However, when the abstract author was supposedly a man from Yale University, female participants instead rated the abstract as being of higher quality (M = 5.02, SD = 0.874) than did men (M = 4.13, SD = 0.897), whereas when the abstract author was supposedly a woman from Yale University, female participants reported ratings of the abstract (M = 4.38, SD = 1.031) that were equivalent to those of men (M = 4.38, SD = 0.697). Interestingly, when evaluating research from Yale that reveals gender bias, women demonstrated the greatest bias against women authors (or in favor of men authors).
There are at least two important notes regarding this interaction between participant gender, author gender, and author affiliation. First, this interaction was not observed in the second experiment among university faculty. Thus, although this interaction is certainly interesting, we refrain from focusing too much on this result until it is replicated in future research; it was not predicted, was not replicated, and may be spurious. Second, if this interaction pattern does replicate in future research, this finding may indicate that the lay public and scientific community manifest bias toward research uncovering gender bias differently under different conditions. Within scientific communities, perhaps the gender bias against such research is unaffected by author gender or affiliation. However, in the lay public, the gender bias is more complex and context-dependent. Ultimately, it is important to understand failures in objectivity among the scientific community, as well as the public, regarding research demonstrating gender bias in STEM. After all, it is often the nonscientists (the public, government officials, bureaucrats, nonprofit organizations, special-interest groups, etc.) who drive the funding opportunities so critical to scientific progress and discovery.
Experiment 2.
In addition to the predicted effects reported in the paper, the primary analysis also revealed a significant interaction among field of study, author gender, and author affiliation [F(1, 174) = 8.07; P < 0.01]. The interaction pattern indicated that faculty in STEM evaluated the abstract written by a man more favorably if the author was from Yale (vs. Iowa State), but the abstract written by a woman more favorably if the author was from Iowa State (vs. Yale), whereas the opposite pattern manifested among non-STEM faculty.
Additionally, we conducted the analysis again, entirely removing fields of study associated with the social and behavioral sciences (i.e., Agricultural Economics and Economics, Native American Studies, Political Science, Psychology, and Sociology and Anthropology). Given that the classification of some of these fields as STEM might vary depending on whom one consults, we wanted to confirm that the key results held when comparing STEM to non-STEM fields even with the social and behavioral sciences excluded. Indeed, this analysis, too, revealed the predicted significant main effect of gender [F(1, 156) = 8.30, P = 0.005] and the predicted significant interaction between gender and field of study [F(1, 156) = 7.31, P = 0.008].
Further, given the somewhat disproportionate representation of assistant professors in our sample, we investigated whether our results held when accounting for faculty rank. To do this analysis, we collapsed across the author’s gender and affiliation (including all factors would have created several conditions with only one participant’s response) and conducted an analysis with faculty gender, field of study, and faculty rank as factors (four participants did not report their rank and were therefore not included in this analysis). Like the primary analysis, this analysis revealed a significant main effect of gender [F(1, 174) = 6.04; P = 0.015] and a significant interaction between gender and field of study [F(1, 174) = 5.27; P = 0.023]. Therefore, the original results hold while controlling for faculty rank. No other main effects or interactions reached significance (all other F < 2.43; P > 0.09).
Of note, several participants in experiment 2 elected to skip some of our four measures. Of the full 205 participants, 190 completed all four measures—which were averaged for the primary analyses. Thus, we examined how well our predicted findings held when examining each measure independently. Critically, there was a significant main effect of participant gender for three of the four measures. Relative to female faculty, male faculty agreed less with the interpretations of the research [n = 199, F(1, 183) = 6.66, P = 0.011], evaluated the research findings as less important [n = 202, F(1, 186) = 7.00, P = 0.009], evaluated the abstract as less well written [n = 196, F(1, 181) = 4.67, P = 0.032], and evaluated the abstract less favorably overall, although this last effect was only marginal [n = 201, F(1, 185) = 3.45, P = 0.065].
Additionally, the pattern of means for the interaction between participant gender and their STEM status for each of these measures was identical to that observed for the primary analysis. However, the omnibus test of this interaction was significant for participants’ ratings of how important they found the research findings [F(1, 186) = 8.31, P = 0.004], how well written they found the abstract [F(1, 181) = 4.22, P = 0.041], and their overall favorability toward the abstract [F(1, 185) = 9.80, P = 0.002], but not for their assessment of how much they agreed with the interpretations of the research [F(1, 183) = 1.55, P = 0.21]. Nonetheless, as in the primary analysis, simple-effect tests for all measures revealed that male faculty reported less favorable evaluations than female faculty in STEM departments (all F > 7.91 and < 17.14; all P < 0.005), but comparable evaluations within non-STEM departments (all F < 1). Overall, then, the critical findings for the primary measure hold well when looking at each individual measure.
Finally, although we did not design experiment 2 to specifically investigate potential mechanisms behind these effects, especially regarding the interaction, some data within a faculty survey (completed just before our experiment) allowed us to explore the possibility that these effects were related to perceptions of threat. Specifically, faculty rated the extent to which they felt they had been personally discriminated against due to their gender (SI Materials and Methods, Dependent Variables). We reasoned that the greater men’s experience of gender discrimination (the more they feel women have had an unjust advantage at men’s expense), the more threatening they should find research demonstrating an actual bias against women in STEM. After all, men who have experienced gender discrimination may harbor concern that such research could promote future “reverse” discrimination against men in STEM. Further, assuming men in STEM are more committed to (or identify with) STEM than men in non-STEM fields, Social Identity Theory (36, 37) predicts that the experience of threat should predominantly manifest among men in STEM. Indeed, there was a negative correlation between the personal experience of gender discrimination and evaluations of the abstract only among men in STEM. The more male faculty in STEM felt they had experienced gender discrimination, the less favorably they evaluated the abstract [r(63) = −0.404; P = 0.001]. This same correlation among non-STEM men was positive but nonsignificant [r(34) = 0.157; P = 0.367]. Among women, results yielded a significant correlation within non-STEM fields [r(48) = 0.35; P = 0.014], but no significant correlation within STEM fields [r(36) = 0.262; P = 0.118]. For women, however, these correlations would not indicate anything about threat, because the results of Moss-Racusin et al. (10) affirm, rather than challenge, women’s experience of gender discrimination.
Together, these two correlations among men in STEM and non-STEM are consistent with Social Identity Theory and our assumption that men in STEM identify more with STEM than do non-STEM men and likely perceived the abstract as more threatening. However, the gender-discrimination measure did not mediate the effects found for the abstract evaluation. To test for possible effects, we subjected the gender-discrimination measure to an analysis of variance with gender and field of study as factors (participants completed this measure before reading the abstract, making the factors associated with the abstract inconsequential). Importantly, this analysis revealed a significant main effect of gender such that women experienced greater gender discrimination than men [F(1, 194) = 16.87; P < 0.001], indicating that the construct was valid. However, this analysis revealed no interaction [F(1, 194) = 1.77; P > 0.18], meaning this construct did not mediate our primary results. This finding is not necessarily surprising, however, given that the gender-discrimination measure was not designed to directly measure the extent to which participants find the results of Moss-Racusin et al. (10) to be threatening. Overall, then, the correlation evidence is only suggestive, and we encourage future research to explore this and other possible mechanisms behind our effect.
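The subgroup correlations reported above can be computed as follows; this is an illustrative sketch with hypothetical variable names and toy data, not the authors’ code.

```python
from scipy import stats

def report_correlation(discrimination, evaluation, label):
    """Pearson correlation within one subgroup (e.g., men in STEM);
    the df reported alongside r is n - 2."""
    r, p = stats.pearsonr(discrimination, evaluation)
    print(f"{label}: r({len(evaluation) - 2}) = {r:.3f}, P = {p:.3f}")
    return r, p

# Usage (toy data): perceived-discrimination composite (1-7) vs.
# abstract-evaluation composite (1-6) for one subgroup.
report_correlation([2, 5, 3, 6, 4, 1], [4.5, 3.0, 4.2, 2.8, 3.9, 5.1], "men in STEM")
```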
Discussion
There is now copious evidence that women are disadvantaged in STEM fields (51⇓–53) and that this disadvantage may relate to gender stereotypes (11) and consequent biases against women (or favoring men) traversing the STEM pipeline (10⇓⇓⇓⇓⇓⇓–17). Of course, people should not passively accept such evidence, even if it appears in preeminent peer-reviewed journals (e.g., Science, PNAS, or Nature) whose imprimatur suggests the quality of the research was sound. Ideally, especially within the STEM community, people should evaluate as objectively as possible the research producing such evidence, the resulting quality of the evidence, and the interpretation of that evidence.
However, the evidence from our three straightforward experiments indicates that men evaluate research that demonstrates bias against women in STEM less favorably than do women—or, that women evaluate it more favorably. Specifically, male relative to female participants (including university faculty) in experiments 1 and 2 assessed the quality of the research by Moss-Racusin et al. (10)—as presented simply through its actual abstract—as being lower. In addition, and perhaps of greatest concern, this gender difference, with a large accompanying effect size (50), emerged among faculty working within STEM fields and was nonexistent among faculty from non-STEM fields (experiment 2). Further, the overall gender difference observed in the first two experiments was replicated among participants in experiment 3 who read the true abstract of Knobloch-Westerwick et al. (12), which also reported a gender bias in STEM. However, this gender difference was reversed among participants who read an altered version purporting no gender bias in STEM.
The results from this third experiment are important for at least three reasons. First, they indicate that men relative to women do not uniquely disfavor the research of Moss-Racusin et al. (10), but rather disfavor research that reports a gender bias hindering women in STEM more generally. Second, these results suggest that men do not generally evaluate research more harshly than women, as it might seem from the first two experiments (but see the results from non-STEM faculty in experiment 2). Rather, relative to women, men actually favor research suggesting there is no gender bias in STEM. Finally, the results indicate that individuals are likely to demonstrate a gender bias toward research pertaining to the mere topic of gender bias in STEM; men seem to disfavor (and women favor) research demonstrating a gender bias, but women seem to disfavor (and men favor) research demonstrating no gender bias. Of course, given that we cannot have a gender-free control condition, it is important to note that these biases are relative to the other gender; we cannot conclude that one gender is more biased than the other, just that individuals’ judgments of research regarding gender bias in STEM are biased by their gender.
Critically, across three experiments, we uncovered a gender difference in the way people from the general public and STEM faculty evaluate the quality of research that demonstrates women’s documented disadvantage in STEM fields: Men think the research is of lower quality, whereas women think the research is of higher quality. Why does this gender difference matter? For one, there are significant implications for the dissemination and impact of meritorious previous, current, and future research on gender bias in STEM fields. Foremost, our research suggests that men will relatively disfavor—and women will relatively favor—research demonstrating this bias. Given that men dominate STEM fields throughout industry and academia, scholars whose program of research focuses on demonstrating gender bias in STEM settings might experience undue challenges for publication, have lower chances of publication in top-tier outlets, experience greater challenges in receiving tenure, and overall have lower-than-warranted impact on the thinking, research, and practice of those in STEM fields. Such possibilities are highly problematic and call for additional research evaluating biased reactions to scientific evidence demonstrating gender and/or racial biases within STEM.
Second, because men represent the majority of individuals in STEM fields and yet are less likely than women to acknowledge biases against women in STEM, it may be challenging to fully embrace the numerous calls to broaden the participation of women and minorities in STEM. How can we successfully broaden the participation of women in STEM when the very research underscoring the need for this initiative is less valued by the majority group who dominate and maintain the culture of STEM? Intensifying the challenge, men hold an advantageous position in STEM fields and may feel threatened by research and efforts to “level the playing field” for women. Similarly, people often unintentionally exhibit in-group favoritism (54), wherein individuals engage in behaviors and allocate resources in ways that benefit members of their group (e.g., men unintentionally conferring advantage to other men).
Fortunately, there are current efforts in place to meet these challenges. For example, “Project Implicit” (https://implicit.harvard.edu/implicit/) provides workshops and talks to reveal the subtlety and implicitness of gender bias and considers how to foster a broader recognition of these biases and address them. Further, NSF funds ADVANCE-Institutional Transformation grants to specifically facilitate the increased participation of women in STEM and help transform academic cultures to foster equality and inclusivity. Shields et al. (55) created a “WAGES” game and accompanying discussion platform that effectively highlights male privilege and advantage among STEM faculty and helps reduce reactance to acknowledging this advantage (56). Finally, Moss-Racusin et al. have developed an evidence-based framework for creating, evaluating, and implementing diversity interventions designed to increase awareness of and reduce bias across STEM fields (31). Initial evidence reveals promising results for interventions adhering to these guidelines (31). These efforts, along with others that can help individuals actually acknowledge evidence demonstrating gender bias in STEM, are critical in bringing about change and increasing the participation of women in STEM.
Limitations and Future Directions
As with any research, ours has limitations. First, we did not directly test the potential mechanisms behind the reported gender effect. However, even before we understand exactly why men are less favorable than women toward research demonstrating a gender bias in STEM, we suggest it is important for the STEM community to know that this phenomenon exists. Notably, we uncovered evidence in experiment 2 suggesting that men in STEM found the abstract of Moss-Racusin et al. (10) threatening (SI Additional Analyses, Experiment 2), which may be one possible explanation for the results (37). In the future, researchers could test this possibility by including a direct measure of how threatening people find the implications of various research results and multiple measures of social identity. It is also worth investigating in future research whether confirmation bias (48, 49) contributes to the reported gender effect by measuring people’s beliefs about gender bias in STEM before they read research demonstrating that bias. We hope our findings will spark future research thoroughly investigating the mechanisms underlying this effect. Second, we investigated individuals’ evaluations of two abstracts reporting gender bias in STEM, specifically within the contexts of evaluating a laboratory-manager application and conference abstracts. It is worthwhile to investigate whether this bias also generalizes to evaluations of research that demonstrates gender bias in other STEM contexts, such as disparities in funding, publication rates, faculty and postdoctoral applicants, talk invitations, tenure decisions, and so forth. Theoretically, however, there is reason to predict that gender biases toward such research would replicate our current findings. In fact, because these contexts suggest a bias against (or in favor of) one’s direct peers and colleagues, it seems likely that gender-biased evaluations of this research would be even more prominent. For instance, STEM faculty might find threatening the possibility that they are biased regarding the quality of research from their female colleagues and prefer (likely implicitly) to find fault with the research rather than face that possibility.
Third, we investigated individuals’ assessment of research quality after they read only an abstract. We chose an abstract as a reasonable basis for assessment because abstracts present key methods and findings, are indexed and available for free, and are often what people read to determine whether or not they will read the full article. Nonetheless, it is conceivable that the gender bias we uncovered is a short-lived reaction. Perhaps the bias would shrink or disappear after reading the full article or a longer synopsis of the research. However, there is ample reason to predict that the bias will actually strengthen as people receive greater amounts of information, because they will (unintentionally) process that information based on initial impressions and per their motivation to arrive at a particular conclusion (42, 48, 49). In any case, we encourage future research into this issue.
As a final point on limitations, our experiments took place on an Internet platform, either at the end of a faculty survey that offered US$5 or as a short 10-min experiment paying $0.25. Thus, it is possible that our participants were not highly motivated to think about the abstract and simply based their quality assessments on “gut reactions” resulting in part from unconscious biases. Perhaps our findings would not hold among highly motivated participants whose assessments might have actual bearing on the publication of the research described in the abstract (e.g., peer reviewers). This possibility certainly warrants future exploration. However, we note that greater motivation does not always result in greater objectivity. In fact, biases can influence people’s judgments even more when people are motivated to be accurate, particularly if they do not notice that their thought process is biased (21, 42).
Further research might also explore why our first two experiments did not replicate previous research demonstrating an overall bias favoring the research of men over that of women in STEM (SI Additional Analyses). In particular, Knobloch-Westerwick et al. (12) found that graduate students evaluate science-related conference abstracts more positively when attributed to a male (relative to female) author, particularly in male-gender-typed fields. However, we did not find that participants in experiments 1 and 2 favored the abstract written by Moss-Racusin et al. (10) more if they thought it was written by a man vs. a woman. It is possible that participants in our first two experiments found the topic of gender bias within STEM “feminine,” or perhaps only somewhat “scientific,” thus decreasing the bias toward the author’s gender. Future research might reveal that participants’ perception of gender-bias research plays an important role in producing biases against women—and favoring men—who conduct such research.
Conclusion
Failures in objectivity are problematic to specific research projects, science generally, and receptivity to discovery. However, objectivity is threatened by a multitude of cognitive biases, including gender bias in STEM fields. Numerous experimental findings confirm the existence of this bias, and the research we present here peels back yet another level of bias: Men evaluate the research that confirms gender bias within STEM contexts as less meritorious than do women. We hope that our findings help inform and fuel self-correction efforts within STEM to reduce this bias, bolster objectivity, and diversify STEM workforces. After all, the success of these efforts can translate into greater STEM discovery, education, and achievement (57).
Materials and Methods
Participants.
In experiments 1 and 3, participation was solicited from workers on Amazon’s Mechanical Turk online job site, who could view our employment opportunity listed alongside other opportunities. In experiment 1, a total of 205 individuals (146 men and 59 women) from the United States who were 18 y of age or older (M = 30.13; range = 18–66) opted to participate in the experiment and provided usable data (for more details, see SI Materials and Methods, Participants and Recruitment for Experiments 1 and 3). In experiment 3, a total of 303 individuals (162 men and 141 women) from the United States who were 18 y of age or older (M = 34.22; range = 18–79) opted to participate in the experiment and provided usable data. All participants engaged in the ∼10-min experiment in exchange for $0.25.
In experiment 2, participation was first solicited from all tenure-track faculty at a research-intensive American university via an email from their university provost encouraging participation in a larger baseline faculty climate survey. The survey and experiment were conducted on an Internet platform, during which time 506 tenure-track faculty from this university received the email invitation to participate. A total of 268 of these faculty participated in the survey, and 205 of these faculty further elected to participate in our experiment at the end of the survey. The resulting sample included faculty from all departments at the university, from STEM departments (n = 116) and non-STEM departments (n = 89; for more details, see SI Materials and Methods, Participants and Recruitment for Experiment 2). All participants received a $5 coupon for a local coffee shop and, if they elected, were entered into a raffle for 1 of 50 possible US$50 gift certificates for the campus bookstore.
Procedure.
All procedures were approved by the Montana State University institutional review board. The three experiments were nearly identical, although experiments 1 and 3 stood alone, whereas experiment 2 followed a faculty climate survey. All participants completed the experiment using a personal or work computer and received experiment materials, provided informed consent, and provided responses through surveymonkey.com.
Participants in experiments 1 and 2 were first instructed to read the actual abstract from the Moss-Racusin et al. (10) paper, which was provided in full on a single screen. The abstract was accompanied by that paper’s actual title, publication date, volume and issue number, first author’s full name, keywords, and a fictitious DOI. Further, participants were randomly assigned to receive a version of the abstract that identified the first author’s first name as either “Karen” or “Brian,” which previous research indicates are equally likable and common names in the United States (58). Independent of this manipulation, participants received a version of the abstract that identified the author as affiliated with either Yale University (Moss-Racusin’s true affiliation at the time of the publication) or Iowa State University. After reading the abstract and affiliated information, participants were asked to provide ratings on several scales (adapted from scales commonly used to gauge attitude change and evaluations of persuasive materials) assessing the quality of the abstract and the research provided therein (for details, see SI Materials and Methods, Dependent Variables). Participants also provided demographic information, including their gender. Participants’ responses were anonymous, but in experiment 2 their status as a STEM or non-STEM faculty member was identifiable using specialized codes. Overall, the research design allowed us to analyze participants’ quality assessments of the Moss-Racusin et al. (10) research as a function of participant gender, author gender, author affiliation, and participants’ STEM affiliation (experiment 2 only).
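Purely as an illustration of the 2 × 2 assignment just described, the sketch below randomizes both between-subjects factors; the actual randomization was handled by the survey platform, and this code is hypothetical.

```python
import random

AUTHOR_NAMES = ["Karen", "Brian"]                            # author-gender factor
AFFILIATIONS = ["Yale University", "Iowa State University"]  # affiliation factor

def assign_condition():
    """Independently randomize each between-subjects factor for one participant."""
    return {
        "author_name": random.choice(AUTHOR_NAMES),
        "affiliation": random.choice(AFFILIATIONS),
    }

print(assign_condition())  # e.g., {'author_name': 'Karen', 'affiliation': 'Yale University'}
```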
Participants in experiment 3 completed a similar procedure, with some key differences. First, participants were randomly assigned to read either the original version of the abstract by Knobloch-Westerwick et al. (12), which reported a gender bias, or a version slightly altered to report no gender differences. Second, the abstract was not accompanied by the author’s name or affiliation (unlike in experiments 1 and 2). Otherwise, the procedures and dependent measures for this experiment were identical to those used in the previous experiments. This research design allowed us to analyze participants’ quality assessments of the research by Knobloch-Westerwick et al. (12) as a function of participant gender and abstract version (reporting gender bias or no gender bias).
Acknowledgments
We thank the social science research team (especially Rebecca Belou) and project management staff of ADVANCE Project TRACS (Transformation through Relatedness, Autonomy, and Competence Support) for their efforts. This work was supported in part by National Science Foundation Grant 1208831 (to J.L.S.).
Footnotes
- 1To whom correspondence should be addressed. Email: ihandley@montana.edu.
Author contributions: I.M.H., E.R.B., C.A.M.-R., and J.L.S. designed research; E.R.B. and J.L.S. performed research; I.M.H., E.R.B., and J.L.S. analyzed data; and I.M.H., E.R.B., C.A.M.-R., and J.L.S. wrote the paper.
The authors declare no conflict of interest.
This article is a PNAS Direct Submission.
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1510649112/-/DCSupplemental.
Freely available online through the PNAS open access option.
References
- Institute of Medicine, National Academy of Sciences, and National Academy of Engineering
- Gilovich T, Griffin D, Kahneman D
- Ceci SJ, Williams WM
- National Science Board
- Snyder TD, Dillow SA, Hoffman CM
- Moss-Racusin CA, Dovidio JF, Brescoll VL, Graham MJ, Handelsman J
- Reuben E, Sapienza P, Zingales L
- Knobloch-Westerwick S, Glynn CJ, Huge M
- Sheltzer JM, Smith JC
- Jaschik S
- Baron AS, Banaji MR
- Apfelbaum EP, Phillips KW, Richeson JA
- Page SE
- Freeman RB, Huang W
- National Science Board
- Tabak LA, Collins FS
- National Institutes of Health
- Moss-Racusin CA, et al.
- US Department of Education
- Obama BH
- President’s Council of Advisors on Science and Technology
- Tajfel H, Turner JC (in Worchel S, Austin WG, eds)
- Ecklund EH, Lincoln AE, Tansey C
- Festinger L
- Curtis JW
- McIntosh P
- Norton MI, Sommers SR
- Cohen J
- American Association of University Women
- Ginther DK, Kahn S
- National Science Foundation
- Hong L, Page SE