Student ratings of women and men in the classroom: a match-up hypothesis perspective.
Abstract: The purpose of this study was to examine the influence of teacher sex and student sex on perceptions of fit and teacher effectiveness. Students (43 women, 55 men) enrolled at a large public university in the Southwest United States participated in an experimental study in which they viewed a women's health lecture online and then responded to a questionnaire. Results from the observed path analysis indicate that while the teacher's sex did not directly influence perceptions of fit, there was a significant student sex x teacher sex interaction. Among women who viewed the lecture, fit perceptions were higher for the female teacher over the male teacher. Men who viewed the lecture did not vary in their ratings. Finally, the perceived fit of the teacher was reliably related to overall evaluation of instruction. Contributions to theory and practice are discussed.
Pub Date: 06/22/2009
Though strides have been made, women continue to face various forms of workplace discrimination. Earnings represent one such area. According to US Census Bureau estimates from 2004, the median income for men over age 15 was $40,798 while the median income for women was $31,223. The earnings differences are not uniform, however, as gender differences in salary grow increasingly disparate the higher one progresses in the workforce (e.g., from being a fitness employee to holding a management position) (see also Cunningham, 2007). These findings are not limited to the US, but have been observed in Australia (Eastough & Miller, 2004) and Great Britain (Ward, 2001) as well. Women are also likely to be under-represented in top management positions due to discriminatory practices (Stroh, Langlands, & Simpson, 2004). A study by Catalyst (2000) found that women represented only 12.5% of all corporate officers and 11.7% of all board directors, and that only two Fortune 500 chief executive officers were women. Thus, despite the gains women have made in shattering the glass ceiling (Stroh et al., 2004), barriers still exist.

Gender discrimination in the workplace is also observed on college and university campuses. For instance, women are over-represented as adjunct faculty members and are under-represented as tenured or tenure-track professors at the assistant, associate, or full professor rank (Nettles, Perna, & Bradburn, 2000). Wage disparities also exist among men and women in the professoriate. According to the Chronicle of Higher Education (www.chronicle.com/jobs), there was a 12.1% gap in the average earnings of men ($106,195) and women ($93,349) at the rank of professor in 2007, and these disparities are present at every level of instruction.

One possible reason for the gender differences in rank and salary seen among college professors is the difference in teaching evaluations men and women receive (Anderson & Miller, 1997; Basow, 1995; see also Sosa & Sagas, 2008). That is, student evaluations of instructors' teaching play an important role in merit allocations and promotion decisions (Seldin, 1999); thus, gender differences in student evaluations would likely account for the aforementioned rank and salary disparities. Interestingly, the evidence of such differences is somewhat mixed. Early research suggested that any gender differences were negligible (Feldman, 1992, 1993). Others have criticized these studies, citing the reliance on main effects and a failure to recognize the socially constructed ways in which men and women are viewed as professors (Abel & Meltzer, 2007; Basow, 1995; Sprague & Massoni, 2005). These more contemporary studies have demonstrated that women face stereotypes in the classroom about who they should be and the characteristics they should exude (i.e., caring and nurturing; see Sprague & Massoni, 2005). Further, student characteristics, such as their gender (Basow, 1995) and attitudes toward women (Abel & Meltzer, 2007), shape how they perceive men and women in the classroom (Basow, Phelan, & Capotosto, 2006; Frieze, Ferligoj, Kogovsek, Rener, Horvat, & Sarlija, 2003). Thus, women are held to different standards than are men, and this oftentimes leads to them being penalized in their teaching evaluations (Abel & Meltzer, 2007; Basow, 1995; Sprague & Massoni, 2005).

The foregoing review could lead one to draw some discouraging conclusions: socially constructed beliefs and long-held stereotypes are difficult to change, and as such, women are likely to continually face discrimination and bias in the classroom. We argue here, however, that such conclusions might be misguided and that, in some cases, women might be preferred to men in the classroom. The match-up hypothesis (Kamins, 1990) and associative learning theory (Klein, 1991) inform our arguments.

The match-up hypothesis suggests that people are likely to have more positive evaluations of a target when they perceive a "fit" between that target and the specific context (see Kamins, 1990). For instance, within the advertising literature, researchers have shown that endorsers who are considered more credible (Ohanian, 1990) and appropriate (Till & Busler, 1998) are more effective than are their counterparts. A potential consumer might view an athlete endorsing an energy bar (i.e., a high "fit" situation) in a more positive light than the same athlete endorsing a mutual fund (i.e., a low "fit" situation) (see also Charbonneau & Garland, 2006; Till & Busler, 2000). These "fit" perceptions are important because they are thought to drive subsequent attitudes toward the product (Till & Busler, 2000) and intentions to consume that product (Cunningham, Fink, & Kenix, 2008; Fink, Cunningham, & Kensicki, 2004).

Associative learning theory (Klein, 1991) also speaks to these dynamics. From this perspective, different concepts are linked in people's minds to form an associative network of memory. These networks are important because, once two concepts are connected, one is activated every time the other is brought forth (Anderson, 1983; Till & Shimp, 1998). Returning to the advertising example, this means that when a given endorser is used to market a product, one's experiences and attitudes about both are summoned, and a link is developed. When the pairing is observed repeatedly, these network connections become stronger, such that over time, when one is observed, the other is brought to mind as well (Till & Busler, 2000). These associations are thought to be strongest when there is a "fit" or "match" between the two concepts, as is the case with the athlete and the energy bar (Lynch & Schuler, 1994; see also Cunningham et al., 2008).

What do the match-up hypothesis, associative learning theory, and advertising effectiveness have to do with teachers in the classroom? We argue here that just as consumers form expectations about who is the most effective product endorser (e.g., the athlete endorsing an energy bar), students also form expectations and have associative networks about who should teach certain classes. In most classrooms and disciplines (e.g., management, physics, chemistry), these linkages will call for a man, as students might hold the default assumption of "men are professors" (Basow, 1995, p. 663). In other situations, however, such as when discussing gender or women's issues, women might be preferred over men. Recall that the match-up hypothesis holds that "fit" is likely to be high when the target is perceived as credible, appropriate, and trustworthy (Ohanian, 1990; Till & Busler, 1998). When the course content focuses on women and women's issues, students might be more likely to attribute these characteristics to women than to men, as women might hold "first-hand knowledge" of these issues, thereby making them better suited to discuss them. This pattern is especially likely to hold when the course content focuses on bettering women's health and quality of life as opposed to when the focus is gender inequalities. In the latter case, questions of political bias might serve to limit the woman professor's credibility (Anderson & Smith, 2005; Abel & Meltzer, 2007).

In addition to the "fit" between the teacher's sex and the course content, the student's sex might influence perceptions of the teacher's effectiveness. The social categorization framework (Riordan, 2000; Tajfel & Turner, 1979; Turner, Hogg, Oakes, Reicher, & Wetherell, 1987) suggests that people have a preference for and exhibit more positive attitudes toward those who are perceived to be similar to the self. This pattern has been demonstrated in a number of studies set in the workplace, as researchers have found that people are likely to have high evaluations of similar others (Rink & Ellemers, 2006; Stauffer & Buckley, 2005). Similar effects have also been observed in the classroom, as Basow (1995) found that while the ratings of men were not influenced by the students' sex, ratings of women were: women received their highest ratings from women and their lowest ratings from men in the class. Likewise, Behling, Curtis, and Foster (1982) demonstrated that students working closely with instructors of the same sex evaluated them higher than did students who worked closely with instructors of the opposite sex. Collectively, this research suggests that teacher sex and student sex likely interact to predict subsequent teacher appraisals.


In drawing from the match-up hypothesis, the purpose of this study was to examine the influence of teacher sex and student sex on perceptions of fit and teacher effectiveness. To do so, we created a women's health lesson that students viewed on the computer. Specifically, they viewed one of two PowerPoint presentations with "voice-over" instruction from the teacher. The two presentations were identical in the content of the slides and lecture content; the only differences were the sex of the person delivering the lecture and the corresponding photo of the instructor in the corner of the screen.

Examining potential gender differences in teacher evaluations in this manner has several benefits beyond previous studies. First, unlike field studies of actual classroom instruction (e.g., Basow, 1995), we were able to standardize the content of the lesson and eliminate so-called masculine and feminine teaching styles (e.g., Hull & Hull, 1988), thereby providing greater internal validity. Second, our study moves beyond other experimental studies of teacher effectiveness (e.g., Abel & Meltzer, 2007) by providing a more realistic medium of instruction. In their study, Abel and Meltzer (2007) asked students to read a lecture provided by a hypothetical teacher and then to respond to questionnaire items. In our study, students received the instruction through a web-based delivery, an increasingly common form of instruction and training (Sitzmann, Kraiger, Stewart, & Wisher, 2006). Thus, while we increased our internal validity through the standardized instruction, we were also able to improve the ecological validity beyond other experimental studies by using our web-based approach.

In terms of our hypothesis development, the match-up hypothesis and associative learning theory both suggest that reactions to a target are likely to depend on the context in which the target is situated; for example, reactions to an athlete as an endorser are likely to be more positive when the product is sport related relative to when it is not (Charbonneau & Garland, 2006; Till, 2001). In relating this perspective to the current study, we expected that similar effects would be seen for the instruction of the women's health class, such that women were likely to be viewed as more credible and trustworthy than were men and, as a result, be perceived as a better fit for the class. More succinctly, we predicted that women would be perceived as a better fit as a teacher of the women's health class than would men (Hypothesis 1).

Also recall that the social categorization framework suggests that people are likely to have more favorable evaluations of others who are similar to the self (Riordan, 2000; Tajfel & Turner, 1979; Turner et al., 1987), and previous research has provided some support for this contention (Basow, 1995). In drawing from this literature, we predicted that student sex would moderate the relationship between teacher sex and perceived fit (Hypothesis 2).

Finally, fit perceptions are thought to be important because they drive subsequent attitudes (e.g., positive attitudes toward the product; Till & Busler, 2000) and behaviors (e.g., intentions to consume that product; Cunningham et al., 2008; Fink et al., 2004). In the context of the current study, persons who believe the teacher is a good fit for the class might also have more positive evaluations of that teacher. Thus, we predicted that fit would be positively related to subsequent evaluations of instruction (Hypothesis 3).



Students (n = 98) enrolled in an undergraduate health course voluntarily participated in the study. The sample consisted of 43 women (43.9%) and 55 men (56.1%). Most of the participants were White (n = 71, 72.4%), followed by Asian (n = 10, 10.2%), African American (n = 7, 7.1%), Hispanic (n = 7, 7.1%), and others (n = 3, 3.0%). The mean age was 20.49 years (SD = 1.85), and there was a relatively even distribution of first-year students (n = 28, 28.6%), sophomores (n = 27, 27.6%), juniors (n = 19, 19.4%), and seniors (n = 24, 24.5%).


Students viewed a lecture entitled "Sexually Transmitted Infections and Women", which mainly focused on (a) the primary burden of sexually transmitted infections in women, (b) how those infections are transmitted, and (c) the main types of infections, including their signs and symptoms. The presentation was made using a "voice-over" PowerPoint, where the students saw the PowerPoint slides and heard a person's voice narrating. A picture of the supposed teacher (Mary Williams or Doug Williams) was always present on the upper left portion of the screen. The lectures differed only by the picture and the person giving the lecture. Otherwise, every aspect of the lecture was the same, including the script from which the teacher drew. A sample slide is presented in Figure 1.

After listening to the lecture, the students were then directed to an online questionnaire, which asked them to provide their demographic information and to respond to items concerning the fit of the teacher to the lecture and the evaluation of the overall lecture. The multi-item measures were assessed on a 7-point Likert-type scale from 1 (strongly disagree) to 7 (strongly agree). The item mean was used to represent the final score for each measure.

Fit was measured with four items adapted from Till and Busler's (2000) and Cunningham et al.'s (2008) studies. While their original items were developed to assess the appropriateness of persons endorsing specific products, we adapted the items to fit the classroom context. The four items were: "I think this instructor is a good match for teaching this lesson", "I think the combination of this instructor and the lesson content is a good fit", "The instructor is well-suited to teach this lesson", and "The instructor corresponds well with this topic area". The reliability of the scale was high (α = .94).

Evaluation of instruction was measured with five items developed for the study: "I found the lesson to be very informative", "I learned a lot from this lesson", "I thought the lesson was poor" (reverse scored), "I would be pleased to take a course similar to this one in the future", and "Overall, I found the lesson to be quite effective". The reliability for this measure was also high (α = .88).
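The scoring just described (item means on a 7-point scale, one reverse-scored item, and a reliability coefficient) can be sketched as follows. This is a minimal illustration rather than the authors' procedure: the function names and the example responses are hypothetical, and the reliability formula is the standard Cronbach's alpha, which the article reports but does not spell out.

```python
from statistics import mean, variance

SCALE_MIN, SCALE_MAX = 1, 7  # 7-point Likert-type scale from the article


def reverse_score(value):
    """Flip a response so that 1 becomes 7, 2 becomes 6, and so on."""
    return SCALE_MIN + SCALE_MAX - value


def scale_score(responses, reverse_items=()):
    """Item mean for one participant; reverse_items holds 0-based item positions."""
    adjusted = [
        reverse_score(v) if i in reverse_items else v
        for i, v in enumerate(responses)
    ]
    return mean(adjusted)


def cronbach_alpha(rows):
    """Standard Cronbach's alpha; rows is a list of per-participant item lists
    (already reverse-scored where needed)."""
    k = len(rows[0])
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses to the five evaluation items; the third item
# ("I thought the lesson was poor") is the reverse-scored one.
example = [6, 6, 2, 5, 6]
print(scale_score(example, reverse_items={2}))
```

Reverse scoring before averaging keeps every item pointing in the same direction, so a higher scale score always means a more favorable evaluation.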


Means, standard deviations, and bivariate correlations were computed for all variables. We tested the hypotheses through observed path analysis using AMOS 16.0 (Arbuckle, 2007), a preferred approach for simultaneous testing of hypotheses in experimental studies (see also MacCallum & Austin, 2000). In specifying the model, teacher sex (0 = male, 1 = female), student sex (0 = male, 1 = female), and the teacher sex x student sex interaction term served as exogenous variables while fit and evaluation of instruction served as endogenous variables. The root mean square error of approximation (RMSEA) and comparative fit index (CFI) were both used to examine how well the data fit the model.
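The dummy coding and interaction term described above can be illustrated with a short sketch. The function name and dictionary keys are hypothetical; only the 0/1 coding and the product term come from the text, and the model itself was estimated in AMOS, not with this code.

```python
from itertools import product


def code_condition(teacher_is_female, student_is_female):
    """Return the three exogenous variables for one participant."""
    teacher_sex = 1 if teacher_is_female else 0   # 0 = male, 1 = female
    student_sex = 1 if student_is_female else 0   # 0 = male, 1 = female
    return {
        "teacher_sex": teacher_sex,
        "student_sex": student_sex,
        # Product term: the teacher sex x student sex interaction
        "interaction": teacher_sex * student_sex,
    }


# The four cells of the 2 x 2 design
design_cells = [code_condition(t, s) for t, s in product([False, True], repeat=2)]
for cell in design_cells:
    print(cell)
```

With this coding the interaction term equals 1 only for female students who viewed the female teacher. In an uncentered model of this kind, the coefficient on teacher sex describes the teacher effect for the reference group (student sex = 0, male students).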



Descriptive statistics are presented in Table 1. The mean scores for fit (M = 5.72, SD = 1.23) and evaluation of instruction (M = 5.39, SD = 1.18) were both greater than the midpoint of the scale (4), suggesting that, as a collective, students enjoyed the lecture. From a direct effects perspective, the teacher's sex was not related to either perceived fit (r = .19) or evaluation of instruction (r = .02). The latter two variables did hold a significant, positive association with one another (r = .70).
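As a rough check on which correlations reach significance, a correlation can be converted to a t statistic with n - 2 degrees of freedom. The sketch below assumes that all 98 participants contributed to each correlation and uses an approximate two-tailed .05 critical value of 1.985 for 96 degrees of freedom taken from a standard t table; neither detail is stated in the article, and the function names are hypothetical.

```python
import math

N = 98              # sample size reported in the article
T_CRITICAL = 1.985  # assumption: approximate two-tailed .05 cutoff for df = 96


def t_from_r(r, n=N):
    """t statistic for testing a Pearson correlation against zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


def is_significant(r, n=N, t_critical=T_CRITICAL):
    """True when |t| exceeds the assumed two-tailed critical value."""
    return abs(t_from_r(r, n)) > t_critical


for label, r in [("teacher sex and fit", 0.19),
                 ("teacher sex and evaluation", 0.02),
                 ("fit and evaluation", 0.70)]:
    print(label, round(t_from_r(r), 2), is_significant(r))
```

Under these assumptions, r = .19 falls just short of the cutoff while r = .70 clears it easily, which agrees with the description in the text.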


Hypothesis Testing

An illustrative summary of the path analysis is presented in Figure 2. Results indicate that the model was a close fit to the data: χ²(3) = 4.95, p = .18; χ²/df = 1.65; RMSEA = .08 (90% CI: .00, .20), p close = .26; CFI = .99. In all, the model explained 12.0% of the variance in fit and 49.2% of the variance in evaluation of instruction.
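Several of the reported fit statistics can be reproduced from the chi-square value, its degrees of freedom, and the sample size. The sketch below uses a common textbook formula for the RMSEA point estimate and a closed-form tail probability that holds only for 3 degrees of freedom; the function names are hypothetical, and AMOS may differ in small computational details.

```python
import math

CHI2, DF, N = 4.95, 3, 98  # values reported in the article


def chi_square_ratio(chi2, df):
    """Normed chi-square: chi-square divided by its degrees of freedom."""
    return chi2 / df


def rmsea(chi2, df, n):
    """Textbook RMSEA point estimate: sqrt(max(chi2 - df, 0) / (df * (n - 1)))."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def chi_square_p_df3(chi2):
    """Upper-tail probability of a chi-square variate with exactly 3 df."""
    return (math.erfc(math.sqrt(chi2 / 2))
            + math.sqrt(2 * chi2 / math.pi) * math.exp(-chi2 / 2))


print(round(chi_square_ratio(CHI2, DF), 2))  # normed chi-square
print(round(rmsea(CHI2, DF, N), 2))          # RMSEA point estimate
print(round(chi_square_p_df3(CHI2), 2))      # p value for the chi-square test
```

The confidence interval and the closeness-of-fit p value for the RMSEA, as well as the CFI, require the noncentral chi-square distribution or the baseline model and are not reproduced here.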

Hypothesis 1 predicted that women teaching the women's health course would be viewed as a better fit than would men. This hypothesis was not supported, as there were no differences in the fit ratings of women and men (β = .01, p = .95).

We did find, however, that student sex moderated the aforementioned relationship, as the teacher sex x student sex interaction term was significantly related to fit (β = .35, p < .05). The nature of the interaction is depicted in Figure 3. Evaluations of the male teacher's instruction did not differ between male and female students. However, female students rated the female teacher's instruction higher than did male students. Thus, Hypothesis 2 was supported.

Finally, Hypothesis 3 received full support. The more students believed the teacher was a good fit with the lesson, the better they evaluated that person (β = .70, p < .001).


The primary purpose of this study was to examine the relationships among gender, perceptions of fit, and teaching evaluations within the educational setting. Applying the match-up hypothesis (Kamins, 1990), associative learning theory (Klein, 1991), and the social categorization framework (Riordan, 2000; Tajfel & Turner, 1979; Turner et al., 1987), our results identify a variable previously overlooked in discussions of the gender disparities that exist among college professors: the perception of fit. Specifically, our findings reveal that the gendered nature of viewers' fit perceptions carries important implications for instructors and, to some extent, students. These implications are discussed in greater detail below.


That the difference in fit perceptions for female and male instructors was not statistically significant in general suggests findings of practical importance. Specifically, and as Suen (1992) suggests, it indicates uncertainty, some of which is reduced by identifying the importance of individual characteristics (i.e., supporting Hypotheses 2 and 3). It is also plausible, however, that additional variance could be explained simply by considering the socially constructed assumptions and stereotypes accorded to the teaching profession itself. As mentioned previously, the default conceptualization of a professor is a man (Basow, 1995). Others have suggested, however, that course content moderates this relationship. Early research by Mackie (1976), for example, demonstrated that students perceived males as more competent business professors than females, and females as more competent professors within the humanities and social sciences. Basow and Silberg (1987) found that male and female students evaluated male faculty members in the natural sciences higher than female faculty members. Consistent with these findings, our results suggest that biases may still exist within the teaching and professorial fields. Simply put, the assumption may still exist that men are professors, but that women too may be professors when the content relates exclusively to women. To the extent that broad gender-based assumptions and stereotypes inform students' evaluations of teachers (e.g., Basow et al., 2006), their relationship to perceived fit warrants further inquiry.


As mentioned above, the failure to support our first hypothesis would lead some to claim that (a) the match-up hypothesis lacks utility in the educational context and that (b) the assumption of males as professors is no longer valid. However, our support of Hypotheses 2 and 3 refutes these points. Female viewers perceived female instructors as a better fit for a lesson on women's health than did male viewers. Female instructors were also evaluated more positively by female viewers than they were by male viewers. These findings extend previous work by identifying both similarity and perceived fit as potential explanations for differential evaluations of instructors. For example, Strange, Oakley, and Forrest's (2003) study of sex education preferences among female students revealed that female teachers are sometimes preferred, as they have the ability to relate to young women better than male teachers. Thus, female students may have viewed the female instructor as more relatable to themselves, more credible, and more appropriate for addressing women's health issues (Klein, 1991; Ohanian, 1990; Tajfel & Turner, 1979; Till & Busler, 1998; Turner et al., 1987).

The finding that females viewed the female teacher as better suited to teach the women's health lecture may also have important implications for the health behaviors of young women. According to the Centers for Disease Control and Prevention, the highest prevalence of sexually transmitted infections among sexually active females occurs among women 20 to 24 years of age. Research also suggests that meeting the health needs of a certain group within a particular context is best accomplished by first assessing the target group's attitudes toward educational approaches (i.e., identifying individual needs and preferences; see Kim & Free, 2008). Thus, by demonstrating that the college-aged females in the current study viewed the female instructor as better suited and more favorably than the male instructor, we identify a potential need that female college and university students possess. Namely, aligning professor sex with lesson and/or content may ensure that young females get the most from educational information regarding their sexual health. Additional work is needed to assess the potential effects of this fit on subsequent learning outcomes and student behaviors.


Despite the strengths of the study, there are some potential limitations. First, the use of online instruction in this study may elicit criticism. Likewise, our web-based instrument may draw scrutiny. As mentioned previously, however, our web-based approach was timely and valid (Dillman, 2000; Sitzmann et al., 2006). Further, it is practical, as web-based instruction on the topic of women's health is used among medical students themselves (see Zebrack, Mitchell, Davids, & Simpson, 2005). By standardizing our lesson across participants, we were able to control for several intervening variables. As such, we enhanced our internal validity. Utilizing web-based instruction and assessment also allowed us to generate participant responses through a realistic instructional medium. Thus, our ecological validity was also bolstered. While we acknowledge that these methodological strengths also come with weaknesses, our design allowed us to contribute to the literature by identifying a previously overlooked aspect of the teacher-student relationship: perceptions of fit.

While we had great strength in our internal and ecological validities, the geographic region in which this study took place may limit our external validity. Participants were students enrolled at a large Southern university, rich in conservative traditions. Reflective of this culture, the recall of and adherence to traditional sex-role stereotypes by these students may be different than for students at other universities. As such, perceptions of fit toward male and female professors may also differ. The findings from Cifuentes and Smith's (2001) study of culture and online education support this rationale, as the authors demonstrated that cultural meanings informed expectations of professors and students. Thus, while the current study contributes to the literature by identifying perceptions of fit as an important factor within education, we encourage future projects to incorporate broader social and contextual meanings into their inquiries.


Taken together, our findings identify an important factor involved in the historically gendered evaluative process of teachers: perceptions of fit. Specifically, and consistent with the match-up hypothesis, associative learning theory, and the social categorization framework, fit perceptions were influenced by student characteristics and course content such that female students perceived a better fit between the female instructor and the women's health lecture than between the male instructor and the women's health lecture. Consequently, female students evaluated the female instructor higher than her male counterpart. These findings, and the manner in which they were gathered, contribute to the literature by identifying both the theoretical and practical significance of "fit" within the classroom.


George B. Cunningham, PhD

Melanie L. Sartore, PhD

J. Don Chaney, PhD

Elizabeth Chaney, PhD

George B. Cunningham, PhD, is affiliated with the Laboratory for Diversity in Sport at Texas A & M University. Melanie L. Sartore, PhD, is affiliated with the Department of Exercise and Sport Science at East Carolina University. J. Don Chaney, PhD, is affiliated with the College of Health and Human Performance, Department of Health Education and Behavior at University of Florida. Elizabeth Chaney, PhD, is affiliated with the Department of Health Education and Behavior at University of Florida. George B. Cunningham, PhD, Laboratory for Diversity in Sport, Texas A & M University, 4243 TAMU, College Station, Texas 88843-4243, Phone: 979.458.8006, Email:
Figure 1. Sample slide from women's health lecture.

The Primary Burden of STIs

* STIs, once called venereal diseases, are
  among the most common infectious
  diseases in the United States today.
* More than 20 STIs have now been
* ~13 million men and women are affected
* They are most prevalent among teenagers
  and young adults. Nearly two-thirds of all
  STIs occur in people younger than 25 years
  of age.

Table 1

Variable                        M      SD     1      2        3         4

1. Teacher sex                  .55    .50    --
2. Student sex                  .44    .50    .05    --
3. Fit                          5.73   1.23   .19    .22 *    --
4. Evaluation of instruction    5.39   1.18   .02    .26 *    .70 ***   --
