Girls did indeed tend to speak more readily of the animated shapes as if they were thinking, feeling characters in a story, though for the most part this trait did not have any relation to amniotic testosterone levels. Girls also scored higher on the Empathy Quotient, but their scores likewise bore no relation to prenatal testosterone. (There was such a relationship for boys, however). Only the Reading the Mind in the Eyes Test results showed the expected negative correlation with amniotic testosterone, and that test failed to show the expected sex difference in favor of girls.
Empathy --- or the lack thereof --- is crucial to Simon Baron-Cohen's conception of autism, and also to his conception of the "essential difference" between the sexes.
In this 2006 paper published in Social Neuroscience (full text here), a group of researchers affiliated with Cambridge University looked for a relationship between children's scores on two measures of empathy, the Reading the Mind in the Eyes test and a children's version of the Empathy Quotient (EQ-C), and the amount of testosterone that was present in their mothers' amniotic fluid.
The two tests were given in separate experiments, to different groups of children whose mothers all underwent amniocentesis in the Cambridge region of the U.K. between June of 1996 and June of 1999; at the time of testing, the children were all between six and nine years old. The EQ study involved 193 children --- 100 boys and 93 girls --- and had those children's parents fill out the questionnaire about the children, rather than having the children take the test themselves. Three outliers --- one boy and two girls --- were left out of the statistical analysis, leaving 99 boys and 91 girls.
(Figure 1, in Chapman et al., 2006 --- scatter plot showing the distribution of boys' and girls' scores on the children's Empathy Quotient in relation to the amount of testosterone, expressed in nanomoles per liter, present in their mothers' amniotic fluid)
There were significant sex differences in EQ-C scores: the boys' average score was 32.62 (out of a possible 58), while the girls' was 39.12. That's a difference of roughly three-quarters of a standard deviation (effect size d = 0.76). Fetal-testosterone levels correlated significantly with EQ-C scores when both sexes' data were analyzed together, and also in the within-sex analysis for boys; for girls, however, there was no relation between fetal testosterone and EQ-C score.
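As a rough check on the size of that gap: if d here is Cohen's d (the difference between group means divided by a pooled standard deviation), then the reported means and d together imply a pooled SD. The EQ-C standard deviations aren't quoted in this post, so the sketch below works backward from d. The variable names are mine, and the reading of d as Cohen's d is an assumption.

```python
# Back out the pooled SD implied by the reported EQ-C means and effect size.
# Assumption: d is Cohen's d = (mean difference) / (pooled SD).
boys_mean = 32.62
girls_mean = 39.12
d = 0.76

mean_diff = girls_mean - boys_mean   # gap between the group means, in EQ-C points
implied_pooled_sd = mean_diff / d    # SD that would produce d = 0.76 for that gap

print(f"mean difference:   {mean_diff:.2f}")
print(f"implied pooled SD: {implied_pooled_sd:.2f}")
```

Under that assumption, the 6.5-point gap corresponds to a pooled SD of about 8.5 points.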
Because two of the variables being looked at --- amniotic testosterone level and sex --- were so closely related (in this study they were correlated at r = -0.65), the researchers ran a further analysis to try to isolate the effect of each one:
A combined sex analysis showed there to be a significant negative correlation between fT level and performance on the EQ-C: r(193) = -.28, p < .01. However, there was also a significant difference between girls' and boys' EQ-C scores. Within-sex analyses revealed that there was a significant correlation between fT and EQ-C scores for the boys: r(99) = -.35, p ≤ .01, but not for the girls. The fact that a correlation is observed between fT and EQ-C for the boys may in part be due to a larger variation in fT levels for boys (0.10-2.05 nmol/L) compared to girls in this study (0.05-0.85 nmol/L). We investigated the influence of fT and sex on EQ-C scores by running a stepwise analysis, which revealed a main effect of sex, but not fT in the final model. The strong correlation between sex and fT means that fT cannot be ruled out as a factor in producing the observed sex difference, but it is clear that the sex difference is larger than that which would be predicted by fT alone.
I can think of something that might be at work here, that would explain a wider gap between the sexes than differences in amniotic testosterone levels would account for: remember that the metric used to measure empathy in this (quasi-)experiment, the Empathy Quotient, isn't a direct assessment of a skill but rather asks you to agree or disagree that various general statements (like "I can easily tell if someone else wants to enter a conversation" or "It is hard for me to see why some things upset people so much") describe you. Also remember that, in this instance, the person answering the questions isn't answering them about hirself, but is answering them for hir children. Both of those aspects of the EQ-C allow for quite a bit of subjective wiggle room --- even when you are answering questions about yourself, if the questions are fairly general in nature (like "Are you a 'people person'?" or "Are you good at math?"), the same person might answer them differently at different times or in response to different cues. (One of these cues, as Cordelia Fine persuasively argues in Delusions of Gender, is the checkbox at the top of many standardized tests --- including the EQ and SQ --- that asks you to specify whether you are male or female).
The "Mind in the Eyes Test" experiment involved a much smaller study population: 39 boys and 37 girls, from the same cohort of six-to-nine-year-old children as the participants in the EQ-C study. The test involves looking at a series of pairs of eyes and choosing (from four available words) the word that best describes what the owner of those eyes is feeling.
(Image: some of the eyes used in the test)

The children's mean scores didn't really differ between the sexes --- 15.23 (out of 28 possible) for boys versus 16.29 for girls, with standard deviations of 3.50 for boys and 3.29 for girls. Looking at the graph, though, I get the impression that, although the score distributions for each sex sit pretty much right on top of each other, there are more low scorers among the boys than among the girls.
(Figure 2, in Chapman et al., 2006 --- scatter plot showing the distribution of boys' and girls' scores on the children's version of the Mind in the Eyes Test in relation to levels of testosterone present in amniotic fluid)
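To put a number on "didn't really differ," here is a minimal sketch that computes Cohen's d and a Student's t statistic from the summary statistics quoted above. It assumes all 39 boys and 37 girls entered the comparison and that an ordinary pooled-variance t-test is appropriate; the paper may have tested the difference another way. The function name is my own.

```python
import math

def cohens_d_and_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Cohen's d and Student's t for two independent groups, from summary stats."""
    # Pooled variance weights each group's variance by its degrees of freedom.
    pooled_var = ((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / (n_a + n_b - 2)
    pooled_sd = math.sqrt(pooled_var)
    d = (mean_b - mean_a) / pooled_sd
    # For a pooled-variance test, t is d rescaled by the group sizes.
    t = d / math.sqrt(1 / n_a + 1 / n_b)
    df = n_a + n_b - 2
    return d, t, df

# Boys first, girls second: means, SDs and group sizes as reported in the post.
d, t, df = cohens_d_and_t(15.23, 3.50, 39, 16.29, 3.29, 37)
print(f"d = {d:.2f}, t({df}) = {t:.2f}")
```

With these inputs the sketch gives d of about 0.31 and t of about 1.36 on 74 degrees of freedom. That is short of the roughly 1.99 needed for two-tailed significance at .05, which fits the lack of a reliable sex difference on this test.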
Weirdly, although no sex differences were found, the researchers did see a relationship between amniotic testosterone levels and Mind in the Eyes test scores. This relationship held true within each sex, as well as in the combined-sex analysis. The correlation was somewhat weaker for the girls, though: r = -0.29, as opposed to the boys' r = -0.42 and both sexes' r = -0.43.
There was also another variable that correlated significantly (r = 0.29) with performance on this test: the age of the child. That just makes sense --- children's verbal fluency, degree of insight and social intelligence all improve as they develop. (They also measured the children's IQs, and found no relationship between verbal IQ and test scores, though. So that aspect of development seems not to be implicated in this study).
Another study (full text here) published that year in Hormones and Behavior --- by many of the same researchers who worked on the one I already described --- looked for a relationship between amniotic testosterone and a different, somewhat whimsical, measure of empathy: the proportion of words relating to mental states that children used to describe the movements and interactions of cartoon shapes in short, wordless films.
Here's a description of the short films the children (25 boys and 14 girls, all four years old) were shown:
Computer-presented animations were provided by Fulvia Castelli and were used in the studies by Abell et al. (2000) and Castelli et al. (2000). The animations showed one large red and one small blue triangle moving around a screen which contained a rectangular enclosure. One animation (random) showed the triangles moving about purposelessly (bouncing off the sides) and not interacting with each other (this was chosen from a possible set of 4). Two animations were designed to convey ToM (these were taken from a possible set of 6). One film showed the big triangle coaxing the little one out of the enclosure (see Fig. 1). One showed the little triangle hiding behind a door and surprising the big triangle. Sequences lasted 34 to 45 s each. Children watched the film once and were then asked what was happening. They were then asked to describe the film as it was playing (to reduce the memory burden). For the random film, only these initial descriptions were recorded. For the ToM films, after the first descriptions were given, children were asked a series of questions designed to elicit more information and to encourage them to view the sequence in human terms. ... The children's narratives were tape-recorded and then transcribed.
Later, researchers who hadn't participated in the interviews read through the transcripts and counted the number of "mental-state terms" (words describing thoughts, plans, wishes, motives, etc.), "affective-state terms" (words describing feelings), "intentional propositions" (phrases describing the animated shapes as characters doing things), and "neutral propositions" (phrases that just describe how the shapes are moving, or what they look like, without any attempt to turn them into characters in a story) that appeared.
Because an earlier study found that autistic children were less likely than typically developing children to ascribe mental states to similar animated triangles, the authors of this study (Rebecca Knickmeyer, Simon Baron-Cohen, Peter Raggatt, Kevin Taylor and Gerald Hackett) predicted 1) that the girls in this study would use more mental- and affective-state terms than the boys, and 2) that, within each sex, the concentration of testosterone found in each child's mother's amniotic fluid would vary inversely with that child's use of mentalizing language.
They pretty much found the sex differences they expected to find: the girls used significantly more affective-state terms than the boys did (d = 0.82), the boys used significantly more neutral propositions than the girls did (d = 0.63), and there was a trend, just barely missing the cutoff for statistical significance (d = 0.62), for the girls to use more intentional propositions. Their results were more ambiguous, however, in their support (or nonsupport) of the proposed link with fetal testosterone exposure.
Two of the outcome variables, mental-state terms and affective-state terms, showed no relationship to amniotic testosterone levels whatsoever; two others, intentional propositions and neutral propositions, showed a significant relationship* to testosterone when all the subjects' data were analyzed together, but that relationship failed to show up in the within-sex analyses**, which suggests that the relationship observed for the pooled data might just be an artifact of the sex differences I described in the last paragraph.
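To see how a pooled correlation can appear with no within-sex relationship behind it, here is a small simulation sketch. Every number in it is invented for illustration and none comes from the paper. Two groups differ in average testosterone and in average score, but within each group the two variables are drawn independently.

```python
import random

def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable

def simulate_group(n, ft_mean, ft_sd, score_mean, score_sd):
    """Draw fT and score independently: no within-group relationship by design."""
    ft = [rng.gauss(ft_mean, ft_sd) for _ in range(n)]
    score = [rng.gauss(score_mean, score_sd) for _ in range(n)]
    return ft, score

# Hypothetical parameters: higher fT and lower scores for "boys", the reverse for "girls".
boys_ft, boys_score = simulate_group(2000, 1.0, 0.30, 10.0, 3.0)
girls_ft, girls_score = simulate_group(2000, 0.4, 0.15, 12.0, 3.0)

boys_r = pearson_r(boys_ft, boys_score)
girls_r = pearson_r(girls_ft, girls_score)
pooled_r = pearson_r(boys_ft + girls_ft, boys_score + girls_score)

print(f"boys only:  r = {boys_r:+.2f}")
print(f"girls only: r = {girls_r:+.2f}")
print(f"pooled:     r = {pooled_r:+.2f}")
```

The within-group correlations hover near zero, while the pooled correlation comes out clearly negative, purely because the group means differ on both variables. This shows only that such an artifact is possible, not that it is what happened in the study.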
These data, along with the data from the first study I described in this post, provide only incomplete, ambiguous support for the idea that testosterone exposure during development suppresses a person's ability to imagine another person's mental state.
Both studies also fail to make the case that the link with testosterone exposure is strong enough to stand on its own, rather than simply being another indicator of sex differences, which may have any number of biological, psychological, social or cultural causes. Is testosterone driving this sex difference, or merely reflecting it?
*They tested for this relationship in two ways: first, they just looked for Pearson correlations (r) between all the variables they had measured; next, they used hierarchical regression analyses to try to isolate the contributions of individual variables --- to test each one to see how well it predicts how much of each type of language any given child's description will contain. Only one of the Pearson correlations --- that between testosterone and use of intentional propositions --- was significant, at r = -0.43. The other kind of significant relationship Knickmeyer and colleagues are talking about is one that produces a significant F change when it is included in their statistical model. Fetal testosterone produced such a change for both intentional and neutral propositions.
**Neither sex showed such a relationship when use of intentional propositions was the outcome variable of interest; when they looked at neutral propositions, there was a significant relationship to amniotic testosterone levels among the boys, but not among the girls.
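For readers unfamiliar with the "F change" mentioned in the first footnote: in a hierarchical regression, it tests whether adding a predictor raises the model's R-squared by more than chance would. Below is a minimal sketch of the standard formula. The R-squared values and the number of predictors are hypothetical, chosen only to show the arithmetic, and the sample size is simply the 39 children in this study.

```python
def f_change(r2_reduced, r2_full, n, k_full, k_added):
    """F statistic for the R-squared gained by adding k_added predictors.

    n is the sample size and k_full is the number of predictors in the larger model.
    """
    gained_per_predictor = (r2_full - r2_reduced) / k_added
    unexplained_per_df = (1 - r2_full) / (n - k_full - 1)
    return gained_per_predictor / unexplained_per_df

# Hypothetical example: adding fT as a third predictor lifts R-squared from 0.10 to 0.25.
f = f_change(r2_reduced=0.10, r2_full=0.25, n=39, k_full=3, k_added=1)
print(f"F change = {f:.2f} on (1, {39 - 3 - 1}) degrees of freedom")
```

A large F change means the added variable explains variance that the earlier predictors did not. As the post argues, that is still not the same as showing the variable acts independently of sex.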
Chapman, E., Baron-Cohen, S., Auyeung, B., Knickmeyer, R., Taylor, K., & Hackett, G. (2006). Fetal testosterone and empathy: Evidence from the Empathy Quotient (EQ) and the "Reading the Mind in the Eyes" Test. Social Neuroscience, 1 (2), 135-148. DOI: 10.1080/17470910600992239
Knickmeyer, R., Baron-Cohen, S., Raggatt, P., Taylor, K., & Hackett, G. (2006). Fetal testosterone and empathy. Hormones and Behavior, 49 (3), 282-292. DOI: 10.1016/j.yhbeh.2005.08.010