Sunday, January 3, 2016

Correlation and Causation Revisited

NOTE: I am trying to fix the links in a number of old posts, and have reproduced a number of essays from my now defunct blog in order to allow myself to create working links to them.

I wrote on this topic before, but as it has arisen again in the comments to one of my posts, I feel the need to examine it once more.

I wrote earlier that I find many reasons to doubt the genetic theory of homosexuality. In the comments to that post, a reader disputed my conclusions, first arguing that the cause was biological but not genetic, and then offering as proof studies showing that male children with more elder brothers by the same mother were more likely to become homosexual. Not having seen the studies myself, I take this claim at face value, but I still have to say it is far from conclusive. It is a simple correlation, with no mechanism offered to explain why or how such a pattern would favor homosexuality.

Actually, the claim is less impressive than it sounds, as birth order often affects behavior. Now, the reader did point out that it holds only for children born to the same mother, but that still does not dismiss the possibility of a psychological explanation, as full siblings versus half siblings, as well as intact families versus composite families, make many differences in a child's behavior. As I responded, children with more brothers learn to ride a bicycle earlier, tend to be more comfortable with fighting, have a different self-image, learn to speak earlier and so on. That there would be additional differences depending on whether they are full or half brothers does not surprise me. At the very least, it does not force me to conclude the correlation proves there is some biological causation at work. In fact, being a simple correlation with no explanation offered, it could simply be a coincidence.

And that brings me to my point. Correlation is not causation. As I pointed out before, the fondness of sex offenders for pornography does not prove pornography causes sexual violence. It is just as possible that the same trait causes both. Or perhaps a predisposition to sexual violence leads one to favor pornography. Or it is even possible (though unlikely in this case) that the two are simply coincidental with no relation between them.

And that is the case here. The studies say nothing more than that a child with more full brothers will tend to be homosexual more often. Now, yes, it is possible that in having a number of male children a mother's body somehow changes to produce a more homosexual-friendly gestational environment, though the fact that no mechanism is provided makes me wonder exactly how this would work. On the other hand, that a child with more full siblings, which presumes an intact household, would be homosexual due to psychological factors is quite possible as well. After all, he would be "Momma's little baby", mocked by his brothers for the doting his mother bestows on him; it is easy to imagine all manner of psychological factors which would favor turning from traditional male roles. Not that I am suggesting that is the cause, just that it is relatively easy to imagine such explanations. In fact, it is easy to see why full brothers would exert more influence than half brothers, as half brothers, being alien to half of the family, would lack the comfort necessary for such teasing, and so would have much less influence.

Or, we can reject such pop psych theorizing, as I do in reality, and propose that this is simply one of those coincidences which occur when one starts gathering statistics. I don't know how familiar the average reader is with statistical studies, but it was an area that fascinated me in college and afterward, so I have seen a lot of such interesting patterns that prove to be absolutely meaningless. 

One example is caused by the poor understanding people have of random chance. As has been pointed out before, if you take one marble for each case of childhood leukemia and roll them onto a map of the US, there will be "clusters", and by chance some of those clusters will be near one of the 100 largest chemical waste dumps. However, that does not prove the dumps caused leukemia, as some of those clusters will also be near one of the 100 largest bowling alleys. Chance does not mean numbers will be distributed evenly; an even distribution is far from random. No, random data will form clumps and empty patches, and depending on how you draw your circle, you can clearly find something over or under represented within the circle around whatever object you choose as a center.
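The marble experiment is easy to try for yourself. What follows is only a rough sketch, not anything taken from a real study: the "map" is a plain unit square, the 2000 marbles, the 100 "sites" and the radius of 0.05 are numbers I made up for illustration, and the function names are my own. It scatters marbles uniformly at random, counts how many land in each cell of a grid, and then counts how many land within a circle around each arbitrarily placed site.

```python
import random


def scatter(n_points, seed):
    """Return n_points positions placed uniformly at random on a unit square."""
    rng = random.Random(seed)
    return [(rng.random(), rng.random()) for _ in range(n_points)]


def grid_counts(points, cells_per_side):
    """Count how many points fall in each cell of a square grid."""
    counts = [0] * (cells_per_side * cells_per_side)
    for x, y in points:
        col = min(int(x * cells_per_side), cells_per_side - 1)
        row = min(int(y * cells_per_side), cells_per_side - 1)
        counts[row * cells_per_side + col] += 1
    return counts


def count_near(points, site, radius):
    """Count the points lying within `radius` of a chosen site."""
    sx, sy = site
    return sum(1 for x, y in points
               if (x - sx) ** 2 + (y - sy) ** 2 <= radius ** 2)


# Hypothetical numbers: 2000 "marbles" rolled onto the map.
marbles = scatter(2000, seed=1)

# An even spread would put the same number in every cell; chance does not.
counts = grid_counts(marbles, cells_per_side=10)
expected_per_cell = len(marbles) / len(counts)
print("expected per cell:", expected_per_cell)
print("fullest cell:", max(counts), "emptiest cell:", min(counts))

# 100 arbitrary sites (dumps, bowling alleys, anything) placed at random,
# entirely independent of where the marbles fell.
sites = scatter(100, seed=2)
near = [count_near(marbles, site, radius=0.05) for site in sites]
print("most marbles near a site:", max(near), "fewest:", min(near))
```

Nothing in the code connects the sites to the marbles, yet some sites end up with noticeably more marbles nearby than others, which is all a "cluster" near a dump or a bowling alley needs to be.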

But even if we assume that the pattern described is not just due to chance, without any proposed explanation, how do we know there is a causative relationship? Many times two factors occur together; in fact, it is so common that statisticians developed the term correlation to describe it. However, correlation does not mean there is a causative relationship. Sometimes there is, though sometimes it is unclear in which direction the causation flows. However, many times, both are the outcome of a separate cause. Or perhaps there is no clear causation, but the two are both more probable in the presence of a third factor. (And sometimes, as I mentioned before, there is just chance. Though we are excluding that for now.)
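The "separate cause" case can also be shown with a small simulation. This is a sketch under assumptions of my own choosing: a hidden yes/no factor present in half the cases, two measurements `a` and `b` that each get a boost of 2.0 when the factor is present, and ordinary random noise on top. Neither measurement has any effect on the other; the Pearson correlation is computed by hand so nothing beyond the standard library is needed.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(0)
rows = []
for _ in range(5000):
    hidden = rng.random() < 0.5          # the third factor (assumed, unobserved)
    boost = 2.0 if hidden else 0.0
    a = boost + rng.gauss(0, 1)          # a depends only on the hidden factor
    b = boost + rng.gauss(0, 1)          # b depends only on the hidden factor
    rows.append((hidden, a, b))

# Looking at everyone together, a and b appear clearly related.
overall = pearson([a for _, a, _ in rows], [b for _, _, b in rows])

# Looking only at cases where the hidden factor is absent, the link vanishes.
without = [(a, b) for hidden, a, b in rows if not hidden]
within = pearson([a for a, _ in without], [b for _, b in without])

print("correlation overall:", round(overall, 2))
print("correlation with hidden factor held fixed:", round(within, 2))
```

The overall correlation comes out well above zero while the within-group one sits near zero, so the apparent link between `a` and `b` is entirely the work of the third factor.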

And that is the problem with this argument: there is no mechanism proposed. If there were one, for example that depletion of maternal testosterone leads to the feminizing of the embryo and fetus, leading to a confused sexual identity (or whatever mechanism is proposed; I am simply pulling something out of thin air here, and trust me, it is about as close to nonsense as you can get, and I freely confess as much), then we could test for that more directly and avoid such vague tests. We could use maternal blood work to determine if children were more often born homosexual from mothers with low testosterone levels*. But without such a mechanism, we are left with these vague statistical surveys, which can be explained as often by psychology as by biology, and even more often by chance.

But, as I said, I wrote on this before, several times, and I think I have probably said more than I need to on this single topic. So, for now, I will leave this topic alone. Should it come up again, perhaps I will write a more general examination. For now I simply wanted to point out a problem I have often seen with arguments from statistical sources.

And now I hope to get back to completing all those essays I promised to write when I got back from vacation. I think I have completed three of seventeen, so I have a lot of work ahead of me.


* Then again, as maternal hormone levels change her behavior, such a study would still not exclude the possibility of psychological explanations.


To avoid the clutter of lots of links, here are my past posts on related topics:

Correlation versus Causation
Violence and Culture
A Common Mistake
Old Ideas?
Sampling Changes and Fictional Trends 
As should be clear, I have tried to avoid statistical topics. Though I am a fascinated amateur, I know most others find the topic rather dull.

Originally posted in Random Notes on 2008/12/15.
