Bayes Theorem

Conversely, if a typewriter turns up that does have both proportional spacing and changeable fonts, then the probability of a hoax declines. As Kevin says, the fact that nobody knows of such a typewriter doesn't mean one doesn't exist. However, consider this: How many people read these weblogs? How many people read Instapundit, Atrios, Kevin Drum, and so forth? Maybe a million? What are the odds that, if such a typewriter exists, not one of these people has heard of it? Seems pretty small to me. I'd say we are on firm footing saying no such typewriter exists. This means that even a die-hard, believe-anything-that-tars-Bush type of person should be revising their initial assessments of these documents downwards (or at the very least not upwards!). If you are not, you are irrational. You are refusing to believe what the evidence is telling you. You are as dogmatic as any Young Earth Creationist.

So unless somebody comes up with a typewriter that can do both proportional spacing and change fonts, DailyKos poster Hunter is being irrational. Atrios is being irrational. And of course, Oliver Willis lost it a while ago. Also, let's not forget that Kevin has a list of items that make him skeptical of the TANG documents. Each of Kevin's points would keep driving the probability of "no hoax" down, and the probability of a "hoax" (i.e., forgeries) upwards. Kevin is right to be skeptical. Atrios, Hunter, and Willis are being really stupid.

(Cautionary Note: The numbers above are hypothetical numbers. Their purpose is to highlight how approaching this issue from a probabilistic standpoint works. As new information comes out that appears to hurt your initial stance, you should lower the probability of that initial stance being the correct one. We are not seeing that by and large from the "Left". At the same time, these things are not necessarily conclusive proof of a forgery/hoax, but it does look quite bad for CBS and Dan Rather.)

Update: I thought an update on iterative applications of Bayes Theorem is also in order. Kevin, as noted, has a number of bullet points that indicate something fishy going on here. If we assume that each bullet point has the same effect as the one quoted above about proportional spacing and changeable fonts, then after only three iterations of Bayes Theorem the probabilities have reversed themselves. That is, after three of the bullet points you should be highly skeptical of the TANG documents (assuming my numbers above are reasonable). By the fourth one the probability of forgery is getting quite close to one.
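Here is a minimal sketch of that iteration in Python. The inputs are assumptions chosen purely for illustration (a prior of 0.10 for forgery, and each bullet point being 0.90 likely under forgery versus 0.20 likely if the documents are authentic); they are not the numbers used in this post. The sketch also assumes the bullet points are independent of one another given each hypothesis.

```python
def bayes_update(prior, p_e_given_h, p_e_given_not_h):
    """Return P(H|E) from P(H), P(E|H) and P(E|not H) via Bayes' theorem."""
    numerator = p_e_given_h * prior
    return numerator / (numerator + p_e_given_not_h * (1.0 - prior))


# Hypothetical inputs (assumptions for illustration, not the post's numbers).
PRIOR_FORGERY = 0.10
P_EVIDENCE_IF_FORGERY = 0.90
P_EVIDENCE_IF_AUTHENTIC = 0.20


def iterate(prior, n_points):
    """Apply the same update once per bullet point.

    Each posterior is fed back in as the prior for the next update.
    """
    history = []
    p = prior
    for _ in range(n_points):
        p = bayes_update(p, P_EVIDENCE_IF_FORGERY, P_EVIDENCE_IF_AUTHENTIC)
        history.append(p)
    return history


posteriors = iterate(PRIOR_FORGERY, 4)
for i, p in enumerate(posteriors, start=1):
    print(f"after bullet point {i}: P(forgery) = {p:.3f}")
```

With these assumed inputs the probability of forgery passes 0.5 on the second update, exceeds 0.9 on the third, and is about 0.98 on the fourth. Different starting numbers would move the crossover point, but the direction of travel is the same as long as each piece of evidence is more likely under forgery than under authenticity.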

Somebody Figured Out How to Use A Calculator

This is quite similar to what I have noted for solar energy for generating electricity. Imagine covering acres and acres with solar panels. It should be obvious to anybody with a brain that doing something like this would result in a huge environmental impact. Thus, the unpleasant conclusion is that environmentalists have no brains. But let's not get sidetracked here; the topic is biofuels from crops.

One thing that Monbiot missed was fertilizers and pesticides. With a massive increase in farming there would also be a massive increase in the use of fertilizers and pesticides. These, of course, present environmental problems.

The bottom line is that Monbiot is actually right about this topic...for once. Oh well, the blind chicken and the kernel and all that.

The Likelihood Principle and Statistics

I've had a couple of requests to give some idea of how the likelihood principle works when applied to statistics. One manner of implementing the likelihood principle is to use the Bayesian approach.

The Bayesian approach to statistical inference starts with what is called a prior probability. That is, suppose we have two hypotheses, H1 and H2. We assign probabilities to these hypotheses being true, P(H1) and P(H2).

Then an experiment is run or data are gathered (call this evidence E). Then, based on the conditional probabilities P(E|H1) and P(E|H2), the following probabilities can be calculated for i = 1, 2:

P(Hi|E) = P(E|Hi) P(Hi) / [P(E|H1) P(H1) + P(E|H2) P(H2)]

The conditional probability P(E|Hi) is the probability of observing the evidence E, given one of the hypotheses in question. The probability P(Hi|E) is the probability of hypothesis Hi being true given that we observed evidence E.

Now note that the only evidence we take into consideration is the evidence that is observed. Evidence that could have been observed, but was not, is not considered. In this way, Bayesian inference is in agreement with the likelihood principle.

Now, taking advantage of the Theorem of Total Probability (see page 207 of Real Analysis and Probability, by Robert Ash), we can rewrite the denominator P(E|H1) P(H1) + P(E|H2) P(H2) as simply P(E), so that P(Hi|E) = P(Hi) P(E|Hi) / P(E). The ratio

P(E|Hi) / P(E)

measures the impact of the evidence on the hypothesis in question. For example, if it is unlikely to observe E unless the hypothesis is true, then the above ratio will be large. A large ratio will increase the probability P(Hi|E).

To make this somewhat concrete, suppose we have two hypotheses. The first one is that autism is caused by tuberous sclerosis. The second hypothesis is that autism is caused by thimerosal in vaccines. Now, we have to pick a prior to represent our beliefs that either of these hypotheses is true. If we had some reason to suspect one hypothesis of being true over the other, we could give that hypothesis a higher prior.1 As it is, let's start with a prior that assigns the same probability to each hypothesis (i.e., a prior of 0.5).

Now, we also need to know the probability of observing E given each hypothesis. That is, what is the probability of observing a child having autism if the vaccine hypothesis is true, vs. the probability of observing a child with autism due to tuberous sclerosis? Let's assume the probability of observing autism is 0.25 if the tuberous sclerosis hypothesis is true and 0.05 if the vaccine hypothesis is true. Then, using the Bayes theorem calculator, we get the following probabilities if we observe a child with autism:

P(Vac|Autism) = 0.167

P(TS|Autism) = 0.833.

In other words, observing a child with autism lends support to the tuberous sclerosis hypothesis and takes support away from the hypothesis that it is vaccines.
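For anyone who wants to check the arithmetic without the calculator, here is a small Python sketch. The function name and the dictionary layout are my own choices for illustration. Like the example itself, it treats the two hypotheses as the only candidates, so the priors sum to one.

```python
def posteriors(priors, likelihoods):
    """Bayes' theorem over a set of competing hypotheses.

    priors:      {hypothesis: P(H)}
    likelihoods: {hypothesis: P(E|H)}
    returns:     {hypothesis: P(H|E)}
    """
    # Theorem of Total Probability: P(E) is the sum of P(E|H) * P(H).
    p_evidence = sum(likelihoods[h] * priors[h] for h in priors)
    return {h: likelihoods[h] * priors[h] / p_evidence for h in priors}


# Numbers from the example: equal priors, P(Autism|TS) = 0.25, P(Autism|Vac) = 0.05.
priors = {"TS": 0.5, "Vac": 0.5}
likelihoods = {"TS": 0.25, "Vac": 0.05}

result = posteriors(priors, likelihoods)
for hypothesis, p in result.items():
    print(f"posterior for {hypothesis} given autism = {p:.3f}")
```

This gives 0.833 for tuberous sclerosis and 0.167 for vaccines. Because the priors are equal, the posteriors depend only on the relative size of the two likelihoods, 0.25 vs. 0.05.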

We can take these new probabilities (called posteriors) and use them as new prior probabilities when we obtain more data. Thus, the Bayesian approach provides a mechanism that allows one to consider new evidence and its impact on one's beliefs. Frequentist statistics does not really have such a mechanism. Once you do a given study, there is no "updating" mechanism if a new study is done. Given the initial study, there is no way to take the findings of that study and incorporate them into a new study in terms of the statistics.

Thus, the Bayesian method allows for "learning". This is precisely how the Bayesian spam filters on many e-mail programs work. As the user marks more and more e-mails as spam or ham, the filter learns and is able to automatically flag e-mails as spam even before they end up in the inbox.
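To give a flavor of the idea, here is a toy naive Bayes filter in Python. It is a sketch only: the class name TinySpamFilter and its methods are made up for illustration, the training messages are invented, and real e-mail programs are far more careful about tokenizing and weighting words. It treats the words in a message as independent given the label and uses add-one smoothing so a word it has never seen does not produce a probability of zero.

```python
import math
from collections import Counter


class TinySpamFilter:
    """Toy naive Bayes filter that learns word counts from user-labeled messages."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def mark(self, text, label):
        """Record one user decision; label is 'spam' or 'ham'."""
        self.message_counts[label] += 1
        self.word_counts[label].update(text.lower().split())

    def p_spam(self, text):
        """Return P(spam | words), treating words as independent given the label."""
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_messages = sum(self.message_counts.values())
        log_score = {}
        for label in ("spam", "ham"):
            # Prior from how often the user has used each label (add-one smoothed).
            log_p = math.log((self.message_counts[label] + 1) / (total_messages + 2))
            n_words = sum(self.word_counts[label].values())
            for word in text.lower().split():
                # Add-one smoothing: an unseen word never gives probability zero.
                count = self.word_counts[label][word]
                log_p += math.log((count + 1) / (n_words + len(vocab) + 1))
            log_score[label] = log_p
        # Normalize the two scores back into a probability.
        top = max(log_score.values())
        weights = {k: math.exp(v - top) for k, v in log_score.items()}
        return weights["spam"] / (weights["spam"] + weights["ham"])


spam_filter = TinySpamFilter()
before = spam_filter.p_spam("cheap pills now")  # no training yet

# Invented training messages, standing in for the user's clicks.
spam_filter.mark("cheap pills buy now", "spam")
spam_filter.mark("win cheap prizes now", "spam")
spam_filter.mark("lunch meeting moved to noon", "ham")
spam_filter.mark("notes from the budget meeting", "ham")

after = spam_filter.p_spam("cheap pills now")
print(f"before training: {before:.2f}, after training: {after:.2f}")
```

Before any messages are marked the filter has nothing to go on and sits at 0.5. After only four labeled messages it is already well above 0.9 for the spammy test message, which is the "learning" described above: every click is another round of updating.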

As for the autism example, one glitch is the rise in the rates of autism over time. Many point to this event and the rise of vaccines as evidence supporting the vaccine as a cause of autism. While it may be true, the best way to go about this is to look at alternative hypotheses, such as a rise in tuberous sclerosis or the other factors that can play a role in changing the risk factors for autism. Another possibility is the mercury content in fish. While there has been no connection found, I would argue that the same holds for thimerosal as well. This is definitely an area that needs more research, and I would argue that analyzing the data using Bayesian methods would be better than Frequentist methods.

One of the things that puts many people off using Bayesian methods is the amount of work that goes into them. There are quite a few statistical software packages that have Frequentist techniques built into them. It is quite easy for even somebody with little or no understanding of statistics to dump some data into them and crank out results. Bayesian methods, on the other hand, require quite a bit of thought from the researcher prior to obtaining any results. That is, there is no mindless number crunching.

1 Some might object to the term belief, but beliefs are what we all have when it comes to scientific hypotheses. We believe one hypothesis is true vs. the other. Also, the idea of picking prior probabilities leaves some doubtful of the efficacy of Bayesian methods, but even a high initial prior probability will be swamped if the evidence continually favors the other hypothesis.

An Angry Little Man

I was reading InstaPundit the other day and eventually wound up at this post by Brian Leiter. Immediately I was reminded of a quote that arguments in academia are so vitriolic because they are over issues that are so small. In reading Leiter I thought that the basic idea of the quote could be expanded: the most vitriolic academics tend to be vitriolic because they are largely irrelevant. Let's use Brian Leiter as an example. Is he important? Not at all. If he were to fall off the face of the earth tomorrow, the set of people who would care would be of measure zero. Is his work important? No again. If the sum total of everything that Leiter has written were to disappear, would anybody notice? Again a set of measure zero. So here is a guy who is pretty darn smart who simply does not matter. Do you think it annoys Leiter that "moral cretins and self-important poseurs" who are "full of shit" get more attention? I'm sure it does. I see his vitriolic posts as a sort of "look at me!" kind of post.

Leiter excoriates millions of people simply because they have not vocally passed judgment on Michelle Malkin's book (you can read a critique of Malkin's book here, and for the record Muller seems to have a strong argument that Malkin is greatly overstating her case). The thing is, bloggers comment on what is important to them. Is the fact that Dan Rather used forged documents important to me? Yeah. Is Michelle Malkin's book important to me? No. I read Michelle Malkin's blog on occasion, but over the years I've found her writing to be...inaccurate at times (particularly her coverage of the CA energy crisis). Muller has done an admirable job digging into this, and it isn't like it is totally ignored by the blogosphere. I first got to Muller's work via InstaIgnorance and I saw a reference to Muller's work over at Oliver Willis' blog. I'm sure if I looked elsewhere I'd find many other references. And I think Gordon Smith summarizes the point nicely as to why Malkin's book isn't an issue. Generally speaking, people aren't going to read it and say, "Oh...wow, it was right to round up all the ethnic Japanese, deprive them of their property and toss them into concentration camps all based simply on their heritage vs. their actual guilt. Let's do it to all Muslims!" The public perception is overwhelmingly that internment was a black spot on America's record.

So the bottom line here is Leiter is wrong. And he knows he is wrong and that is why he is so angry. That and the Liberal establishment is taking a beating right now. Kerry's campaign is in trouble. Democrats are completely in the minority in all three branches of government. So in response we get this kind of nasty screed that attacks everybody for not thinking exactly like Brian Leiter.

Now, let's get into looking at some of Brian Leiter's own misrepresentations. First we have this example.

Looks like Prof. Leiter is guilty of being a "moral cretin and self-important poseur". The bottom line is if you are really sick in the U.S. you will be taken care of. You might die, but not because you did not receive medical care. Leiter is repeating a favorite Lefty lie. One that should be obviously false. This shows that Brian Leiter is an intellectual moron. He is an intellectual moron because he didn't think clearly about this.

Next up we have this piece of dreck. Here Brian Leiter attempts to refute Arnold Kling and laughably engages in the same mistake he excoriates Kling for.

Whoa. I didn't know that Kling's article was actually meant to be a peer-reviewed academic article. But laughably, we have this tripe from Leiter,

But where is Prof. Leiter's evidence? Nowhere in sight. What kind of a dimbulb excoriates another for not providing evidence then fails to provide evidence himself? Brian Leiter, that kind of dimbulb.

For the record, hours worked per week have been declining for decades, Prof. Leiter. One has to wonder whether the University of Michigan and the University of Texas are "not happy about this". I mean, do they usually like it when their professors run around advertising their ignorance in public like this?

Further, the point that the price went down that Leiter raises supports Kling's assertion. When the price of something goes down, people consume more of it (and more of other goods as well) which tends to make them better off. Basic economic theory.

One of the updates is hilarious.

Again we see Leiter making the same mistake that he snottily accuses Arnold Kling of making. First, he provides no data on the other issues, let alone color televisions. Second, while there is more to happiness than color TVs, if the price of color TVs has gone down then that bit of evidence supports Arnold's position and weakens Leiter's position.

So we can see that Brian Leiter is very much like Michelle Malkin in that he only presents that evidence (when he presents evidence at all) that suits his world view. Personally I find it rather funny.

It shows no sign of changin

More evidence that the Democrats are becoming desperate? I don't know, but that is what it looks like to me. A drowning man clinging to anything he thinks will help keep him afloat just a bit longer.

This conspiracy thing is a bit much. I mean, come on: the documents are such obvious forgeries. Did they really expect that CBS would be so sloppy with their vetting of the documents? A better job of authenticating the documents would have completely killed this story. In addition, let's not forget that the Kerry campaign is part of this story. Ben Barnes, after all, is a big fund raiser for the Democrats. So this idea that the documents are a Rove dirty trick is just not credible.

On a side note, I love how this Rovian Trick has completely exonerated CBS in the eyes of some on the Left. CBS was tricked therefore they are not to blame. These morons seem to have forgotten it is CBS' job not to be tricked--i.e., they are to make sure what they are reporting on is legit.

And on the issue of the prices at these electronic markets being interpreted as probabilities, Prof. Bainbridge points to this Marginal Revolution post. The bottom line is that interpreting the prices as probabilities may be a bit more problematic based on this research by Charles Manski.

I'm not sure that Manski is completely correct in his interpretation of his result. The idea is that we have a group of people who come up with subjective probability assessments of an event. Then they place bets based on their probability assessments either for or against. All we see is the market clearing price. Now, if I am reading Manski right (and I admit I may not be), the price that is settled on is in the initial distribution of probabilities. Hence it is still a probability, but based on Manski's numbers it isn't a "statistic" such as the mean, median, etc. Hence interpreting that probability in a meaningful way is problematic, IMO. Now that in and of itself is an interesting result.

Anyhow, Prof. Bainbridge has shot off a note to Alex Tabarrok asking for his input. While I have little doubt I just bored many readers with this part of the update, I happen to find this stuff rather interesting.