Bayes factor

In statistics, the use of Bayes factors is a Bayesian alternative to classical hypothesis testing.
Given a model selection problem in which we have to choose between two models M_{1} and M_{2} on the basis of a data vector x, the Bayes factor K is given by
 <math>K = \frac{p(x|M_1)}{p(x|M_2)}.</math>
This is similar to a likelihood-ratio test, but instead of maximising the likelihood, Bayesians average it over the parameters. Generally, the models M_{1} and M_{2} will be parametrised by vectors of parameters θ_{1} and θ_{2}; thus K is given by
 <math>K = \frac{p(x|M_1)}{p(x|M_2)} = \frac{\int p(\theta_1|M_1)\,p(x|\theta_1, M_1)\,d\theta_1}{\int p(\theta_2|M_2)\,p(x|\theta_2, M_2)\,d\theta_2}.</math>
A value of K > 1 means that M_{1} is more strongly supported by the data than M_{2}, and vice versa. Note that classical hypothesis testing gives one hypothesis (or model) preferred status (the 'null hypothesis'), and only considers evidence against it. Harold Jeffreys gave a scale for the interpretation of K:
 K         | Strength of evidence
 < 1       | Negative (supports M_{2})
 1 to 3    | Barely worth mentioning
 3 to 12   | Positive
 12 to 150 | Strong
 > 150     | Very strong
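Jeffreys' scale above can be expressed as a small lookup helper. This is an illustrative sketch only; the function name and category strings are chosen here for clarity, not taken from any library:

```python
def jeffreys_strength(K):
    """Map a Bayes factor K to Jeffreys' qualitative scale (illustrative helper)."""
    if K < 1:
        return "Negative (supports M2)"
    elif K < 3:
        return "Barely worth mentioning"
    elif K < 12:
        return "Positive"
    elif K < 150:
        return "Strong"
    else:
        return "Very strong"
```

For instance, `jeffreys_strength(0.5)` falls in the "Negative" band, while `jeffreys_strength(200)` is "Very strong".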
Many Bayesian statisticians would use a Bayes factor as part of making a choice, but would also combine it with their estimates of the prior probability of each of the models and the loss functions associated with making the wrong choice.
Example
Suppose we have a random variable which produces either a success or a failure. We want to consider a model M_{1} where the probability of success is q=½, and another model M_{2} where q is completely unknown and we take a prior distribution for q which is uniform on [0,1]. We take a sample of 200, and find 115 successes and 85 failures. The likelihood is:
 <math>{200 \choose 115}q^{115}(1-q)^{85}</math>
So we have
 <math>P(X=115|M_1)={200 \choose 115}\left({1 \over 2}\right)^{200}=0.00595...</math>
but
 <math>P(X=115|M_2)=\int_{q=0}^1 {200 \choose 115}q^{115}(1-q)^{85}\,dq = {1 \over 201} = 0.00497...</math>
The ratio is then 1.197..., which is "barely worth mentioning" even if it points very slightly towards M_{1}.
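These quantities can be checked with a short calculation using only the Python standard library. The exact value 1/(n+1) for the marginal likelihood under M_{2} follows from the Beta integral; the variable names are chosen here for illustration:

```python
from math import comb

n, successes = 200, 115

# P(X = 115 | M1): binomial likelihood at the fixed value q = 1/2
p_m1 = comb(n, successes) * 0.5 ** n

# P(X = 115 | M2): the likelihood averaged over a uniform prior on q.
# The Beta integral gives int_0^1 C(n,k) q^k (1-q)^(n-k) dq = 1/(n+1).
p_m2 = 1 / (n + 1)

# The Bayes factor in favour of M1
K = p_m1 / p_m2
print(p_m1, p_m2, K)  # approximately 0.00595, 0.00497, 1.197
```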
This is not the same as a classical likelihood-ratio test, which would have found the maximum likelihood estimate for q, namely ^{115}⁄_{200}=0.575, and from that get a ratio of 0.1045..., so pointing towards M_{2}. A frequentist hypothesis test would have produced an even more dramatic result, saying that M_{1} could be rejected at the 5% significance level, since the probability of getting 115 or more successes from a sample of 200 if q=½ is 0.0200..., and, as a two-tailed test, the probability of getting a figure as extreme as or more extreme than 115 is 0.0400... Note that 115 is more than two standard deviations away from 100.
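The classical comparisons in this paragraph can likewise be reproduced with the standard library. This is a sketch; the helper `binom_pmf` and the variable names are introduced here for illustration:

```python
from math import comb

n, k = 200, 115

def binom_pmf(n, k, q):
    """Probability of exactly k successes in n trials with success probability q."""
    return comb(n, k) * q ** k * (1 - q) ** (n - k)

# Likelihood ratio: the null q = 1/2 versus the maximum likelihood estimate
q_mle = k / n  # 0.575
lr = binom_pmf(n, k, 0.5) / binom_pmf(n, k, q_mle)  # about 0.1045

# One-tailed p-value: probability of 115 or more successes when q = 1/2
p_one = sum(binom_pmf(n, i, 0.5) for i in range(k, n + 1))  # about 0.0200

# Two-tailed p-value, using the symmetry of the q = 1/2 binomial about 100
p_two = 2 * p_one  # about 0.0400
```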
M_{2} is a more complex model than M_{1} because it has a free parameter which allows it to model the data more closely. The ability of Bayes factors to take this into account is a reason why Bayesian inference has been put forward as a theoretical justification for and generalisation of Occam's razor, reducing Type I errors.
See also