Simple Interactive Statistical Analysis

Binomial

Input.

The mean or average Expected box accepts any positive (decimal) value larger than one. Alternatively, you can give the expected proportion, which should be a real number between '0' (zero) and '1' (one). The number Observed box must contain an integer, that is, a whole positive number without decimals. The same holds for the Sample size box, which must also contain an integer. For the zero truncated binomial, check the box.

Explanation.

The binomial distribution is probably the best known of the discrete distributions, and a discussion of its properties can be found in most introductory statistics books (Blalock HM, 1960; Wonnacott TH, Wonnacott RJ, 1977). The binomial distribution gives you the likelihood of finding 'x' failures (or white or female or large or tails or accidents, only your imagination limits you), as opposed to successes (or black or male or small or heads or cars coming by which didn't have an accident). Your findings are the result of having done 'n' experiments, having made 'n' observations, or having studied a sample of size 'n'. You expected that an average of 'u' in your sample would have been failures, for example 'white' or 'large'. 'x', the number observed as positive, is what changes in the output box. 'u', the expected number or proportion of occurrences, is given in the top box, and 'n', the sample size or number of experiments, is given in the third box.
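As a minimal sketch of the quantities described above, the point probability of each possible 'x' can be computed directly from the binomial formula. The function name `binom_pmf` and the example values (n = 20, expected proportion 0.3) are illustrative assumptions and are not taken from the SISA program.

```python
from math import comb

def binom_pmf(x, n, p):
    """Probability of observing exactly x occurrences in n trials
    when each trial has probability p of an occurrence."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Hypothetical example: sample size n = 20, expected proportion p = 0.3,
# so the expected mean count is u = n * p = 6.
n, p = 20, 0.3
dist = [binom_pmf(x, n, p) for x in range(n + 1)]

for x, prob in enumerate(dist):
    print(f"x = {x:2d}  P(X = x) = {prob:.6f}")
```

The expected mean and the expected proportion are two ways of stating the same thing here: the mean is the proportion multiplied by the sample size.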

Interpret the double and the single sided exact tests in the summary as follows. The double sided significance test follows the method of small p-values. With the notation >= it gives the exact probability of the difference between the expected and the observed value or any larger difference, considering the location of the expected and the observed value. With the notation > it gives the probability of getting a larger difference than the one observed between the observed and the expected value. The single sided test with notation >= gives the exact probability of getting the value observed or any larger value, considering the expected value. Similarly, the notation <= gives the probability of getting the observed or any smaller value; the notation > gives the probability of getting larger values than the observed value; and the notation < gives the probability of getting smaller values than the observed value. Some exact statistical analysts take the smaller of the two one sided probabilities which include the point probability (<= and >=) as "the" one sided probability and calculate the exact two sided probability as double this value. This approach has the advantage that the one sided probability no longer has a direction (and the two sided probability is based on this). However, although the result is mostly pretty close, this procedure is conservative and does not seem to have an exact basis. SISA does not recommend this procedure.
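The probabilities described above can be sketched as follows. This is an illustration under stated assumptions, not the SISA implementation: the function name `exact_tests`, the dictionary keys, and the small relative tolerance used to compare floating point probabilities are all hypothetical choices. The two sided values use the method of small p-values, summing the probabilities of all outcomes that are no more likely (>=), or strictly less likely (>), than the observed outcome.

```python
from math import comb

def binom_pmf(x, n, p):
    """Point probability of exactly x occurrences in n trials."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def exact_tests(x, n, p, rel_tol=1e-7):
    """Exact one and two sided binomial probabilities for observed x."""
    pmf = [binom_pmf(k, n, p) for k in range(n + 1)]
    point = pmf[x]
    le = sum(pmf[:x + 1])   # P(X <= x)
    ge = sum(pmf[x:])       # P(X >= x)
    return {
        "point": point,
        "<=": le,
        ">=": ge,
        "<": le - point,    # P(X < x)
        ">": ge - point,    # P(X > x)
        # Method of small p-values: outcomes at most as likely as x.
        "two_sided_>=": sum(q for q in pmf if q <= point * (1 + rel_tol)),
        # Only outcomes strictly less likely than x.
        "two_sided_>": sum(q for q in pmf if q < point * (1 - rel_tol)),
        # Doubling the smaller one sided value (not recommended above).
        "doubled": min(1.0, 2 * min(le, ge)),
    }

# Hypothetical example: 3 observed in a sample of 10, expected proportion 0.5.
result = exact_tests(3, 10, 0.5)
for name, value in result.items():
    print(f"{name:>13}: {value:.6f}")
```

With a symmetric expectation of 0.5 the doubled value and the small p-values result coincide; with other expected proportions the two can differ.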

The normal distribution, or z-distribution, is often used to approximate the binomial distribution, and results for this approximation are also presented. However, if the sample size is very large, the Poisson distribution is a philosophically more correct alternative to the binomial distribution than the normal distribution. One of the main differences between the Poisson distribution and the binomial distribution is that in using the binomial distribution all eligible phenomena are studied, whereas in the Poisson only the cases with a particular outcome are studied. For example: in the binomial test all cars are studied to see whether they have had an accident or not, whereas in the Poisson test only the cars which have had an accident are studied.
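To see how close the Poisson distribution comes to the binomial when the sample is large and the event is rare, the two cumulative probabilities can be compared directly. This is a minimal sketch; the function names and the example values (1000 cars, an expected accident proportion of 0.002) are assumptions chosen for illustration only.

```python
from math import comb, exp, factorial

def binom_cdf(x, n, p):
    """P(X <= x) for a binomial count with n trials and proportion p."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

def poisson_cdf(x, mu):
    """P(X <= x) for a Poisson count with mean mu."""
    return sum(exp(-mu) * mu**k / factorial(k) for k in range(x + 1))

# Hypothetical example: n = 1000 cars, expected proportion 0.002,
# so the expected number of accidents is mu = n * p = 2.
n, p = 1000, 0.002
mu = n * p
for x in range(6):
    b = binom_cdf(x, n, p)
    q = poisson_cdf(x, mu)
    print(f"x <= {x}: binomial {b:.6f}  Poisson {q:.6f}  diff {b - q:+.6f}")
```

Note that the Poisson calculation needs only the expected count, not the sample size, which matches the idea that only the cases with the particular outcome are studied.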

If you know the number of outcomes and the number of expected outcomes, and you would like to determine how likely it is that a particular sample size, or a particular number of experiments, would have produced this result, you can use the Negative Binomial (Version 2). In the case of samples drawn without replacement from small populations the hypergeometric distribution should be used. The binomial test assumes that the expectation is error free, i.e. that it is a value known with certainty. Often it will be a theoretical or a population value. If the expected value is not error free, it is better to construct a two by two table and to do a Fisher exact test, or to use a Chi-square test or t-test as an approximation. These tests are implemented on the SISA website and in the SISA-Tables module for Windows.

A number of items based on the normal approximation of the binomial distribution are also presented. It is then possible to construct a confidence interval around the difference in proportions, or numbers, which is tested. If the confidence interval is narrow, the size of the difference is estimated precisely; if the confidence interval is wide, we are not very sure about its size. If the value zero lies between the lower and the upper value of the confidence interval, this means that the difference between the two numbers is not statistically significant. The normal deviation, the standardized difference, can be used to estimate p-values as a direct alternative to using a binomial procedure. However, remember that exact tests are superior to normal approximations.
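The normal approximation described above might be sketched as follows. The exact formulas SISA uses are not stated on this page, so the choices here are assumptions: the standardized difference uses the standard error under the expected proportion, and the confidence interval uses the standard error under the observed proportion (a Wald type interval). The function name `normal_approx` and the example values are hypothetical.

```python
from math import erfc, sqrt

def normal_approx(x, n, p, z_crit=1.959964):
    """Normal approximation for observed count x, sample size n and
    expected proportion p. Returns (z, two sided p-value, CI of difference)."""
    p_hat = x / n                          # observed proportion
    diff = p_hat - p                       # difference being tested
    se_null = sqrt(p * (1 - p) / n)        # assumption: SE under the expectation
    z = diff / se_null                     # standardized difference
    p_two_sided = erfc(abs(z) / sqrt(2))   # area in both normal tails
    se_obs = sqrt(p_hat * (1 - p_hat) / n) # assumption: SE under the observation
    ci = (diff - z_crit * se_obs, diff + z_crit * se_obs)
    return z, p_two_sided, ci

# Hypothetical example: 60 observed in a sample of 100, expected proportion 0.5.
z, p_value, (lower, upper) = normal_approx(60, 100, 0.5)
print(f"z = {z:.3f}, two sided p = {p_value:.4f}")
print(f"95% CI of the difference: {lower:.4f} to {upper:.4f}")
print("zero inside interval:", lower <= 0.0 <= upper)
```

The default `z_crit` corresponds to a 95% interval; another critical value can be passed for a different confidence level.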

Zero Truncated Binomial.

The zero truncated Binomial distribution concerns a Binomial distribution without zeros. For example, suppose you want to know whether the number of different items people buy in a shop follows a Binomial distribution. When you check this at the cash register, you will not see any people with zero items. In this example the sample size is the upper limit of the number of different items people can choose from.

Note that in the zero truncated binomial distribution the relationship between the proportion and the distribution mean is different from, and considerably more complex than, that in the usual Binomial procedure. If you input the mean expected count (the average number of different items you expect in people's shopping baskets), the program will echo the expected proportion, the variance and part of the distribution. The mean should have a value above one. The mean can also be entered as a proportion, a value between zero and one, for example if you want to enter the proportion of available items people have on average in their shopping basket. The program will then echo the mean and the rest. The proportion should have a value above zero.
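The relationship between the proportion and the mean can be illustrated with the standard zero truncated binomial formulas: each probability is the ordinary binomial probability divided by 1 - (1 - p)^n, so the mean becomes n * p / (1 - (1 - p)^n). The sketch below solves for the proportion from a given mean by bisection. How SISA performs this step is not stated here, so the bisection, the function names and the example values are assumptions.

```python
from math import comb, expm1, log1p

def zt_norm(p, n):
    """1 - (1 - p)**n, the probability of a non-zero count (0 < p < 1)."""
    return -expm1(n * log1p(-p))

def zt_pmf(k, n, p):
    """Zero truncated binomial probability of k (k = 1..n)."""
    return comb(n, k) * p**k * (1 - p)**(n - k) / zt_norm(p, n)

def zt_mean(p, n):
    """Mean of the zero truncated binomial distribution."""
    return n * p / zt_norm(p, n)

def proportion_from_mean(mean, n, tol=1e-12):
    """Find p such that zt_mean(p, n) equals mean, by bisection.
    The mean must lie strictly between 1 and n."""
    if not 1 < mean < n:
        raise ValueError("mean must be above one and below the sample size")
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if zt_mean(mid, n) < mean:   # zt_mean increases with p
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Hypothetical example: 10 different items available, mean basket of 3 items.
n, mean = 10, 3.0
p = proportion_from_mean(mean, n)
variance = sum(k * k * zt_pmf(k, n, p) for k in range(1, n + 1)) - mean**2
print(f"proportion = {p:.6f}, variance = {variance:.6f}")
```

As the proportion approaches zero the truncated mean approaches one rather than zero, which is why a mean of one or less cannot be matched.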

Please study the binomial distribution further by using the Binomial spreadsheet.

All software and text copyright by SISA