**Simple Interactive Statistical Analysis**

Input.

Input for the mean or average __Expected__ box can be any positive (decimal) value. One can also give the expected proportion, which should be a real number between '0' (zero) and '1' (one).
Input for the number __Observed__ box must be an integer value: a positive whole number without decimals. The same holds for the __Sample size__ box, which must also be an integer value.

Invert. Use this option to find the expectation which produces a certain cumulative probability value, given the observed number and the total number of cases. Give the probability value, a value between 0 and 1, in the top box. Put the observed number and the total number of cases in the appropriate boxes; both must be integers.
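
The inversion described above can be sketched as a simple bisection search. This is only an illustration, not SISA's own routine: the function names `binom_cdf` and `invert_expectation` are made up here, and the sketch assumes that the cumulative probability meant is P(X <= observed) and that the observed number is smaller than the total number of cases.

```python
from math import comb

def binom_cdf(x: int, n: int, u: float) -> float:
    """P(X <= x) for n cases and expected proportion u."""
    return sum(comb(n, k) * u**k * (1 - u)**(n - k) for k in range(x + 1))

def invert_expectation(prob: float, x_obs: int, n: int, tol: float = 1e-10) -> float:
    """Find the expected proportion u for which P(X <= x_obs) equals prob.

    Assumes 0 < prob < 1 and x_obs < n; P(X <= x_obs) then falls
    steadily from 1 to 0 as u goes from 0 to 1, so bisection works.
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if binom_cdf(x_obs, n, mid) > prob:
            lo = mid   # cumulative probability still too high: u must be larger
        else:
            hi = mid   # cumulative probability too low: u must be smaller
    return (lo + hi) / 2

# Hypothetical input: probability 0.025, 3 observed, 20 cases in total.
u = invert_expectation(0.025, 3, 20)
print(f"expected proportion = {u:.4f}, expected number = {u * 20:.2f}")
```

Multiplying the proportion found by the total number of cases gives the expectation as a number rather than a proportion.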

Explanation.

The binomial distribution is probably the best known of the discrete distributions, and a discussion of its properties can be found in most introductory statistics books (Blalock HM, 1960; Wonnacott TH, Wonnacott RJ, 1977). The binomial distribution gives you the likelihood of finding 'x' failures (or white or female or large or tails or accidents, only your imagination limits you), as opposed to successes (or black or male or small or heads or cars coming by which didn't have an accident). Your findings are the result of having done 'n' experiments, having made 'n' observations, or having studied a sample of size 'n'. You expected to find that an average of 'u' in your sample would have been failures, for example 'white' or 'large'. 'x', the number observed as positive, is what changes in the output box. 'u', the expected proportion of occurrences, is given in the top box, and 'n', the sample size or number of experiments, is given in the third box.
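
As an illustration of the quantities 'x', 'n' and 'u' described above, the point probability of the binomial distribution can be computed as follows. This is a minimal sketch; the function name `binom_pmf` and the example values are assumptions chosen for illustration.

```python
from math import comb

def binom_pmf(x: int, n: int, u: float) -> float:
    """Probability of exactly x occurrences in n observations,
    when the expected proportion of occurrences is u."""
    return comb(n, x) * u**x * (1 - u)**(n - x)

# Hypothetical example: n = 10 observations, expected proportion u = 0.3.
n, u = 10, 0.3
probs = [binom_pmf(x, n, u) for x in range(n + 1)]
for x, p in enumerate(probs):
    print(f"x = {x:2d}   P(X = x) = {p:.4f}")
```

The probabilities over all possible values of 'x', from 0 to 'n', add up to one.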

Interpret the double and the single sided exact tests in the summary as follows. The double sided significance test, according to the method of small p-values and with the notation >=, gives the exact probability of the difference between the expected and the observed value or any larger difference, considering the location of the expected and the observed value. The notation > relates to the probability of getting a larger difference than the observed difference between the observed and the expected value. The single sided test with notation >= gives the exact probability of getting the value observed or any larger value, considering the expected value. Similarly, the notation <= gives the probability of getting the observed or any smaller value; the notation > gives the probability of getting larger values than the observed value; the notation < gives the probability of getting smaller values than the observed value.

Some exact statistical analysts will take the smaller of the two one sided probabilities which include the point probability (<= and >=) as "the" one sided probability and calculate the exact two sided probability as double the value of this one sided probability. This approach has the advantage that the one sided probability no longer has a direction (and the two sided probability is based on this). However, although the result is mostly pretty close, this procedure is conservative and does not seem to have an exact basis; SISA does not recommend this procedure.
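
The probabilities discussed above can be sketched in a few lines. This is an illustration under assumptions, not SISA's own code: the function `exact_tests` and its dictionary keys are hypothetical names, and the method of small p-values is implemented here in its usual form, as the sum of the probabilities of all outcomes that are no more likely than the observed one.

```python
from math import comb

def binom_pmf(x: int, n: int, u: float) -> float:
    return comb(n, x) * u**x * (1 - u)**(n - x)

def exact_tests(x_obs: int, n: int, u: float) -> dict:
    pmf = [binom_pmf(k, n, u) for k in range(n + 1)]
    p_obs = pmf[x_obs]                 # point probability of the observed value
    le = sum(pmf[: x_obs + 1])         # notation <= : observed or smaller
    ge = sum(pmf[x_obs:])              # notation >= : observed or larger
    # Method of small p-values: add up every outcome that is at most as
    # likely as the observed one (small tolerance for rounding error).
    two_sided = sum(p for p in pmf if p <= p_obs * (1 + 1e-7))
    # The doubling procedure, shown only for comparison.
    doubled = min(1.0, 2 * min(le, ge))
    return {"<=": le, "<": le - p_obs, ">=": ge, ">": ge - p_obs,
            "two_sided": two_sided, "doubled": doubled}

# Hypothetical example: 3 observed in a sample of 20, expected proportion 0.3.
for name, value in exact_tests(3, 20, 0.3).items():
    print(f"{name:>9}: {value:.5f}")
```

When the expected proportion is 0.5 the distribution is symmetric and the two approaches to the two sided probability coincide; they can differ when the distribution is skewed.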

The normal distribution, or z-distribution, is often used to approximate the binomial distribution, and results for this statistic are also presented. However, if the sample size is very large, the Poisson distribution is a philosophically more correct alternative to the binomial distribution than the normal distribution. One of the main differences between the Poisson distribution and the binomial distribution is that in using the binomial distribution all eligible phenomena are studied, whereas in the Poisson only the cases with a particular outcome are studied. For example: in the binomial test all cars are studied to see whether they have had an accident or not, whereas in the Poisson test only the cars which have had an accident are studied.
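
To see how the two approximations behave, the sketch below compares the exact binomial probability of 'observed or fewer' with its Poisson and normal counterparts for a large sample with a rare outcome. The numbers are hypothetical, the function names are made up for this example, and the normal approximation is used without continuity correction.

```python
from math import comb, erf, exp, factorial, sqrt

def binom_cdf(x: int, n: int, u: float) -> float:
    """Exact P(X <= x) under the binomial distribution."""
    return sum(comb(n, k) * u**k * (1 - u)**(n - k) for k in range(x + 1))

def poisson_cdf(x: int, mean: float) -> float:
    """P(X <= x) under a Poisson distribution with the given mean."""
    return sum(exp(-mean) * mean**k / factorial(k) for k in range(x + 1))

def normal_cdf(z: float) -> float:
    """Cumulative probability of the standard normal distribution."""
    return 0.5 * (1 + erf(z / sqrt(2)))

# Hypothetical example: 1000 cars, expected accident proportion 0.005,
# 2 accidents observed.
n, u, x = 1000, 0.005, 2
mean = n * u                      # expected number of accidents
sd = sqrt(n * u * (1 - u))        # binomial standard deviation

exact = binom_cdf(x, n, u)
pois = poisson_cdf(x, mean)
norm = normal_cdf((x - mean) / sd)

print(f"binomial: {exact:.5f}")
print(f"Poisson : {pois:.5f}")
print(f"normal  : {norm:.5f}")
```

With a large sample and a small expected proportion, as in this example, the Poisson value lies much closer to the exact binomial value than the normal value does.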

If you know the number of outcomes and the number of expected outcomes, and you would like to determine how likely it is that a particular sample size, or a particular number of experiments, would have produced this result, you can use the Negative Binomial (Version 2). In the case of samples drawn without replacement from small populations the hypergeometric distribution should be used. The binomial test assumes that the expectation is error free, i.e. that it is a value known with certainty. Often it will be a theoretical or a population value. If the expected value is not error free it is better to construct a two by two table and to do a Fisher exact test, or to use a Chi-square test or t-test as an approximation. These tests are implemented on the SISA website and in the SISA-Tables module for Windows.
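
The difference between sampling with and without replacement from a small population can be illustrated as follows. This is a sketch with hypothetical numbers; `hypergeom_pmf` and `binom_pmf` are names chosen for this example.

```python
from math import comb

def hypergeom_pmf(x: int, pop_size: int, pop_hits: int, n: int) -> float:
    """Probability of x occurrences in a sample of n drawn WITHOUT
    replacement from a population of pop_size containing pop_hits occurrences."""
    return comb(pop_hits, x) * comb(pop_size - pop_hits, n - x) / comb(pop_size, n)

def binom_pmf(x: int, n: int, u: float) -> float:
    """Probability of x occurrences in n draws WITH replacement."""
    return comb(n, x) * u**x * (1 - u)**(n - x)

# Hypothetical example: a population of 20 with 6 occurrences, sample of 5.
pop_size, pop_hits, n = 20, 6, 5
u = pop_hits / pop_size
for x in range(n + 1):
    h = hypergeom_pmf(x, pop_size, pop_hits, n)
    b = binom_pmf(x, n, u)
    print(f"x = {x}   hypergeometric = {h:.4f}   binomial = {b:.4f}")
```

The two sets of probabilities move closer together as the population becomes large relative to the sample.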

A number of items based on the normal approximation of the binomial distribution are also presented. It is then possible to construct a confidence interval around the difference in proportions, or numbers, which is tested. If the confidence interval is narrow, the size of the difference has been estimated precisely; if the confidence interval is wide, we are not very sure about its size. If the value zero lies between the lower and the upper value of the confidence interval, the difference between the two numbers is not statistically significant. The normal deviation, the standardized difference, can be used to estimate p-values as a direct alternative to using a binomial procedure. However, remember that exact tests are superior to normal approximations.
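
A sketch of these normal approximation items is given below. The exact formulas SISA uses are not stated here, so this example makes two assumptions: the standardized difference uses the standard error under the expected proportion, and the confidence interval uses the standard error of the observed proportion (a Wald interval). The function name `normal_approx` is hypothetical.

```python
from math import erf, sqrt

def normal_cdf(z: float) -> float:
    """Cumulative probability of the standard normal distribution."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def normal_approx(x_obs: int, n: int, u: float, z_crit: float = 1.96):
    """Return (z, two sided p-value, confidence interval for the difference)."""
    p_hat = x_obs / n                       # observed proportion
    diff = p_hat - u                        # observed minus expected proportion
    se_expected = sqrt(u * (1 - u) / n)     # assumption: SE under the expectation
    z = diff / se_expected                  # standardized difference
    p_two_sided = 2 * (1 - normal_cdf(abs(z)))
    se_observed = sqrt(p_hat * (1 - p_hat) / n)   # assumption: Wald SE
    ci = (diff - z_crit * se_observed, diff + z_crit * se_observed)
    return z, p_two_sided, ci

# Hypothetical example: 60 observed in a sample of 100, expected proportion 0.5.
z, p, (lower, upper) = normal_approx(60, 100, 0.5)
print(f"z = {z:.3f}, two sided p = {p:.4f}")
print(f"95% interval for the difference: {lower:.4f} to {upper:.4f}")
```

In this example the interval does not contain zero, which agrees with a two sided p-value below 0.05.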

Please study the binomial distribution further by using the Binomial spreadsheet.