
Statistical Power

Input.

In the usual two sample size calculation: A) one can give proportions in the top two boxes, positive numbers between 0 and 1, in which case one would mostly not give a standard deviation; or B) one can give averages or means, any positive number, in which case one would mostly also give standard deviations. After having done one of the two, give the number of cases of one of the two groups and the allocation ratio. The allocation ratio will be 'one' in most cases.

One sample analysis concerns testing an observed sample mean against an expected -invariant- population or historical mean: A) one can give proportions in the top two boxes, positive numbers between 0 and 1, in which case usually no standard deviation is provided; or B) one can give averages or means, any positive number, in which case a standard deviation is usually provided. Input the invariant historical mean in the top 'exp' box and the sample mean in the 'obs' box. After having done all that, give the number of cases of the sample. The allocation ratio is irrelevant in one sample analysis.

For equality analysis give at least one mean or average, the mean for the current situation, in the top box. If required give a second mean in the second box and a tolerance level in the third box. In case the mean is not a proportion, a standard deviation will normally be given in the fourth box. After having done all that give the number of cases of one of the two groups and the allocation ratio. The allocation ratio will be 'one' in most cases.

Pairwise sample size concerns two measurements on the same subjects. For proportions, give the two proportions which are on the diagonal of changers in a two by two table in the top two boxes. Give the number of cases in the number of cases box. The power for the McNemar is calculated. For means or averages, give the expected mean difference in the top box and the standard deviation of this mean difference in the third (Std.dev 1) box. Give the number of cases in the number of cases box. The power for the pairwise t-test is calculated.

For calculating the power of a difference between correlations go here.

Don't forget to input the number of cases!

Discussion

This procedure provides a basic method of calculating the statistical power of a test given a certain number of cases. Basically, what the program does is give you the probability of finding the difference between mean one and mean two statistically significant, given a certain sample size and standard deviation. Standard deviations are not required if the means are proportions, decimal numbers between 'zero' and 'one'. The procedures implemented here are only applicable to simple random samples and to the comparison of two means.
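The idea of "the probability of finding the difference statistically significant" can be illustrated with a textbook normal approximation. The sketch below is not the program's own formula: it assumes two equally sized groups, a common standard deviation, double sided testing and no continuity correction, and the function name `power_two_means` is made up for this illustration.

```python
from math import sqrt
from statistics import NormalDist


def power_two_means(mean1, mean2, sd, n_per_group, alpha=0.05):
    """Approximate double sided power for a difference between two means.

    Assumptions (not taken from the program): equal group sizes, one common
    standard deviation, normal approximation, no continuity correction.
    """
    z = NormalDist()
    # Standard error of the difference between the two sample means.
    se = sd * sqrt(2.0 / n_per_group)
    # Critical value for double sided testing at level alpha.
    z_crit = z.inv_cdf(1 - alpha / 2)
    # The tiny chance of rejecting in the wrong direction is ignored.
    return z.cdf(abs(mean1 - mean2) / se - z_crit)


if __name__ == "__main__":
    print(round(power_two_means(10, 12, 4, 64), 3))
```

With means of 10 and 12, a standard deviation of 4 and 64 cases per group, this approximation gives a power of roughly 80%.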

Input is made as simple as possible. However, there is some flexibility which allows for special situations, based on the input consisting of proportions (A) or means (B). A) The program recognizes a proportion by there being a number between 'zero' and 'one' in the first box and a standard deviation of 'zero' in the third box, in which case the program expects to also find a proportion in the second box. A specific proportion related method will be used to estimate the statistical power. B) In the case of a value of 'one' or more in the first box a method will be used which considers that the values given are means. There are the following possibilities to give standard deviations related to the means: B.1) if a standard deviation is given in the first standard deviation box only, it is considered the correct standard deviation to do the power calculation; B.2) if two standard deviations are given the program uses these to calculate and apply the standard deviation of the difference between the two means, which will lead to a safe (lower) calculation of the statistical power; B.3) if no standard deviation is given the mean is considered to be a rate, i.e., a number of counts which is Poisson distributed. The program obtains the standard deviation of each of the two means by taking the square root of the mean.
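The rules A and B.1 to B.3 can be summarised as a small decision function. This is only a reading of the description above, not the program's code: the function name `interpret_input` and the labels it returns are hypothetical, and the handling of a standard deviation given in the second box only is an assumption.

```python
from math import sqrt


def interpret_input(value1, value2, sd1=0.0, sd2=0.0):
    """Return (kind, sd_a, sd_b) for the two values, following rules A and B."""
    given = [s for s in (sd1, sd2) if s > 0]
    if len(given) == 2:
        # B.2: two standard deviations, both are used.
        return ("means", sd1, sd2)
    if len(given) == 1:
        # B.1: one standard deviation, used for both means.
        # (Treating "second box only" the same way is an assumption.)
        return ("means", given[0], given[0])
    if 0 < value1 < 1:
        # A: a value between zero and one without a standard deviation.
        return ("proportions", None, None)
    # B.3: no standard deviation, so the values are treated as Poisson rates
    # and each standard deviation is the square root of the mean.
    return ("Poisson rates", sqrt(value1), sqrt(value2))


if __name__ == "__main__":
    print(interpret_input(0.3, 0.5))
    print(interpret_input(0.3, 0.5, sd1=0.2))
    print(interpret_input(9, 16))
```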

If one or two standard deviations are given, the program will always treat the given values as means, even if one or both are between 'zero' and 'one'. This way the program can consider a value between 'zero' and 'one' in two ways: as a proportion, to be compared with another proportion using a special method, or as a mean, which can be compared with any other possible value but requires a standard deviation provided by the user.

Allocation ratio

The allocation ratio is an additional parameter. An allocation ratio of 'one' is used when you use two similar sized groups or samples. An allocation ratio of 'two' is used when one group is twice as large as the other group; the calculations are based on the smaller group, which is the group whose size you give. You give the program the size for only one group or sample. The size of the other group or sample is the size given to the program multiplied by the allocation ratio. In case of unequal variances, most power can be accomplished for the lowest number of cases by allocating more cases to the group with the highest variance. The program sometimes makes a suggestion on the optimum allocation ratio. The allocation ratio can be any positive value, with or without decimals. The allocation ratio is not relevant and not functional for pairwise and population analysis.
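A minimal sketch of how the allocation ratio translates into group sizes. The second function uses the standard result that, for a fixed total number of cases, the variance of the difference between two means is smallest when the group sizes are proportional to the standard deviations. Whether the program's own suggestion is computed this way is an assumption, and both function names are hypothetical.

```python
def group_sizes(n_given, allocation_ratio=1.0):
    """Sizes of the group given to the program and of the other group."""
    return n_given, n_given * allocation_ratio


def variance_minimising_ratio(sd_given_group, sd_other_group):
    """Allocation ratio (other group / given group) that minimises the
    variance of the difference between two means for a fixed total size."""
    return sd_other_group / sd_given_group


if __name__ == "__main__":
    print(group_sizes(50, 2))                   # other group twice as large
    print(variance_minimising_ratio(2.0, 4.0))  # more cases to the noisier group
```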

Continuity correction

Continuity correction is not applied in the case of proportions. The power calculations relate to the Chi-square (see two by two tables), which is a test with a relatively low power. The program therefore tends to give a safe (low) estimate of the power for most statistical tests in proportions. However, don't push your luck: the Chi-square isn't that un-powerful, and the power for tests such as the t-test and the Fisher will mostly be only a couple of percentage points above the value given by the program. In the case of continuous means given with a standard deviation, continuity correction is applied. Again this leads to a slightly conservative estimate of the statistical power for the t-test, which is the test mostly applied to compare two continuous means.

Output

In the output the program gives you the required number of cases for the given alpha and power. Alpha is the chance that one would conclude one has discovered an effect or difference 'd', while in fact this difference or effect does not exist. Usually, alpha is set at 5%, which means that in 5%, or one in twenty, of projects in which no effect exists the data nevertheless signals that 'something' exists. The z-value of the power and the power itself are given in the table. The power is mostly set at 80%. This means that, if the postulated difference really exists, there is an 80% chance of finding the difference between mean one and mean two statistically significant. In the case of two sample analysis and equality analysis you get two numbers of cases, one for each sample; in the case of one sample analysis and pairwise analysis you get one number of cases.

Two numbers are given, one for double sided and one for single sided testing. Single sided is used when you know the direction of the effect: new treatment is better than old treatment, small cars are cheaper than big ones. Use double sided if you do not know the direction: which treatment has fewer complications, are males different from females, which type of car is faster, the big one or the small one? The program echoes the size of only one group. The size of the other group is the size given by the program times the allocation ratio. Double sided is more often used, because it gives a more conservative (low) estimate of the statistical power.
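The difference between double sided and single sided testing comes down to the critical value that the test statistic has to exceed. The sketch below shows this for the normal distribution. The function name `critical_z` is hypothetical and the program may use a different reference distribution.

```python
from statistics import NormalDist


def critical_z(alpha=0.05, two_sided=True):
    """Critical z-value for a test at level alpha."""
    # Double sided testing splits alpha over both tails.
    tail = alpha / 2 if two_sided else alpha
    return NormalDist().inv_cdf(1 - tail)


if __name__ == "__main__":
    print(round(critical_z(0.05, two_sided=True), 3))   # about 1.96
    print(round(critical_z(0.05, two_sided=False), 3))  # about 1.645
```

Because the double sided critical value is larger, the same data give a lower power under double sided testing, which is why it is the more conservative choice.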

Population analysis.

Population analysis works exactly the same as the two sample analysis, the main difference being that the value of one parameter is not an estimate but is exactly known. There are two cases in which this is considered to be the case: a) the 'historical' situation where some sort of settled opinion on the numerical value of a phenomenon exists; b) the case in which the numerical value is a population value, for example, the number of deaths in a community can be exactly known. In the case of 'a', 'exactly' seems a relative concept and Bayesian methods might be preferred. In the case of 'b' the methods proposed here are valid. Fill in the population proportion or mean in the top box and the postulated sample proportion or mean in the second box. Input is further the same as described under the input heading, and similar considerations as discussed above apply.

There are some subtle differences which we will discuss now. First, allocation ratio is not relevant and is not considered in the analysis. The program requires the size of the one sample which is used to test an estimate against a population value.

In the case of proportions it should be considered that the underlying nature of the data is quite different from data used to test for a difference between two estimated proportions. In the case of two estimated proportions the data consists of a two by two table and all methods for table analysis apply. In the case of comparing a population value with a sample estimate, the data compares an expected with an observed distribution in a one dimensional array with two categories. In this case one would usually not use the Chi-square. The t-test also works differently. The Binomial is the most appropriate test in this situation: use it in SISA online, use SISA's MS-DOS version if you have a large sample size, or use a Poisson approximation if you have a very large sample. Alternatively, but not preferred, you can use a normal approximation of the Binomial; consult Wonnacott and Wonnacott or Blalock on how to do this. Now, the problem is that the formulae available for doing statistical power calculation for population analysis are, in the case of proportions, meant for the Chi-square. The program therefore tends to give a safe (low) estimate of the power for the binomial and the normal approximation. However, don't push your luck: the Chi-square isn't that un-powerful, and the power for the binomial and the normal approximation will mostly be only a couple of percentage points above the value given by the program.
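For small samples the power of the exact Binomial test itself can be computed directly, which gives a feel for the size of the gap mentioned above. The following is a minimal sketch for a single sided test of a sample proportion against a known population proportion. It is not SISA's implementation, and the function names are hypothetical.

```python
from math import comb


def binom_upper_tail(k, n, p):
    """P(X >= k) for X distributed Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def exact_power_one_sided(p_pop, p_alt, n, alpha=0.05):
    """Power of the exact single sided binomial test of p = p_pop
    when the true proportion is p_alt (with p_alt > p_pop)."""
    # Smallest count k for which observing k or more is significant under
    # the population proportion. k = n + 1 has tail probability 0, so the
    # search always terminates.
    k_crit = next(k for k in range(n + 2) if binom_upper_tail(k, n, p_pop) <= alpha)
    # Probability of reaching that count under the postulated proportion.
    return binom_upper_tail(k_crit, n, p_alt)


if __name__ == "__main__":
    print(exact_power_one_sided(0.5, 0.8, 20))
```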

For means or averages, give the expected population mean in the top -exp- box and the observed sample mean in the second -obs- box. If you use the population standard deviation give it in the Std.dev. 1 box (the calculation will be based on approximating the normal distribution). If you use the sample standard deviation give it in the Std.dev. 2 box (the calculation will be based on approximating the t-distribution). Give the number of cases in the Number of Cases box.

Equality analysis.

The above and more usual type of analysis considers that the 'new situation' is probably different, and mostly an improvement, compared with the 'old situation'. Say, however, one wants to lower the number of operators or nurses, or make some other cost saving. In this case we could expect the change to result in a deterioration of outcomes, compared with the current situation. This leads to a different view on changes and differences, although it is up to the reader to decide to what extent the situation applies to them. In the classical model discussed above one wants to run relatively little risk of implementing a possibly ineffective change or of making a lot of fuss about a non-existing difference. Therefore one does not want to discover a difference too easily, and alpha levels are set at a relatively low value (say 0.05) while a not too high power is considered acceptable (80%). In equality analysis the assumption is that one does want to discover a difference, as such a difference might mean bad news and one does not want to deny bad news. Alpha levels are therefore set at a high level (say 0.1) and a high power (95%) is also required.

The input is a bit complicated. Two means or averages can be given. One is the current outcome mean, for example the proportion cured or defective, or the mean score on a measurement or assessment scale. A second mean can be given if one already knows that the change will produce changes in outcome, and if one is of the opinion that such changes are acceptable. This is an exceptional situation; mostly the two means given will be the same, as one expects no change in outcome.

A 'tolerance (delta)' parameter is subsequently given. One wants Beta percent of the observations to be within the tolerance level given. Beta is the power level; the tolerance will mostly be set at quite a small value. The tolerance parameter is only relevant for equality analysis.

In case the means do not concern proportions, a standard deviation should be given in the 'Standard Deviation 2' box. The program considers that the standard deviation is the same for both means. One does not have to give a standard deviation in the case of proportions; keep the value in the Standard Deviation 2 box at 'zero'.

If a standard deviation is not given the program will calculate one for you, according to the methods set out for the usual sample size calculation methods discussed above.

Pairwise analysis.

Pairwise analysis is when you do two measurements on a single individual and then compare the outcome of the two measurements. Mostly a time factor is involved, a measurement is done, something "happens", an "intervention" for example, after which the measurement is done again. The before and after measurements are compared.

In the case of means or averages the score for each individual on the measurement before is subtracted from the score on the measurement after the intervention. These differences are averaged over all individuals, producing a mean difference with an associated standard deviation. Your null hypothesis is that the mean difference is zero: overall (in net terms) the respondents did not change. The power calculated is the power with which you detect a postulated net change over all individuals. Give the expected mean difference, or net change, for all individuals in the top box and the associated standard deviation in the third (Std.Dev. 1) box. Give the number of cases of the test in the Number of Cases box. The calculation will approximate the t-test. In the unlikely event you want to approximate the normal distribution, give the standard deviation in the fourth, Std.Dev. 2 box instead of in the Std.Dev. 1 box.
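As an illustration, the mean difference and its standard deviation can be obtained from pilot data, after which a power figure follows from the same kind of approximation as for two samples. The sketch below uses the normal distribution rather than the t-distribution, so for small samples it will be somewhat more optimistic than a t-based calculation. The function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def paired_summary(before, after):
    """Mean and standard deviation of the after-minus-before differences."""
    diffs = [a - b for b, a in zip(before, after)]
    return mean(diffs), stdev(diffs)


def paired_power(mean_diff, sd_diff, n, alpha=0.05):
    """Approximate double sided power to detect a net change of mean_diff
    (normal approximation, an assumption of this sketch)."""
    z = NormalDist()
    # Standard error of the mean difference over n individuals.
    se = sd_diff / sqrt(n)
    return z.cdf(abs(mean_diff) / se - z.inv_cdf(1 - alpha / 2))


if __name__ == "__main__":
    m, s = paired_summary([1, 2, 3], [2, 4, 6])
    print(m, s)
    print(round(paired_power(1.0, 2.0, 32), 3))
```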

For proportions the analysis concerns the numbers of people who change between two groups, here denoted as 'A' and 'B'. It is customary to study the changes in a crosstable, with the numbers of 'A' and 'B' before and after the intervention in the two marginals and the changing respondents inside the table, with a diagonal of non-changes (cells aa and bb) and a diagonal of changers (cells ab and ba). The program calculates the power for doing a McNemar. There are two strategies. The preferred strategy is to give the proportion of respondents on the total you expect to change from group 'A' to group 'B' (the proportion which are in cell ab on the total) in the top box and the proportion of people you expect to change from group 'B' to group 'A' in the second box. Give the number of cases in the Number of Cases box.
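For the preferred strategy, a commonly used large sample approximation expresses the power of the McNemar in terms of the two proportions of changers. The sketch below implements that textbook approximation. It is offered as an illustration only and is not necessarily the formula the program applies. The function name `mcnemar_power` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def mcnemar_power(p_ab, p_ba, n, alpha=0.05):
    """Approximate double sided power of the McNemar test.

    p_ab and p_ba are the proportions of changers on the total (cells ab
    and ba); n is the number of pairs. At least one of the two proportions
    must be larger than zero.
    """
    z = NormalDist()
    total_changers = p_ab + p_ba   # proportion of changers on the total
    net_change = abs(p_ab - p_ba)  # the marginal change
    numerator = net_change * sqrt(n) - z.inv_cdf(1 - alpha / 2) * sqrt(total_changers)
    return z.cdf(numerator / sqrt(total_changers - net_change**2))


if __name__ == "__main__":
    print(round(mcnemar_power(0.2, 0.1, 200), 3))
```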

The second and much less preferred strategy is to give the proportion of people who were in group 'A' before the measurement in the top box and the proportion of people who were in group 'A' after the measurement in the second box. This concerns the marginal change (which is in fact what we are interested in). Make the allocation ratio more than 50 to do this analysis. The program then estimates the numbers of changers inside the table from the marginals; independence is presumed. Lastly, give the number of cases in the Number of Cases box.
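Under independence, the cells of the table are simply products of the marginals. The following sketch shows how the two proportions of changers could be estimated that way. The exact computation used by the program is not documented here, so this is an assumption, and the function name is hypothetical.

```python
def changers_from_marginals(p_a_before, p_a_after):
    """Estimate the proportions of changers (cells ab and ba) from the
    marginal proportions in group 'A', presuming independence."""
    p_ab = p_a_before * (1 - p_a_after)  # in 'A' before, in 'B' after
    p_ba = (1 - p_a_before) * p_a_after  # in 'B' before, in 'A' after
    return p_ab, p_ba


if __name__ == "__main__":
    p_ab, p_ba = changers_from_marginals(0.6, 0.4)
    # The net change equals the marginal change of 0.6 - 0.4.
    print(p_ab, p_ba, p_ab - p_ba)
```

Note that independence implies a large number of changers in both directions, which is one reason this strategy is less preferred than giving the changers directly.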

The allocation ratio is not relevant for pairwise comparisons: you have only one group, the group the program gives you in the output field.