  Tables

Download
Agreement Tests
Chi-Squares
Confidence Interval
Edit Menu
Exact Test
Fisher's Exact Test
Help
Input Box
Intra correlation
Kappa
Kolmogorov-Smirnov
Lambda
License
Likelihood Ratio Chi-Square
Limitations
Mantel-Haenszel Chi-Square
Median Test
Number Needed to Treat (NNT)
Odds Ratio
Options
Ordinal Association
Ordinal Exact
Ordinal Tests
Output Field
Pearson's Chi-Square
Risk Ratio
T-test
T-test Proportions
Yates' Chi-Square
Yule's Q
Yule's Y

TOP of page

Intracorrelation/Kappa

This module calculates the intra-correlation coefficient for dichotomous (binary, yes/no) outcome variables according to the method proposed by Fleiss (1982), with an addition to consider multi-level effects as proposed by Donner & Klar (1994). The intra-correlation coefficient is used for a number of purposes, two of which are discussed here. First, the intra-correlation coefficient is used as an alternative to the kappa measure of agreement when multiple raters or judges consider the presence or absence of a trait in a larger number of items (objects) or individuals (subjects). A high kappa means that there is a high level of agreement between the judges with regard to the presence of the trait. In the table each row pertains to the scoring of one judge: the first column shows the number of individuals scored 'positive', the second column the number of individuals scored 'negative'.

Second, the intra-correlation is used as a measure of clustering in data collected with a multistage, multilevel procedure. This is the case if, for example, a number of schools are selected randomly and in each school a number of pupils is sampled for study, while the analysis takes place at the pupil level. If there is intra-correlation, the standard errors of statistics at the pupil level are underestimated, tests become statistically significant too quickly and confidence intervals will be too narrow. The intra-correlation coefficient shows to what extent the observations at the individual level are influenced by clustering of observations in higher level groups, which is the case if, for example, some schools are particularly good, and others particularly bad, at the trait measured with the dependent variable. If the intra-correlation coefficient equals 0 the schools do not differ with regard to the dependent variable; all the pupils could have been sampled from a single school and the result of the analysis would have been the same. The number of pupils is then the correct sample size. If the intra-correlation coefficient equals 1 the schools are totally different and pupil performance is totally determined by the school; the effective sample size is the number of schools, not the number of pupils. The method by Fleiss can be seen as the case of a single proportion, the proportion of pupils that passed a test. Each row in the table pertains to the result of one school: the first column shows the number of pupils scored 'positive', the second column the number of pupils scored 'negative'. It should be noted that the fast developing resampling (bootstrapping) methodology is a powerful, but unfortunately still complex, alternative to the use of the intra-correlation coefficient in clustered data.

The procedure calculates the intra-correlation coefficient, the significance of the intra-correlation coefficient, the standard error for the overall proportion when there is no intra-correlation, the standard error when there is full intra-correlation and the corrected standard error given the observed level of intra-correlation. On the basis of these three standard errors three confidence intervals are calculated.
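The quantities listed above can be sketched as follows. This is only an illustration under stated assumptions: it uses the ANOVA-type estimator of the intra-correlation coefficient for a binary outcome and a design effect based on an average cluster size n0. Whether SISA uses exactly these formulas is not confirmed by this help text, and the function name and return keys are hypothetical.

```python
from math import sqrt

def intracluster_summary(clusters):
    """clusters: list of (positive, negative) counts, one pair per row of the table.

    Assumes at least two non-empty clusters and some variation in the outcome.
    """
    sizes = [pos + neg for pos, neg in clusters]
    k = len(clusters)                               # number of clusters (rows)
    n_total = sum(sizes)                            # number of individuals
    p = sum(pos for pos, _ in clusters) / n_total   # overall proportion positive

    # Between- and within-cluster mean squares for a 0/1 outcome.
    msb = sum(n * (pos / n - p) ** 2
              for (pos, _), n in zip(clusters, sizes)) / (k - 1)
    msw = sum(pos * neg / n
              for (pos, neg), n in zip(clusters, sizes)) / (n_total - k)

    # Average cluster size adjusted for unequal cluster sizes (assumption).
    n0 = (n_total - sum(n * n for n in sizes) / n_total) / (k - 1)
    rho = (msb - msw) / (msb + (n0 - 1) * msw)

    se_none = sqrt(p * (1 - p) / n_total)           # no intra-correlation
    se_full = sqrt(p * (1 - p) / k)                 # full intra-correlation
    # Corrected standard error via the design effect 1 + (n0 - 1) * rho.
    se_corrected = se_none * sqrt(max(0.0, 1 + (n0 - 1) * rho))
    return {"p": p, "rho": rho, "se_none": se_none,
            "se_full": se_full, "se_corrected": se_corrected}

# Two schools, one where every pupil passes and one where none does.
print(intracluster_summary([(10, 0), (0, 10)]))
```

In this extreme example the coefficient equals 1 and the corrected standard error coincides with the standard error under full intra-correlation, in line with the description of the effective sample size above.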

Fleiss JL. Statistical methods for rates and proportions, 2nd edition. New York [etc.]: John Wiley 1982.

Donner A, Klar N. Methods for comparing event rates in intervention studies when the unit of allocation is a cluster. American Journal of Epidemiology 1994;140(3):279-289


Fisher

For two by two table

The Fisher's Exact procedure calculates an exact probability value for the relationship between two dichotomous variables, as found in a two by two table. The program calculates the difference between the data observed and the data expected, considering the given marginals and the assumptions of the model of independence. It works in exactly the same way as the Chi-square test for independence; however, the Chi-square gives only an estimate of the true probability value, an estimate which might not be very accurate if the marginals are very uneven or if there is a small value (less than five) in one of the cells. In such cases the Fisher is a better choice than the Chi-square. However, in many cases the Chi-square is preferred because the Fisher is difficult to calculate.

The one-sided probability for the Fisher is calculated by generating all tables that are more extreme than the table given by the user, in one direction. The p-values of these tables are added up, including the p-value of the table itself. The single-sided p-value is thus the summed probability of all tables that are as extreme as or more extreme than the given table (notation: p(Observed>=Expected)). There is also a p-value for the relationship going in the other direction. This is calculated by taking all the tables which are less extreme, or more extreme in the opposite direction (notation: O<=E). Do not take too much notice of this p-value; it is not so important. However, it is mentioned in the textbooks and SISA does not want to deprive you of this information.

There are a number of theories about how to present double-sided p-values (Agresti, 1992). Data on the basis of two of these theories are presented. The first is the sum of small p-values. For the sum of small p-values all tables are generated which are possible given the margins. All p-values of the same size as or smaller than the point probability are added up to form the cumulative p-value. The result is reported under the notation p(O>=E|O<=E). Statisticians usually recommend this method. Another method of estimating the double-sided p-value is to take twice the single-sided probability. The notation for this method is the same as for the method of small p-values. Simulations show that the p-value for the method of small p-values not including the point probability, p(O>E|O<E), is often closer to the Chi-square. You can compare the relevant p-values and the Chi-squares yourself. Lastly, note that the difference between p(O>=E|O<=E) and p(O>E|O<E), and the problem of which p-value to compare with the Chi-square, depend on the amount of continuity inherent in the table. The Chi-square is based on a continuous distribution while in fact the p-values jump in value between tables. This jumping depends on the number of different tables that can be produced given the margins. The number of tables, and therefore the level of continuity, does not depend on the total sample size but on the number of cases in the smallest marginal.
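The p-values described above can be sketched in a few lines. This is an illustration only, not SISA's own code: the function name and the returned keys are hypothetical, and the two one-sided tails are named after the direction of the top-left cell rather than after the O>=E notation. Integer weights are compared instead of floating-point probabilities, so tables with equal probability are always recognised as equal.

```python
from math import comb

def fisher_exact(a, b, c, d):
    """Exact p-values for the 2x2 table [[a, b], [c, d]] with fixed margins."""
    row1, col1, n = a + b, a + c, a + b + c + d
    lo, hi = max(0, row1 + col1 - n), min(row1, col1)
    # Unnormalised hypergeometric weight for each possible top-left cell x.
    weight = {x: comb(col1, x) * comb(n - col1, row1 - x)
              for x in range(lo, hi + 1)}
    total = comb(n, row1)                    # equals the sum of all weights
    point = weight[a]
    upper = sum(w for x, w in weight.items() if x >= a)   # includes observed table
    lower = sum(w for x, w in weight.items() if x <= a)   # includes observed table
    small = sum(w for w in weight.values() if w <= point) # sum of small p-values
    return {
        "point": point / total,
        "one_sided_upper": upper / total,
        "one_sided_lower": lower / total,
        "two_sided_small_p": small / total,
        "two_sided_doubled": min(1.0, 2 * min(upper, lower) / total),
    }

print(fisher_exact(3, 1, 1, 3))
```

For the table 3, 1 / 1, 3 the five possible tables have weights 1, 16, 36, 16 and 1 out of 70, so the point probability is 16/70 and both double-sided methods give 34/70.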

See also: Median Test

Agresti A. A Survey of Exact Inference for Contingency Tables. Statistical Science 1992;7(1):131-177.


Median Test

For two by two table

Median Test. One particular application of the Fisher is as a test for the difference in location between two medians (the median is the point or value above which we can find exactly half of the observations). The null hypothesis is that the medians for two groups are the same, the alternative is that the locations are different. Fill in the number of scores for group one above the combined median (the one for both groups taken together), then do the same for group two. See the example. The Fisher probability values are valid for the median test (in this case 0.003047 single-sided or 0.006095 double-sided). Group 1 is statistically significantly different from Group 2. Note: use double-sided testing if there was no prior expectation as to which of the two groups had a higher median.

Example data for median test

            No. of scores above    No. of scores below
            combined median        combined median
Group 1              9                     21
Group 2             24                     12
Combined            33                     33


Exact Test

For three by three table

For seven by two table

The exact test is a generalization of the Fisher procedure and calculates an exact probability value for the relationship between two variables in larger tables. The procedure will handle smaller tables too and is similar to the Fisher in a 2*2 table (try it!). The program calculates the difference between the data observed and the data expected, considering the given marginals and the assumptions of the model of independence. It works in exactly the same way as the Chi-square test for independence; however, the Chi-square gives only an estimate of the true probability value, an estimate which might not be very accurate if the marginals are very uneven or if there is a small value (less than five) in one of the cells. In such cases the exact test is a better choice than the Chi-square. However, in many cases the Chi-square is preferred because the exact test is difficult to calculate.

Two statistics are offered: the point probability of the observed table, and the two-sided probability. For the two-sided probability the sum of small p-values method is used. All possible tables given the marginals are evaluated and the p-values of the tables that are as likely as or less likely than the observed table are added up. See the Fisher helpfile for a more detailed discussion.
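A minimal sketch of this enumeration, assuming a plain brute-force approach (SISA's actual algorithm is not described here and is likely more efficient). The function names are hypothetical. Exact fractions are used so that tables with equal probability compare as equal.

```python
from fractions import Fraction
from math import factorial, prod

def tables_with_margins(row_sums, col_sums):
    """Yield every table of non-negative counts with the given margins."""
    def split(total, caps):
        # All ways to spread `total` over one row, with cell j at most caps[j].
        if len(caps) == 1:
            if total <= caps[0]:
                yield (total,)
            return
        for v in range(min(total, caps[0]) + 1):
            for rest in split(total - v, caps[1:]):
                yield (v,) + rest

    def fill(i, remaining):
        if i == len(row_sums) - 1:
            yield (tuple(remaining),)   # last row is forced by the column margins
            return
        for row in split(row_sums[i], remaining):
            left = tuple(c - v for c, v in zip(remaining, row))
            for rest in fill(i + 1, left):
                yield (row,) + rest

    yield from fill(0, tuple(col_sums))

def exact_test(table):
    """Return (point probability, two-sided p by the sum of small p-values)."""
    row_sums = [sum(r) for r in table]
    col_sums = [sum(c) for c in zip(*table)]

    def weight(t):
        # With fixed margins, P(table) is proportional to 1 / product of cell factorials.
        return Fraction(1, prod(factorial(x) for r in t for x in r))

    observed = weight(table)
    total = small = Fraction(0)
    for t in tables_with_margins(row_sums, col_sums):
        w = weight(t)
        total += w
        if w <= observed:
            small += w
    return float(observed / total), float(small / total)

print(exact_test([[2, 1, 0], [0, 1, 2]]))
```

Because every table is visited, the running time grows quickly with the number of cases and cells, which is the behaviour the limits in this section warn about.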

The procedure is based on the evaluation of many thousands of tables. The number of tables to evaluate increases linearly with the number of cases and exponentially with the number of cells. Tables in which the number of cases is larger than 4000 in a 2*3 table, 2000 in a 2*4, 1000 in a 2*5 table, 500 in a 2*6 and 100 in a 2*7 table may take longer. The 7th row in the 2*7 table is marked red: the full table is only to be used in case of emergency. However, you can input large numbers of cases and you will get an answer, eventually. Also, the number of cases can be very much larger if the table has a very skewed marginal distribution. Agresti (1992) discusses a 2*5 table with 32,574 cases which SISA tables will work out in a couple of minutes. The maximum number of tables to evaluate is set at 2,147,483,646, which would take about 30 hours to do.

Smaller tables can be used in the 2*7 and the 3*3 modules. The 2*7 module is more efficient for evaluating a 2*3 table than the 3*3 module.

It should be noted that statistical programs such as SPSS now have a very good semi-exact procedure that will work on any size table and only takes a couple of minutes to do.

Agresti A. A Survey of Exact Inference for Contingency Tables. Statistical Science 1992;7(1):131-177.


Confidence Interval

For all three tables

Most researchers and statisticians set the Confidence Interval for their project at 95%. This means that in 95% of research projects such as yours the stated interval will contain the true value of the parameter (which may be a mean, a number, or an odds ratio). The alpha error is 5%: the probability that, on the basis of chance, the true value will not be within the stated interval. 5% is then the chance of a type I error: declaring that a difference exists, that a medicine is effective, or that a change is profitable, while in fact the apparent difference is due to the fact that the sample is unrepresentative.

The Confidence Interval can be changed using the submenu 'CI' under the Options menu. Fixed options are for 80, 90, 95 and 99% Confidence Intervals. You can define any other confidence interval by using the 'other' option. Confidence intervals smaller than 1% or greater than 99.9998% are not accepted. The default setting for the confidence interval is 95%.


Edit Menu

The Edit menu is operational for the output field and for the input boxes. The usual shortcuts to the clipboard can be used: Ctrl-C=Copy to clipboard; Ctrl-V=Paste from clipboard; Ctrl-X=Cut and copy to clipboard. Ctrl-Z=Undo works only for the output field.


Ordinal Tests

These tests determine whether there is an ordinal association between two variables, answering the question whether the number of observations increases in the higher values of the columns as one reaches the higher values of the rows. Ordinal variables consist of ordered categories without quantitative differences. For example, length is a quantitative variable: 2 meters is 4 times as long as fifty centimeters. However, religious strictness, ordered in the categories (1) progressive, (2) moderate and (3) conservative, is an ordinal variable. 'Conservative' does not mean three times as strict as progressive, only 'stricter'. One of the most frequently used ordinal variables is the Likert scale, in which people are asked to order their opinions from 'very much agree' to 'very much disagree', on a five or seven point scale.

The lowest category box, the 1,1 box, is the upper left box in the table. The highest score box is the bottom right box.

SISA first presents an account of the numbers of pairs in the table. The different pairs form the basis of many analyses of ordinal association. Concordant pairs consist of individuals paired with other individuals who score both lower on the column and lower on the row variable. Discordant pairs consist of individuals paired with other individuals who are lower on the one, and higher on the other variable. Tied pairs are individuals paired with others who have the same score on either the rows or the columns.
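The pair counts described above can be sketched as follows, assuming a table given as a list of rows with the lowest category in the upper left cell. The function name is hypothetical, and tied pairs are taken to be all pairs that are neither concordant nor discordant, which matches the definition above.

```python
def count_pairs(table):
    """Return (concordant, discordant, tied) pair counts for an ordinal table."""
    rows, cols = len(table), len(table[0])
    n = sum(sum(r) for r in table)
    concordant = discordant = 0
    for i in range(rows):
        for j in range(cols):
            for k in range(i + 1, rows):          # partner sits in a higher row
                for l in range(cols):
                    if l > j:                     # and in a higher column
                        concordant += table[i][j] * table[k][l]
                    elif l < j:                   # or in a lower column
                        discordant += table[i][j] * table[k][l]
    total = n * (n - 1) // 2                      # all possible pairs
    return concordant, discordant, total - concordant - discordant

print(count_pairs([[3, 1], [1, 3]]))   # (9, 1, 18)
```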

The following selections can be made:

Ordinal Exact (A Fisher like test of ordinal association).

Ordinal Association (includes Kendall's Tau-a and Goodman and Kruskal's Gamma).

There are the following additional tests of ordinal association:

T-test (to test for a difference in mean between the columns).

Kolmogorov-Smirnov (to see if the columns are from the same or a different ordering).


T-test

For seven by two table

The t-test tells you how probable it is that a difference between two means has resulted from chance fluctuation. The t-test is considered not valid for data of an ordinal or dichotomous nature. However, the test is very often used in this type of data and is considered to produce conservative results if the assumption of normally distributed continuous variables is not met. Thus, if the t-test is significant in ordinal data then it is very likely that there is a significant difference between two means; if the t-test is not significant you do not know for sure if this lack of a statistically significant difference is a valid observation.

The procedure also gives you some statistics related to the two columns.


Kolmogorov-Smirnov

For seven by two table

The Kolmogorov-Smirnov Test is not a test of association. This test gives the likelihood that the two columns come from the same ordering or from different orderings. Have a look at this table:

Do you agree or disagree with the following statement (proportions between brackets)
[cumulative proportions between square brackets]

                            Males             Females           Difference
Totally agree               10 (0.12)[0.12]   24 (0.26)[0.26]   14 (0.14)[0.14]
Agree                       15 (0.18)[0.30]   15 (0.17)[0.43]    0 (0.01)[0.13]
Neither agree or disagree   19 (0.23)[0.53]   21 (0.23)[0.66]    2 (0.00)[0.13]
Disagree                    18 (0.21)[0.74]   17 (0.19)[0.85]    1 (0.02)[0.10]
Totally disagree            22 (0.26)[1.00]   14 (0.15)[1.00]    8 (0.11)[0.00]
Total                       84 (1.00)         91 (1.00)          7 (0.00)

The K-S test assesses whether the largest proportional cumulative difference in a table has been caused by chance fluctuation or not. In this case this difference equals [0.14] (top right cell). The program echoes the Chi-square value of the expected largest proportional difference (Chi-square = 3.673) and the p-value of the difference between the observed and the expected largest difference, with two degrees of freedom. The p-value in this example equals 0.15933; the difference in ordering between males and females may well have been caused by chance fluctuation.
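A minimal sketch of this calculation, assuming the common two-sample approximation in which 4 * D^2 * n1 * n2 / (n1 + n2) is referred to a Chi-square distribution with two degrees of freedom. That SISA uses exactly this formula is an assumption, and the function name is hypothetical. With the unrounded counts from the table the Chi-square comes out slightly below the 3.673 quoted above, possibly because the quoted figure was computed from a rounded difference.

```python
from itertools import accumulate
from math import exp

def ks_two_columns(col1, col2):
    """Largest cumulative proportional difference between two ordered columns."""
    n1, n2 = sum(col1), sum(col2)
    cum1 = [c / n1 for c in accumulate(col1)]   # cumulative proportions, column 1
    cum2 = [c / n2 for c in accumulate(col2)]   # cumulative proportions, column 2
    d = max(abs(a - b) for a, b in zip(cum1, cum2))
    chi2 = 4 * d * d * n1 * n2 / (n1 + n2)      # assumed approximation, 2 df
    p = exp(-chi2 / 2)                          # upper tail of Chi-square with 2 df
    return d, chi2, p

males = [10, 15, 19, 18, 22]
females = [24, 15, 21, 17, 14]
print(ks_two_columns(males, females))
```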

The probability value presented is single-sided. The literature considers that the Kolmogorov Smirnov test has very little power with a high chance of a type II error, i.e. of not finding a difference when there is one. Unless there are serious theoretical or other reasons for using the K-S, use of the gamma is preferable.


Risk Ratio

For two by two table

The risk ratio takes on values between zero ('0') and infinity. One ('1') is the neutral value and means that there is no difference between the groups compared, close to zero or infinity means a large difference between the two groups on the variable concerned. A risk ratio larger than one means that group one has a larger proportion than group two; if the opposite is true the risk ratio will be smaller than one. If you swap the two proportions, the risk ratio will take on its inverse (1/RR).

The risk ratio gives you the percentage difference in classification between group one and group two. For example, the proportion of people suffering from complications after traditional surgery equals 0.10 (10%), while the proportion suffering from complications after alternative surgery equals 0.125 (12.5%). The risk ratio equals 0.8 (0.1/0.125); 20% ((1-0.8)*100) fewer patients treated by the traditional method suffer from complications. Another example: 8% of freezers produced without quality control have paint scratches. This percentage is reduced to 5% if quality control is introduced. The risk ratio equals 1.6 (8/5); 60% more freezers are damaged if there is no quality control.
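The two examples above can be reproduced with a short sketch. The function names are hypothetical and the inputs are the proportions quoted in the text.

```python
def risk_ratio(p_group1, p_group2):
    """Ratio of the proportion in group one to the proportion in group two."""
    return p_group1 / p_group2

def percent_difference(rr):
    """Percentage by which group one is above (+) or below (-) group two."""
    return (rr - 1) * 100

surgery = risk_ratio(0.10, 0.125)    # traditional versus alternative surgery
freezers = risk_ratio(0.08, 0.05)    # without versus with quality control
print(round(surgery, 3), round(percent_difference(surgery), 1))     # 0.8 -20.0
print(round(freezers, 3), round(percent_difference(freezers), 1))   # 1.6 60.0
```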

The risk ratio can be compared with the Odds Ratio. The risk ratio is easier to interpret than the odds ratio. However, in practice the odds ratio is used more often. This has to do with the fact that the odds ratio is closely related to frequently used statistical techniques such as logistic regression. Also, the odds ratio has the attractive property that, however you turn the table, it will always take on the same value or its reciprocal (1/odds ratio).


Odds Ratio

For two by two table

The odds ratio takes values between zero ('0') and infinity. One ('1') is the neutral value and means that there is no difference between the groups compared; close to zero or infinity means a large difference. An odds ratio larger than one means that group one has a larger proportion than group two; if the opposite is true the odds ratio will be smaller than one. If you swap the two proportions, the odds ratio will take on its inverse (1/OR).

The odds ratio gives the ratio of the odds of suffering some fate. The odds themselves are also a ratio. To explain this we will take the example of traditional versus alternative surgery. If 10% of operations using the traditional method result in complications, then the odds of having complications with traditional surgery equal 0.11 (0.1/0.9; the chance of getting complications is 0.11 times the chance of not getting complications). 12.5% of the operations using the alternative method result in complications, giving odds of 0.143 (0.125/0.875). The odds ratio equals 0.778 (0.11/0.143): the odds of getting complications with traditional surgery are 0.778 times the odds with alternative surgery. The inverse of the odds ratio equals 1.286: the odds of getting complications with alternative surgery are 1.286 times the odds with traditional surgery. This takes some getting used to, we admit, but it has its advantages.
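The arithmetic of the surgery example can be checked with a small sketch; the function names are hypothetical.

```python
def odds(p):
    """Odds of an event with probability p: chance of it happening over not happening."""
    return p / (1 - p)

def odds_ratio(p_group1, p_group2):
    """Ratio of the odds in group one to the odds in group two."""
    return odds(p_group1) / odds(p_group2)

traditional, alternative = 0.10, 0.125
print(round(odds(traditional), 3))                         # 0.111
print(round(odds(alternative), 3))                         # 0.143
print(round(odds_ratio(traditional, alternative), 3))      # 0.778
print(round(1 / odds_ratio(traditional, alternative), 3))  # 1.286
```

Swapping the two groups gives the inverse, as the last line shows.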

The odds ratio can be compared with the Risk Ratio. The risk ratio is easier to interpret than the odds ratio. However, in practice the odds ratio is used more often. This has to do with the fact that the odds ratio is more closely related to frequently used statistical techniques such as logistic regression. Also, the odds ratio has the attractive property that, however you turn the table, it will always take on the same value or the inverse (1/odds ratio) of that value.


Number Needed to Treat (NNT)

For two by two table

Number Needed to Treat (NNT) is a measure which is becoming increasingly popular in the medical field. This measure is the reciprocal of the absolute risk-difference (ard=|proportion1-proportion2|) and expresses the number of persons to be treated to 'cure' one person.

The measure has some very appealing properties in interpretation, particularly in combination with cost calculation. An example: if no treatment is given 20% die, with treatment 15% die. NNT=20 (1/|0.2-0.15|). We need to treat 20 people to save one life. But now we develop a preventive program in a completely different area of health care and succeed in bringing the mortality down from 45% to 44.5%. NNT=200 (1/|0.45-0.445|). We need to apply our preventive program to at least 200 people to save one life. This does not seem very effective compared with treatment.

However, the cost of treatment is $200 per person, prevention costs $10 per person. The cost per life saved equals $4000 (20*200) for treatment against $2000 (200*10) for prevention. Prevention is highly cost effective and given a limited budget it should get precedence over treatment.
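The treatment versus prevention comparison can be reproduced with a short sketch; the function names are hypothetical and the numbers are the ones used in the example.

```python
def number_needed_to_treat(p_without, p_with):
    """Reciprocal of the absolute risk difference."""
    return 1 / abs(p_without - p_with)

def cost_per_life_saved(nnt, cost_per_person):
    """Cost of treating enough people to save one life."""
    return nnt * cost_per_person

nnt_treatment = number_needed_to_treat(0.20, 0.15)     # about 20
nnt_prevention = number_needed_to_treat(0.45, 0.445)   # about 200
print(round(cost_per_life_saved(nnt_treatment, 200)))  # 4000
print(round(cost_per_life_saved(nnt_prevention, 10)))  # 2000
```

The results are rounded for display because the subtraction of two decimal proportions is not exact in floating point.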

This way one can do quite a number of nice comparisons. A paper by Schulzer and Mancini gives some examples. In the program, confidence intervals for the NNT are calculated according to two methods. First the method suggested by Schulzer and Mancini, based on the Geometric distribution, and second the method suggested by Cook and Sackett, based on inverting the confidence interval of the difference between two means. The method suggested by Cook and Sackett is more often used in practice. The method suggested by Schulzer and Mancini is rather more interesting theoretically. Please note that whatever method is used, confidence intervals for the NNT are nonsensical if the difference between the two means is not statistically significant, i.e., if the probability of the t-value is more than 0.025 (in the case of a 95% confidence interval). The confidence interval for the NNT should NEVER be used for hypothesis testing. It is there for your information only. Use the t-test for hypothesis testing.

Cook RJ, Sackett DL. The number needed to treat: a clinically useful measure of treatment effect. British Medical Journal 1995; 310:452-454.

Schulzer M, Mancini GBJ. 'Unqualified success' and 'unmitigated failure': Numbers-Needed-to-Treat-Related concepts for assessing treatment efficacy in the presence of treatment induced adverse effects. International Journal of Epidemiology 1996; 25(4):704-712.


Output Field

In the output field the output of the analysis is presented. The statistics in the output are related to the first 'current table' above. Each time you change between procedures or change the input a new 'current table' is printed in the output field.

The Output Field is fully editable. The Edit menu in the task bar is operational for this field and the usual shortcuts to the clipboard can be used: Ctrl-C=Copy to clipboard; Ctrl-V=Paste from clipboard; Ctrl-X=Cut and copy to clipboard; Ctrl-Z=Undo.

The content of the output field can be printed and/or saved as a text file using the file menu in the task bar.

The size of the lettering in the output field can be set under Options.


Yule's Q

For two by two table

Yule's Q is based on the odds ratio and is a symmetric measure taking on values between -1 and +1. A value of -1 or +1 implies perfect negative or positive association; 0 (zero) implies no association. In two by two tables Yule's Q is equal to Goodman and Kruskal's Gamma.


Yule's Y

For two by two table

Yule's Y is based on the odds ratio and is a symmetric measure taking on values between -1 and +1. A value of -1 or +1 implies perfect negative or positive association; 0 (zero) implies no association. The measure tends to estimate associations more conservatively than Yule's Q. The measure has little substantive or theoretical meaning.
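Both Yule's Q and Yule's Y can be written in terms of the cross products of a two by two table with cells a, b (first row) and c, d (second row). The sketch below uses the standard textbook definitions; the function names are hypothetical, and a table in which both cross products are zero is not handled.

```python
from math import sqrt

def yules_q(a, b, c, d):
    """(ad - bc) / (ad + bc), equivalently (OR - 1) / (OR + 1)."""
    return (a * d - b * c) / (a * d + b * c)

def yules_y(a, b, c, d):
    """Same form as Q, but using the square roots of the cross products."""
    s_ad, s_bc = sqrt(a * d), sqrt(b * c)
    return (s_ad - s_bc) / (s_ad + s_bc)

print(yules_q(3, 1, 1, 3))   # 0.8
print(yules_y(3, 1, 1, 3))   # 0.5
```

In this example the odds ratio is 9, and Y (0.5) is indeed closer to zero than Q (0.8).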


Likelihood Ratio Chi-Square

For all tables

The Likelihood Ratio Chi-square (LRX) was developed more recently than the Pearson Chi-square and is the second most frequently used Chi-square. It is directly related to log-linear analysis and logistic regression. The LRX has the important property that an LRX with more than one degree of freedom can be partitioned into a number of smaller tables, each with its own (smaller) LRX and (lower number of) degrees of freedom. The sum of the partial LRXs and associated partial degrees of freedom, as found in the smaller tables, equals the original LRX and original number of degrees of freedom.
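A minimal sketch of the statistic, using the standard formula 2 * sum(O * ln(O / E)) with expected counts from the model of independence. The function names are hypothetical, and cells with an observed count of zero are taken to contribute nothing to the sum.

```python
from math import log

def expected_counts(table):
    """Expected cell counts under independence: row total * column total / N."""
    row_sums = [sum(r) for r in table]
    col_sums = [sum(c) for c in zip(*table)]
    n = sum(row_sums)
    return [[r * c / n for c in col_sums] for r in row_sums]

def likelihood_ratio_chi2(table):
    """Return (LRX, degrees of freedom) for a table given as a list of rows."""
    expected = expected_counts(table)
    lrx = 2 * sum(o * log(o / e)
                  for obs_row, exp_row in zip(table, expected)
                  for o, e in zip(obs_row, exp_row)
                  if o > 0)
    df = (len(table) - 1) * (len(table[0]) - 1)
    return lrx, df

print(likelihood_ratio_chi2([[3, 1], [1, 3]]))
```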

Obtained by checking the Chi-square box in all table procedures.


Pearson's Chi-Square

For all tables

Pearson's Goodness-of-Fit Chi-square (GFX) is most often used in research. Pearson's Chi-square is mathematically related to the classical Pearson's Correlation co-efficient and to Analysis of Variance.

Obtained by checking the Chi-square box in all table procedures.


Yates' Chi-Square

For two by two table

Yates' Chi-square is equivalent to Pearson's Chi-square with continuity correction.
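The relation between the two statistics can be sketched for a two by two table. The code uses the standard definitions (sum of squared differences between observed and expected counts divided by the expected counts, with 0.5 subtracted from each absolute difference for the continuity correction); the function name and the `yates` flag are hypothetical.

```python
def chi2_2x2(a, b, c, d, yates=False):
    """Pearson's Chi-square for [[a, b], [c, d]], optionally with Yates' correction."""
    n = a + b + c + d
    row_sums, col_sums = (a + b, c + d), (a + c, b + d)
    observed = ((a, b), (c, d))
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / n
            diff = abs(observed[i][j] - expected)
            if yates:
                diff = max(0.0, diff - 0.5)   # continuity correction
            chi2 += diff ** 2 / expected
    return chi2

print(chi2_2x2(3, 1, 1, 3))               # 2.0
print(chi2_2x2(3, 1, 1, 3, yates=True))   # 0.5
```

The corrected value is never larger than the uncorrected one, so the correction makes the test more conservative.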

Obtained by checking the Chi-square box in two by two table.


Mantel-Haenszel Chi-Square

For two by two table

Mantel-Haenszel Chi-square is thought to be closer to the 'true' Chi-square if small numbers of cases are involved. It is not often used. If you have doubts about your results, use an exact estimate of probabilities instead.

Obtained by checking the Chi-square box in two by two table.


Input Box

The input boxes are the fill-in boxes that form a table on the left of the form. They assume integer input: whole positive numbers. The boxes should only contain numbers: no commas, decimal points or any other non-numeric characters. There is one exception: the letter E to indicate an exponent (scientific notation).

In the table thus formed the 1,1 cell is in the upper left corner. The 'main diagonal' is colored yellow and runs from the upper left to the lower right.

The total number you can input is: 2,147,483,646 for most analyses and 1,000,000 for exact analyses. However, if you input such a large number, you should not expect with certainty to get a meaningful output. See the discussion on the Limitations.

The red boxes are colored red to warn you of the fact that if you specify a full table an exact analysis might take a very long time.

Empty rows are ignored in most statistical procedures and a reduced table is analyzed. Empty (zero) cells are not ignored if they are located beside a valid cell.

The input boxes support editing operations such as Ctrl-C=Copy, Ctrl-V=Paste and Ctrl-X=Cut-and-Copy. Beware of leading or trailing spaces and other characters when pasting data into an input box.


Lambda

Goodman and Kruskal's Lambda is an example of a Proportional Reduction in Error (PRE) measure. PRE measures work by taking the ratio of: 1) an error score in predicting someone's most likely position (in a table) using relatively little information; with: 2) the error score after collecting more information. In the case of Lambda we compare the error made when we only have knowledge of the marginal with the reduced error after we have collected information regarding the inside of the table. Two Lambdas are produced. First, how much better are we able to predict someone's position on the row marginal if we also know the distribution of individuals in the table compared with only having knowledge of the row marginal, or in causal terms, how likely is it that the column scores 'cause' the row scores via the inside of the table (Lambda A). Second, how much better are we able to predict someone's position on the column marginal if we also know the distribution of individuals in the table compared with only having knowledge of the column marginal, or in causal terms, how likely is it that the row scores 'cause' the column scores via the inside of the table (Lambda B). The program gives the proportional improvement in predicting someone's score after collecting additional information.
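A minimal sketch of the two Lambdas, using the standard definition in which the best guess is always the most frequent category. The function names are hypothetical; following the description above, predicting the row position would correspond to Lambda A and predicting the column position to Lambda B.

```python
def lambda_predict_rows(table):
    """Proportional reduction in error when predicting the row category."""
    n = sum(sum(r) for r in table)
    # Best guess with only the row marginal: the largest row total.
    best_marginal = max(sum(r) for r in table)
    # Best guess knowing the column: the largest cell within each column.
    best_within = sum(max(col) for col in zip(*table))
    return (best_within - best_marginal) / (n - best_marginal)

def lambda_predict_cols(table):
    """Same measure for predicting the column category (transpose the table)."""
    return lambda_predict_rows([list(col) for col in zip(*table)])

table = [[10, 5], [5, 10], [5, 5]]
print(lambda_predict_rows(table))   # 0.2
print(lambda_predict_cols(table))   # 0.25
```

In this example, guessing the row from the marginal alone gives 25 errors in 40 cases; knowing the column reduces this to 20 errors, a reduction of 0.2.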


Options

Four options can be set.

Change the Confidence Interval for all appropriate statistical procedures by choosing CI in the options menu and then ticking the level you want to select or use the fill-in box under the 'other' option. The default setting of the confidence interval is 95%. Confidence intervals smaller than one percent or greater than 99.9998 percent are not allowed.

Change the size of the letters in the output field by selecting Output and Font Size. A box will pop up and ask you to give the size you want in points. 99 points is the maximum. The default font size is 10.

Set the StopExactAt option to limit the number of tables generated for an exact procedure. The program will generate the number of tables up to approximately the number you request and ask if you want to continue or stop. Set to zero if you do not want to be given the choice to stop. The default is zero. This option works only for tables that are larger than 2 by 2. The number of tables generated for a 2 by 2 table is equal to the number of cases in the smallest marginal, so you know beforehand how many tables are generated.

Rounding. This option is only relevant for the (Fisher) exact procedures and the non-technically minded should accept the defaults. The option does not pertain to approximate procedures or the ordinal or agreement exact procedures. The option allows for the treatment of rounding errors. Somewhere in the exact procedures it is determined if the p-value of a generated table is smaller than or equal to the point probability of the observed table by the statement: if p-this-table <= point-p + theta then {add p-this-table to the cumulated p-value}. theta is a parameter to catch rounding errors. Should you not include theta you run the risk that two tables with the same p-value are not seen as such because the left p-value might have been rounded upwards or the right p-value downwards. This rounding error is a general Windows problem and takes place in the 18th significant digit. It might also affect the 17th or 16th significant digit. Auto sets theta to point probability * 1E-15. Auto is the default setting. Off sets theta to 0. Manual allows you to specify a value for theta yourself. You would normally choose a value smaller than 1E-15.
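The comparison with theta can be sketched as follows. The function name and the mode strings are hypothetical stand-ins for the Auto, Off and Manual settings; the example uses the familiar case of 0.1 + 0.2, which differs from 0.3 in the last binary digit.

```python
def include_table(p_table, point_p, mode="auto", manual_theta=0.0):
    """Decide whether a generated table counts towards the cumulated p-value."""
    if mode == "auto":
        theta = point_p * 1e-15     # scale the tolerance with the point probability
    elif mode == "off":
        theta = 0.0
    else:                           # "manual"
        theta = manual_theta
    return p_table <= point_p + theta

p_table = 0.1 + 0.2                 # mathematically equal to 0.3, stored slightly higher
print(include_table(p_table, 0.3, mode="off"))    # False: the tie is missed
print(include_table(p_table, 0.3, mode="auto"))   # True: theta absorbs the rounding error
```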

This version of tables does not have an INI file and settings are not remembered. You will have to set the options each time you use the program.


Ordinal Exact

For seven by two and three by three table

If you tick the Ordinal Exact box you get a p-value for the likelihood that there are tables with a higher number of concordant pairs than the number in the observed table. This is an exact test of ordinal association. This procedure is another derivative of the Fisher and works in the same way, by generating tables and estimating their likelihood. In a 2 by 2 table the cumulative ordinal exact value is equivalent to the single-sided Fisher.

The procedure is based on the evaluation of thousands of tables to check how many concordant pairs they have. The number of tables to evaluate increases more or less linearly with the number of cases and exponentially with the number of cells. Tables in which the number of cases is larger than 3000 in a 2*3 table, 1500 in a 2*4, 750 in a 2*5 table, 300 in a 2*6 and 75 in a 2*7 table may take longer. The last entries in the 2*7 table are marked red, only to be used in case of emergency. However, you can input any number of cases in any number of cells, and you will get an answer, eventually. The maximum number of tables to evaluate is set at 2,147,483,646, which would take about 45 hours on a P100 computer. Note: because the ordinal integrity of the table must be maintained in the calculations the ordinal exact procedure is less efficient than the basic exact procedure.

Smaller tables can be used in the 2*7 and the 3*3 modules. The 2*7 module is more efficient than the 3*3 module for evaluating a 2*3 table.

TOP of page

Ordinal Association

For seven by two and three by three tables

The selection 'Ordinal Association' will give you Kendall's Tau-a and Goodman and Kruskal's Gamma for testing ordinal associations, with their respective sample standard deviations and p-values. Tau-a is the difference between the number of concordant and discordant pairs divided by the total number of pairs; Gamma is the difference between the number of concordant and discordant pairs divided by the sum of concordant and discordant pairs. Gamma usually gives a higher value than Tau-a and is (for other reasons as well) usually considered to be a more satisfactory measure of ordinal association. The p-values are supposed to approach the exact p-value for an ordinal association asymptotically, and the program shows that they generally do that reasonably well. But beware of small numbers: the p-values for Gamma and Tau-a then become too optimistic!

The Gamma is calculated in a two by two table by way of the Yule's Q.
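The two definitions above can be sketched in a few lines of Python. The function names are hypothetical, and the sketch computes only the two coefficients, not the standard deviations or p-values that the program reports. In a two by two table the Gamma obtained this way equals Yule's Q, (ad - bc) / (ad + bc).

```python
def concordant_discordant(table):
    """Count concordant and discordant pairs in an ordered r x c table."""
    rows, cols = len(table), len(table[0])
    conc = disc = 0
    for i in range(rows):
        for j in range(cols):
            for k in range(i + 1, rows):
                for m in range(cols):
                    if m > j:
                        # Higher on both variables: concordant pairs.
                        conc += table[i][j] * table[k][m]
                    elif m < j:
                        # Higher on one, lower on the other: discordant pairs.
                        disc += table[i][j] * table[k][m]
    return conc, disc


def tau_a_and_gamma(table):
    """Return (Tau-a, Gamma); NaN where a value cannot be calculated."""
    n = sum(map(sum, table))
    conc, disc = concordant_discordant(table)
    total_pairs = n * (n - 1) / 2
    tau_a = (conc - disc) / total_pairs if total_pairs > 0 else float("nan")
    gamma = (conc - disc) / (conc + disc) if conc + disc > 0 else float("nan")
    return tau_a, gamma
```

For the table 10, 5 / 3, 12 there are 120 concordant and 15 discordant pairs among 435 pairs in total, so Tau-a is 105/435 and Gamma is 105/135.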

Testing the program showed that calculating the standard errors of Gamma and, to an even higher degree, of Tau is cumbersome when large numbers of cases are involved. If no p-value is given for Tau, the p-value for Gamma is also valid for Tau. If calculations are impossible, the NaN result is given.

TOP of page

Chi-Squares

For all tables

In all tables the Chi-square tests test the model of independence between the row and the column variable. They ask whether knowledge of the row variable predicts someone's score on the column variable, and vice versa. For example, does someone's sex 'predict' his or her income, or does someone's schooling 'predict' his or her occupation? Independence is the situation in which sex does not predict income, and schooling does not predict occupation.

The following chi-square tests are available:

Pearson's Goodness-of-Fit (in all tables) (preferred)

Maximum Likelihood (in all tables) (second best)

Mantel-Haenszel (in 2 by 2 table) (not often used)

Yates (in 2 by 2 table) (Pearson with continuity correction)

Empty rows are ignored and the table is reduced before analysis. An empty (zero) cell is not ignored if it is located beside a valid cell.
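The four statistics listed above can be sketched in Python as follows. This uses the textbook formulas and is not taken from the program itself: the function name is hypothetical, 'Maximum Likelihood' is taken to be the likelihood ratio Chi-square, the Mantel-Haenszel value for a single two by two table is taken to be (N - 1) / N times Pearson's value, and dropping empty columns in the same way as empty rows is an assumption made to avoid dividing by zero.

```python
from math import log


def chi_squares(table):
    """Chi-square statistics for the model of independence."""
    # Drop empty rows, as the text describes; dropping empty columns
    # as well is an assumption of this sketch.
    rows = [r for r in table if sum(r) > 0]
    keep = [j for j, col in enumerate(zip(*rows)) if sum(col) > 0]
    rows = [[r[j] for j in keep] for r in rows]

    n = sum(map(sum, rows))
    row_tot = [sum(r) for r in rows]
    col_tot = [sum(c) for c in zip(*rows)]

    pearson = likelihood_ratio = 0.0
    for i, r in enumerate(rows):
        for j, observed in enumerate(r):
            expected = row_tot[i] * col_tot[j] / n
            pearson += (observed - expected) ** 2 / expected
            if observed > 0:  # a zero cell contributes nothing here
                likelihood_ratio += 2 * observed * log(observed / expected)

    result = {"pearson": pearson, "likelihood_ratio": likelihood_ratio}
    if len(rows) == 2 and len(col_tot) == 2:
        (a, b), (c, d) = rows
        denom = row_tot[0] * row_tot[1] * col_tot[0] * col_tot[1]
        # Yates: Pearson with continuity correction.
        result["yates"] = n * max(0.0, abs(a * d - b * c) - n / 2) ** 2 / denom
        # Mantel-Haenszel for one 2x2 table (assumed textbook form).
        result["mantel_haenszel"] = (n - 1) * (a * d - b * c) ** 2 / denom
    return result
```

Each statistic is then compared with the Chi-square distribution with (rows - 1) * (columns - 1) degrees of freedom to obtain a p-value; that step is left out of the sketch.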

TOP of page

T-test Proportions

For two by two tables

In this module two proportions/percentages with their respective standard deviations and standard errors are at the center of attention. The two proportions concerned are first, the proportion of cases in column one which is located in row one of a two by two table, and second, the proportion of cases in column two which is located in row one. These same two proportions are at the center of interest in the NNT procedure.

Following some statistics about the two proportions the difference between the proportions is calculated, the standard error of this difference and the Confidence Interval of the difference. In the case of proportional data, these statistics can be obtained directly from the proportions themselves.

Lastly, the t-test is presented. The t-test gives the probability of finding a difference between the proportions at least as large as the one observed if chance fluctuation alone were at work. The t-test given is one-sided; in other words, it is assumed that before doing the test you had a hypothesis that one of the two proportions was bigger than the other. If you did not have such a prior hypothesis, and you only aim to test for a possible difference between the proportions, you need to do a double-sided test; in this case you should multiply the p-value by two.

The t-test is basically not valid for testing the difference between two proportions. However, the t-test in proportions has been extensively studied, has been found to be robust, and is widely and successfully used in proportional data. With one exception: if one of the proportions is very close to zero or one, you will do better with Fisher's exact test.
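The statistics described in this section can be sketched in Python as follows. The exact formulas used by the program are not given here, so the sketch makes two labeled assumptions: the standard error of the difference is unpooled, and the p-value uses the normal approximation, which is close to the t-distribution for larger samples. The function name and the returned keys are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def compare_proportions(a, b, c, d, level=0.95):
    """Compare the column proportions of the 2x2 table [[a, b], [c, d]].

    p1 is the share of column one found in row one, p2 the share of
    column two found in row one, as described in the text.
    """
    n1, n2 = a + c, b + d
    p1, p2 = a / n1, b / n2
    # Standard errors follow directly from the proportions themselves.
    se1 = sqrt(p1 * (1 - p1) / n1)
    se2 = sqrt(p2 * (1 - p2) / n2)
    diff = p1 - p2
    se_diff = sqrt(se1 ** 2 + se2 ** 2)  # assumption: unpooled

    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    ci = (diff - z_crit * se_diff, diff + z_crit * se_diff)

    if se_diff > 0:
        # Assumption: normal approximation instead of the t-distribution.
        p_one_sided = 1 - NormalDist().cdf(abs(diff / se_diff))
    else:
        p_one_sided = float("nan")  # both proportions are exactly 0 or 1
    return {
        "p1": p1, "p2": p2, "difference": diff, "se_difference": se_diff,
        "ci": ci, "p_one_sided": p_one_sided,
        "p_two_sided": 2 * p_one_sided,  # the rule given in the text
    }
```

For the table 10, 5 / 3, 12 the two proportions are 10/13 and 5/17, a difference of about 0.475.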

TOP of page

Help

Help is provided using the manual button in the task bar. A webpage with this manual will download.

TOP of page

License

This is free software. It can be freely distributed and installed. This program is provided to you by Quantitative Skills Research and Statistical Consultancy.

This software and the accompanying files are provided "as is" and without warranties as to performance or merchantability or fitness for a particular purpose. The entire risk arising out of use or performance of the software remains with you. Please read the Limitations page regarding the performance of the program.

TOP of page

Limitations

This program has very few built-in limitations. The maximum number of cases per table is 2,147,483,646. For an exact analysis or a Fisher analysis the maximum number of cases is 1,000,000. The maximum number of tables evaluated to determine an exact p-value is set at 2,147,483,646. This does not mean that you can do calculations on a table with 1,000,000 cases (or so) and be sure of a valid result. Complex formulas are used which can only function within the limits of computers, of this program and of the capabilities of the compiler (Delphi). The behavior of the program in extreme situations is hard to predict.

The program has been thoroughly tested using examples from statistics books and by comparing its results with the results of other statistical programs. It was found to function perfectly. These tests were done with 'normal' examples, of the type encountered in 99.9% of research situations. Various experiments have also been done using very large numbers, and the behavior was extensively studied. Generally the program performed well, responding to impossible situations with floating point or variable out-of-range errors and crashing, or producing either no result or the "NaN" result. Encountering such errors means that what you want to do is not possible. We found that these errors occur particularly frequently in cases in which you do not really need a statistics program to tell you that the p-value is very close to zero or one.

Furthermore, sometimes many millions of calculations are required to produce the results of the exact procedures. It might take from 30 to 45 hours to evaluate the maximum of 2,147,483,646 tables. For some procedures (particularly the ordinal tests), if you have problems with out-of-range errors or very lengthy calculations, it may help to order your data in a different way. The program's 'flip', 'swap' and 'rotate' buttons may be of assistance.

Although this program has been tested extensively, no program is ever bug or error free, and you should always check your results carefully. You can check results by comparing different statistics: exact results with Chi-squares, p-values for Gamma with p-values for Tau, etc. However, even then there may be problems (though it is hard to imagine what they might be!) and no statistic can ever replace a healthy mind and a critical attitude in research. If you have a data set to analyze and you do not get the expected result, please report this to Quantitative Skills.

TOP of page

Kappa

Two measures are presented.

Kappa is a measure of agreement and takes on the value zero when there are no more observations on the diagonal of an agreement table than can be expected on the basis of chance. Kappa takes on the value 1 if there is perfect agreement, i.e. if all observations are on the diagonal. Kappa values lower than 0.4 are considered to represent poor agreement between the row and column variable, values between 0.4 and 0.75 fair to good agreement, and values higher than 0.75 excellent agreement.
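As a minimal sketch, Kappa can be computed from a square agreement table as the observed share of observations on the diagonal, corrected for the share expected by chance from the marginals. The function names are hypothetical, and the labels follow the thresholds given above; the standard error and p-value that the program reports are not included.

```python
def cohen_kappa(table):
    """Kappa for a square agreement table given as a list of rows."""
    k = len(table)
    n = sum(map(sum, table))
    row_tot = [sum(r) for r in table]
    col_tot = [sum(c) for c in zip(*table)]
    # Observed and chance-expected proportion on the main diagonal.
    p_observed = sum(table[i][i] for i in range(k)) / n
    p_expected = sum(row_tot[i] * col_tot[i] for i in range(k)) / n ** 2
    if p_expected == 1:
        return float("nan")  # all observations in one category
    return (p_observed - p_expected) / (1 - p_expected)


def interpret_kappa(kappa):
    """Verbal label using the thresholds quoted in the text."""
    if kappa < 0.4:
        return "poor"
    if kappa <= 0.75:
        return "fair to good"
    return "excellent"
```

For the table 20, 5 / 10, 15 the observed agreement is 0.7 and the chance agreement is 0.5, which gives a Kappa of about 0.4.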

Bowker Chi-square tests to see if there is a difference in the scoring pattern between the upper and the lower triangle (excluding the diagonal). If the Bowker Chi-square is statistically significant there is a difference: in other words, the pattern of scoring in the upper triangle is not the same as the lower triangle. Note that the pattern of scoring between the two triangles is dependent on two factors. First, whether there is a 'true' difference in the pattern of scoring. Second, the level of marginal heterogeneity. Marginal heterogeneity means that the marginals are different; this increases the Bowker Chi-square. The Bowker Chi-square is the same as the McNemar Chi-square in a two by two table.
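The Bowker Chi-square can be sketched in Python as a sum over all pairs of mirrored off-diagonal cells. The function name is hypothetical, and skipping pairs in which both cells are empty (and reducing the degrees of freedom accordingly) is an assumption of this sketch, not a documented behavior of the program.

```python
def bowker_chi_square(table):
    """Return (statistic, degrees of freedom) for a square table."""
    k = len(table)
    statistic, df = 0.0, 0
    for i in range(k):
        for j in range(i + 1, k):
            upper, lower = table[i][j], table[j][i]
            if upper + lower > 0:  # assumption: empty pairs are skipped
                statistic += (upper - lower) ** 2 / (upper + lower)
                df += 1
    return statistic, df
```

In the two by two table 20, 5 / 10, 15 only one pair of cells is involved, and the result, (5 - 10)^2 / (5 + 10) with one degree of freedom, is the McNemar Chi-square.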

TOP of page

Exact Test

This test is yet another derivative of Fisher's Exact Test. It generates all tables that are possible given the marginals, and cumulates the p-values of all tables that have the same number of off-diagonal observations as the table observed, or fewer. This exact test gives the probability that the observed number of off-diagonal observations, or a smaller number, occurs on the basis of chance. The test is equivalent to the single-sided Fisher in two by two tables. The p-value produced should be comparable with the Kappa p-value, and this does in fact usually seem to be the case. For a discussion see Agreement Models.
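The procedure can be sketched in Python for small square tables. This is a brute-force illustration under the assumption that each table's probability is the usual hypergeometric probability given fixed marginals; the function names are hypothetical, and the program's own algorithm is likely to be considerably more efficient.

```python
from math import factorial, prod


def compositions(total, caps):
    """Yield tuples x with sum(x) == total and 0 <= x[j] <= caps[j]."""
    if len(caps) == 1:
        if total <= caps[0]:
            yield (total,)
        return
    for first in range(min(total, caps[0]) + 1):
        for rest in compositions(total - first, caps[1:]):
            yield (first,) + rest


def tables_with_margins(row_tot, col_tot):
    """Yield every table of counts with the given row and column totals."""
    if len(row_tot) == 1:
        yield [tuple(col_tot)]  # the last row is forced by the columns
        return
    for row in compositions(row_tot[0], col_tot):
        remaining = [c - x for c, x in zip(col_tot, row)]
        for rest in tables_with_margins(row_tot[1:], remaining):
            yield [row] + rest


def exact_agreement_p(table):
    """P(off-diagonal count <= observed) given the marginals."""
    k = len(table)
    row_tot = [sum(r) for r in table]
    col_tot = [sum(c) for c in zip(*table)]
    n = sum(row_tot)
    numerator = prod(factorial(t) for t in row_tot + col_tot)

    def off_diagonal(t):
        return sum(t[i][j] for i in range(k) for j in range(k) if i != j)

    observed = off_diagonal(table)
    p_value = 0.0
    for t in tables_with_margins(row_tot, col_tot):
        if off_diagonal(t) <= observed:
            cells = prod(factorial(x) for row in t for x in row)
            p_value += numerator / (factorial(n) * cells)
    return p_value
```

For the two by two table 3, 1 / 1, 3 the sketch cumulates the tables with 3 and with 4 cases in the first cell, giving 17/70, which is the single-sided Fisher p-value for that table.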

TOP of page

Agreement Models

Most models in tables are intended to test for statistical independence between the row and the column variable. They ask the question if knowledge of the row variable predicts someone's score on the column variable, and vice versa. For example, does someone's sex 'predict' his or her income, or does someone's schooling 'predict' his or her occupation? In these cases, independence would mean that sex does not predict income, and that schooling does not predict occupation.

However, one can also look at the pattern of the observations: the way they are distributed in a table. Examples of such models are migration or mobility models, in which we study the possible difference in the geographical or social location of individuals between two points in time, and agreement models, in which we compare the responses of different individuals on the same issue. All these models assume that observations are disproportionately often located on the main diagonal, which is equivalent to the hypothesis that people did not migrate, were not mobile, or agree. In other words, people tend to have the same score on two occasions and will be located on the main diagonal of the table. The main diagonal usually runs from the upper left, the 1,1 cell, to the lower right corner; in SISA tables it is colored yellow. The basic migration/mobility/agreement model tests the hypothesis (H1) that the observations are disproportionately present on the diagonal as opposed to the outer, non-diagonal cells. This hypothesis is tested against the (H0) hypothesis that the observations are randomly distributed.

Two procedures are implemented here. The first, an exact model, tests the likelihood that the observations are less frequently in the outer non-diagonal cells than can be expected on the basis of chance alone. The second procedure is Kappa, the measure most often used in agreement models. Kappa does a similar test and also gives a quantitative description of the strength of the relationship. These two measures are discussed in sections under separate headings.

It should be noted that the theory underlying the migration/agreement models is usually very powerful, and only rarely are the models not statistically significant. The analysis should therefore serve only as a starting-point for further study. Examples of appropriate questions are: do people move more often to the upper than to the lower triangle, or are people more likely to move to cells closer to the origin cells than to cells which are further away? Agresti (1990) and Fleiss (1981) discuss the issues and give some examples.

Agresti A. Categorical Data Analysis. New York & Chichester: John Wiley and Sons, 1990.

Fleiss JL. Statistical Methods for Rates and Proportions. New York & Chichester: John Wiley and Sons, 1981.

TOP of page

Copyright

Copyright: Quantitative Skills & Daan Uitenbroek PhD, 2000, 2005, 2008.

TOP of page

Warranty

Although this program has been tested extensively, no program is ever bug or error free, and you should always check your results carefully. This software is provided "as is" and without warranties as to performance or merchantability or fitness for a particular purpose. The entire risk arising out of use or performance of the software remains with you.

TOP of page

Download

Download the program here by double clicking this link and saving the program to a directory of your choice.

