Intra correlation/design effects in clustered proportional data

Beta Beta Beta. This is a beta version. This program requires further revision and validation.

Purpose of the program
Installing the program
How to input data
What you find in the output field
Intra correlation coefficient
Intra correlation in clustered studies
The design effect, design factor and the effective n
Intra correlation in agreement studies

Purpose. This page describes the free intra correlation computer program. The program calculates intra correlations and design effects for clustered samples where the outcome measure is the number of positive responses per cluster. Confidence intervals and other statistics corrected for design effects can be calculated. Two groups of clusters can be compared with a t-test procedure.


Installing the program. Double click the program name on the SISA or Quantitative Skills website, select save, and save the file to a directory of your choice on your computer. Run the program by double clicking the program name on your computer. To make a shortcut, right click the program name on your computer, create the shortcut, then drag or copy and paste the shortcut to another location. The program has no installer: what you download is the program itself. To remove the program, right click it and delete it. No uninstaller is provided or required, as no additional files are installed on your computer. What you see is what you get and no more.


Input. The program expects a column of positive response counts followed by a column of cluster sizes, one row for each cluster. The data can be typed directly into the input field or pasted into it from an external source.
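The two-column layout described above can be illustrated with a small parsing sketch. It is not the program's own code: the function name `parse_clusters`, the acceptance of commas as separators, and the validation rules are assumptions made for illustration, and the example rows are hypothetical rather than taken from this page.

```python
def parse_clusters(text):
    """Parse rows of 'positives cluster_size' into a list of (positives, size) tuples."""
    clusters = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        # Assumption: columns are separated by whitespace and/or commas.
        fields = line.replace(",", " ").split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 2:
            raise ValueError(f"line {line_no}: expected 2 columns, got {len(fields)}")
        positives, size = int(fields[0]), int(fields[1])
        if size <= 0 or not 0 <= positives <= size:
            raise ValueError(f"line {line_no}: need 0 <= positives <= cluster size")
        clusters.append((positives, size))
    return clusters


# Hypothetical input: one row per cluster, positives first, cluster size second.
example = """
12 100
 7  80
20 150
"""
print(parse_clusters(example))
```

Each tuple holds the number of positive responses and the size of one cluster, which is all the later calculations need.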

The table below concerns 6 clusters. Pressing calculate results in an intra correlation Rho of 0.0462 and a design effect DEFF of 19.9142.


























Additionally, an intra correlation value can be entered in the intra correlation box. This value will be considered in a number of calculations.


Output. The program outputs the number of positive responses, the total number considered, the intra correlation coefficient, the design effect (taking into account that the cluster sizes differ), and the confidence interval for the proportion of positive responses. If the “more output” option is activated, additional output is provided. If the “split sample” option is activated and the sample is split, a t-test comparing the proportions in successive sub samples is given.


Options. There are 4 options.

The split option splits the data after every n-th cluster, where n is specified by the user. For this option to work the user needs to specify an integer value that is larger than one and smaller than the total number of clusters. If the data are split, the proportion positive in each sub sample is compared with the proportion positive in the previous sub sample by way of a t-test that takes the design effects into account.

The decimals option sets how many decimals are shown in numbers in the output.

Change the confidence interval for all appropriate statistical procedures by choosing CI in the options menu and then ticking the level you want to select, or use the fill-in box under the other options. The default confidence level is 95%. Confidence levels smaller than one percent or greater than 99.9998 percent are not allowed.

Activate the “more output” option if you want more output.
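The split option described above can be sketched as follows. This is an illustration under stated assumptions, not the program's implementation: the function names are hypothetical, the example rows and the design effects of 1.5 are made up, and the test statistic shown (difference in proportions divided by a standard error inflated by each sub sample's design effect) is only one plausible reading of "a t-test considering the design effects".

```python
import math


def split_clusters(clusters, every):
    """Split a list of (positives, cluster_size) rows after every `every`-th cluster."""
    if not 1 < every < len(clusters):
        raise ValueError("split value must be larger than 1 and smaller than the number of clusters")
    return [clusters[i:i + every] for i in range(0, len(clusters), every)]


def proportion(sub_sample):
    """Return (proportion positive, total n) for one sub sample."""
    positives = sum(x for x, _ in sub_sample)
    total = sum(n for _, n in sub_sample)
    return positives / total, total


def adjusted_statistic(p1, n1, deff1, p2, n2, deff2):
    """Assumed form: difference in proportions over a design-effect-inflated standard error."""
    se = math.sqrt(deff1 * p1 * (1 - p1) / n1 + deff2 * p2 * (1 - p2) / n2)
    return (p2 - p1) / se


# Hypothetical rows (positives, cluster size); not the table from this page.
rows = [(12, 100), (7, 80), (20, 150), (9, 120), (15, 90), (4, 60)]
parts = split_clusters(rows, 2)  # three sub samples of two clusters each
(p1, n1), (p2, n2) = proportion(parts[0]), proportion(parts[1])
# Design effects of 1.5 are placeholders; in practice each sub sample has its own.
print(adjusted_statistic(p1, n1, 1.5, p2, n2, 1.5))
```

Because the variances are multiplied by the design effects, a larger design effect shrinks the statistic, which reflects the loss of information caused by clustering.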


Warranty. Although this program has been tested extensively, no program is ever bug or error free, and you should always check your results carefully. This software is provided "as is" and without warranties as to performance or merchantability or fitness for a particular purpose. The entire risk arising out of use or performance of the software remains with you. This program does not install any files on your computer or change settings without your permission.


License. This is free software. It can be freely distributed and installed. This program is provided to you by Quantitative Skills Research and Statistical Consultancy. Copyright: Quantitative Skills and Daan Uitenbroek PhD, 2008.


Intra correlation coefficient. The program does intra correlation calculations for dichotomous (binary, yes/no) outcome variables according to the method proposed by Fleiss. The intra correlation coefficient is used for a number of purposes, among them estimating the design effect in clustered samples and estimating kappa in agreement studies.
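As an illustration of how an intra correlation can be estimated from counts of positive responses per cluster, the sketch below uses the analysis-of-variance estimator that is commonly given for binary clustered data. Whether the program uses exactly this form of the Fleiss method is an assumption, and the function name `anova_icc` is hypothetical.

```python
def anova_icc(clusters):
    """ANOVA-type estimate of the intra correlation for binary outcomes.

    clusters: list of (positives, cluster_size) pairs, one per cluster.
    """
    k = len(clusters)
    total_n = sum(n for _, n in clusters)
    total_x = sum(x for x, _ in clusters)
    if k < 2 or total_n <= k:
        raise ValueError("need at least 2 clusters and more observations than clusters")

    sum_x2_over_n = sum(x * x / n for x, n in clusters)
    # Mean squares between and within clusters.
    bms = (sum_x2_over_n - total_x ** 2 / total_n) / (k - 1)
    wms = (total_x - sum_x2_over_n) / (total_n - k)
    # "Average" cluster size, adjusted for unequal cluster sizes.
    n0 = (total_n - sum(n * n for _, n in clusters) / total_n) / (k - 1)

    denominator = bms + (n0 - 1) * wms
    if denominator == 0:
        raise ValueError("intra correlation undefined: no variation in the outcome")
    return (bms - wms) / denominator


# Two hypothetical extremes: outcome fully determined by the cluster,
# and clusters with identical proportions.
print(anova_icc([(10, 10), (0, 10)]))
print(anova_icc([(5, 10), (5, 10)]))
```

In the first call every observation within a cluster is identical, so the estimate is 1. In the second call the clusters do not differ at all and the estimate falls slightly below 0, which this type of estimator allows.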


Intra correlation in clustered studies. The intra correlation is used as a measure of clustering in data collected using a multistage or multilevel sampling procedure. This is the case if, for example, a number of schools are selected randomly and in each school a number of pupils is sampled for study, while the analysis takes place at the pupil level. If there is intra correlation and it is ignored, the standard errors of statistics at the pupil level are underestimated, tests become statistically significant too easily, and confidence intervals are too narrow. The intra correlation coefficient indicates to what extent the observations at the individual level are influenced by clustering of observations in higher level groups, which is the case if, for example, some schools are particularly good, and others particularly bad, at the trait measured with the dependent variable. If the intra correlation coefficient equals 0 the schools do not differ with regard to the dependent variable: all the pupils could have been sampled from a single school and the result of the analysis would have been the same. The number of pupils is then the correct sample size. If the intra correlation coefficient equals 1 the schools are totally different and pupil performance is totally determined by the school; the effective sample size is the number of schools, not the number of pupils. The example in the table above concerns the proportion of pupils that passed a test. The data concern 6 schools with 1865 pupils, of whom 158 passed.


The design effect, design factor and the effective n. The intra correlation coefficient can be converted into the design effect. The design effect (DEFF) is the factor by which the variance of an estimated mean increases after considering intra correlation caused by a clustered design. DEFF can be translated into the effective n^, which is an estimate of the n after considering the extra variance caused by the clustering in the data. The effective n^ is the observed sample n divided by the design effect, n^=n/DEFF. Lastly, the design factor (DEFFT) is the factor by which the standard error of an estimated mean increases after considering the clustering in the data. The design factor is the square root of the design effect, DEFFT=√DEFF. The design factor minus one, times 100, (DEFFT-1)*100, is the percentage by which a confidence interval around a mean widens due to the clustering.
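The relations above can be checked with a few lines of code, using the DEFF of 19.9142 and the 158 positives out of 1865 from the example on this page. The function names are hypothetical. Two parts go beyond what this page states and are assumptions: `deff_from_rho` uses the standard formula for equal cluster sizes, DEFF = 1 + (m - 1) * rho, whereas the program adjusts for unequal cluster sizes in a way not described here; and the confidence interval uses a normal approximation, which may differ from the program's procedure.

```python
import math
from statistics import NormalDist


def deff_from_rho(rho, cluster_size):
    """Standard design effect for clusters of equal size m: 1 + (m - 1) * rho."""
    return 1 + (cluster_size - 1) * rho


def effective_n(n, deff):
    """Effective sample size: observed n divided by the design effect."""
    return n / deff


def design_factor(deff):
    """Design factor (DEFFT): square root of the design effect."""
    return math.sqrt(deff)


def ci_increase_percent(deff):
    """Percentage by which a confidence interval widens: (DEFFT - 1) * 100."""
    return (design_factor(deff) - 1) * 100


def adjusted_proportion_ci(positives, n, deff, level=0.95):
    """Normal-approximation CI for a proportion, widened by the design factor."""
    p = positives / n
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * design_factor(deff) * math.sqrt(p * (1 - p) / n)
    return max(0.0, p - half_width), min(1.0, p + half_width)


deff, n = 19.9142, 1865
print(round(effective_n(n, deff), 2))    # 93.65
print(round(design_factor(deff), 4))     # 4.4625
print(round(ci_increase_percent(deff), 1))
print(adjusted_proportion_ci(158, n, deff))
```

With this design effect the 1865 pupils carry roughly as much information as 94 independently sampled pupils, which is why the adjusted interval is several times wider than the unadjusted one.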


Intra correlation in agreement studies. The intra correlation is used as a one-to-one alternative to the kappa measure of agreement in the case of multiple raters or judges considering the presence or absence of a trait on a larger number of items (objects) or individuals (subjects). A high kappa means that there is a high level of agreement between the judges with regard to the presence of the trait. The data in the table can be seen as data on 1865 judgements, of which 158 were positive. The data were generated by 6 judges. For a further discussion of kappa please see the SISA-tables help file.



Fleiss JL. Statistical methods for rates and proportions, 2nd edition. New York [etc.]: John Wiley 1982.



Download the program here by double clicking this link and saving the program to a directory of your choice.
