Simple Interactive Statistical Analysis

Frequency Table

Explanation.

Frequency tabulates the data of a single variable, giving the frequency of each observed value. The command is comparable to SPSS's Frequencies command. Basic statistics such as the mean and standard deviation are given. Besides numerical analysis, the procedure also allows simple textual analysis.

Data can be pasted or typed into the input field in any format, such as a row or column of data or a paragraph of text. Separators between numbers or words are spaces, returns, semicolons, colons and tabs. The following options can be checked:
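The splitting and counting described above can be sketched in a few lines of Python. This is only an illustration of the idea, not SISA's own code; the names SEPARATORS, tokenize and frequency_table are made up for this example.

```python
import re
from collections import Counter

# Separators named in the text: spaces, returns, semicolons, colons and tabs.
SEPARATORS = re.compile(r"[ \r\n;:\t]+")

def tokenize(raw):
    """Split pasted input into tokens, dropping empty strings."""
    return [tok for tok in SEPARATORS.split(raw) if tok]

def frequency_table(tokens):
    """Return (value, count) pairs sorted by value."""
    return sorted(Counter(tokens).items())

raw = "red;blue red\tgreen:blue\nred"
tokens = tokenize(raw)
table = frequency_table(tokens)
for value, count in table:
    print(value, count)
```

Any mix of the listed separators gives the same tokens, which is why a row, a column or a paragraph of text can all be used as input.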

Include with data input.

Numbers. Non-numbers are not included in the analysis. Numbers are treated by value, thus 10 and 10.0 are the same, and 5 is smaller than 12. Additional separators are $ and #, so financial data is considered by value. Additional number-based statistics such as the mean and standard deviation are given.

Text. Numbers are not included in the analysis. Additional separators between letters and words are ",'.()[]{}?!

Both. Numbers, letters and words are all included in the analysis. Numbers are treated by name, thus 10 and 10.0 are in different categories and 5 is larger than 12. There are no additional separators, so a word in the middle of a sentence is classified differently from the same word at the end of a sentence, where the punctuation stays attached to it.
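The difference between treating numbers "by value" and "by name" can be illustrated with a small Python sketch. The helper names (to_number, by_value, by_name) are hypothetical, and the way SISA parses numbers internally is an assumption here.

```python
import math
import re
from collections import Counter

def to_number(tok):
    """Return tok as a finite float, or None if it is not a number."""
    try:
        value = float(tok)
    except ValueError:
        return None
    return value if math.isfinite(value) else None

def by_value(tokens):
    """'Numbers' mode: $ and # act as extra separators, non-numbers are dropped."""
    values = []
    for tok in tokens:
        for part in re.split(r"[$#]+", tok):
            number = to_number(part)
            if number is not None:
                values.append(number)
    return sorted(Counter(values).items())

def by_name(tokens):
    """'Both' mode: every token is a category and is sorted as text."""
    return sorted(Counter(tokens).items())

tokens = ["10", "10.0", "5", "12", "apple"]
print(by_value(tokens))  # 10 and 10.0 fall into one category, 5 sorts before 12
print(by_name(tokens))   # 10 and 10.0 stay separate, "5" sorts after "12"
```

Sorting as text compares character by character, which is why "5" ends up after "12" when numbers are treated by name.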

Options with data input.

Read weights treats every second value as the case weight of the value preceding it. The case weights must be numerical; if a weight is not numerical, the case (the weight together with its preceding value) is ignored. A weighted frequency table is produced, and in numerical analysis various weighted and weighting-corrected statistics are produced. For a discussion of data weighting and the correction applied please read this paper.
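A minimal Python sketch of reading value and weight pairs follows. It covers only the weighted frequency table and a weighted mean; the weighting correction mentioned above is not reproduced. The function names are hypothetical, and dropping a trailing value without a weight is an assumption of this sketch.

```python
from collections import defaultdict

def read_weighted(tokens):
    """Pair each value with the token after it, which is read as its weight."""
    pairs = []
    for value, weight in zip(tokens[0::2], tokens[1::2]):
        try:
            w = float(weight)
        except ValueError:
            continue  # non-numerical weight: ignore the whole case
        pairs.append((value, w))
    return pairs

def weighted_table(pairs):
    """Sum the weights per distinct value."""
    totals = defaultdict(float)
    for value, w in pairs:
        totals[value] += w
    return sorted(totals.items())

def weighted_mean(pairs):
    """Weighted mean; assumes every value is numerical."""
    total_weight = sum(w for _, w in pairs)
    return sum(float(v) * w for v, w in pairs) / total_weight

tokens = ["3", "2", "5", "1", "3", "0.5", "7", "x"]
pairs = read_weighted(tokens)
print(weighted_table(pairs))
print(weighted_mean(pairs))
```

In this example the case "7" with weight "x" is ignored because its weight is not numerical.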

Sort Descending. Sorts the values in descending order.

Lowercase All. Converts all letters to lowercase. Use this option if you want to categorize text data case-insensitively.

Show Rows. Limits the number of rows displayed. This is particularly relevant if you request a large Nice Table. It can also be used to exclude particularly high values, or particularly low values after "Sort Descending", such as missing value codes, from the analysis.

Solve problems into 99999.9. Changes the sequence carriage return, line feed, tab and the sequence tab, carriage return, line feed into 99999.9 if labels are read, or deletes the case if weights are read. This will mostly solve the problem of system missing values in data copied and pasted from SPSS. It might cause other problems.
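One possible reading of this option, for the labels case only, is sketched below in Python: an empty cell at the start or the end of a pasted row is filled with the placeholder so that the remaining values keep their positions. The function name fill_missing and the exact replacement are assumptions made for illustration; SISA may handle the sequences differently.

```python
def fill_missing(raw, placeholder="99999.9"):
    """Fill empty first or last cells of CRLF/tab-separated rows."""
    # CR LF TAB: the row starts with an empty cell.
    raw = raw.replace("\r\n\t", "\r\n" + placeholder + "\t")
    # TAB CR LF: the row ends with an empty cell.
    raw = raw.replace("\t\r\n", "\t" + placeholder + "\r\n")
    return raw

pasted = "a\t1\r\n\t2\r\nc\t\r\nd\t4"
fixed = fill_missing(pasted)
print(fixed.split())
```

Without the placeholder, the empty cells would disappear when the input is split on separators and every later value would shift by one position.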

Table input.

Table input is used to analyze a numerical frequency table where you have the values and the frequency or size of a number of groups but not the individual level data. For this procedure you must first define the number of rows in your table. A table filled with zeros, with the number of rows you requested, will be generated in a new webpage titled "Make Frequency Table", and you can fill your data into this table. You can also read the data from the optional input field. Data in this field consist of the group value and the frequency of the value in two columns of numerical data. Additionally, a group name or label can be given in a third column of text data preceding the two columns of numerical data. For this you need to check the "Read labels in 1st column" option. The data in the input field has to be separated by spaces, returns, semicolons, colons or tabs, but can NOT be separated by commas or full stops. The table specification takes precedence over the input field. Thus, if the table in the input field is too large it is truncated, and if it is too small additional zeros are added to the table generated for the next "Make Frequency Table" web page.
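The Python sketch below illustrates how such a table might be read from the input field, cut or padded to the requested number of rows, and summarized. The names parse_table and grouped_stats are hypothetical, and the use of the sample standard deviation (dividing by n - 1) is an assumption; the text does not say which formula SISA uses.

```python
import math
import re

def parse_table(raw, n_rows, labels=False):
    """Read (label, value, frequency) rows; truncate or zero-pad to n_rows."""
    tokens = [t for t in re.split(r"[ \r\n;:\t]+", raw) if t]
    width = 3 if labels else 2
    rows = []
    for i in range(0, len(tokens) - width + 1, width):
        chunk = tokens[i:i + width]
        label = chunk[0] if labels else ""
        rows.append((label, float(chunk[-2]), float(chunk[-1])))
    rows = rows[:n_rows]                              # too large: truncate
    rows += [("", 0.0, 0.0)] * (n_rows - len(rows))   # too small: add zeros
    return rows

def grouped_stats(rows):
    """Return n, mean and sample standard deviation of the grouped data."""
    n = sum(f for _, _, f in rows)
    mean = sum(v * f for _, v, f in rows) / n
    variance = sum(f * (v - mean) ** 2 for _, v, f in rows) / (n - 1)
    return n, mean, math.sqrt(variance)

raw = "low 1 2; mid 2 4; high 3 2"
rows = parse_table(raw, n_rows=4, labels=True)
n, mean, sd = grouped_stats(rows)
print(rows)
print(n, mean, sd)
```

Rows padded with zeros have a frequency of zero and therefore do not change the statistics.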

Box Plot.

A box plot option is given for numerical continuous quantitative outcome data, such as age or length. Box plots are used to study the distribution of the data. Are the data equally distributed around the mean? Are the means or standard deviations influenced by outliers? Box plots require ample observations with many different values. Box plots are not meant for categorical data such as ordered frequency tables or Likert scales. However, if there are many categories based on numerical values, box plots can be useful to explore data in ordered tables. It is not recommended to make a box plot if you have fewer than 15 categories.

SISA uses a standard box plot technique. The box runs from the first to the third quartile of the value-ordered data; the median, at 50%, is the middle line inside the box. The whiskers end at the most extreme valid values within 1.5 times the interquartile range downward from the first quartile and upward from the third quartile. Outliers are values more than 1.5 times the interquartile range away from the box. Outliers are starred. A star within the whiskers or the box itself is the mean; if there is no such star, the mean lies outside these values.
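The quantities behind such a box plot can be computed as in the Python sketch below. It uses the "inclusive" quartile method of Python's statistics module; which quartile definition SISA uses is not stated, so the exact quartile values are an assumption. The function name box_plot_stats is hypothetical.

```python
import statistics

def box_plot_stats(data):
    """Quartiles, whisker ends, outliers and mean for a box plot."""
    values = sorted(data)
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],    # most extreme value within the lower fence
        "whisker_high": inside[-1],  # most extreme value within the upper fence
        "outliers": [v for v in values if v < low_fence or v > high_fence],
        "mean": statistics.mean(values),
    }

stats = box_plot_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])
for name, value in stats.items():
    print(name, value)
```

In this example the value 30 lies more than 1.5 times the interquartile range above the third quartile and is reported as an outlier, while the mean still falls between the whiskers.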

Limitation.

The formatting and tabulating of large data sets might take a while, in which case there might be warnings. Just select "continue" and the computer will get there in the end.
