1.0 DATA GROUPING
1.1 Data classes
The initial step in preparing a table is data grouping. The objective of grouping is to summarize data for presentation
(parsimony) while preserving as complete a picture as possible. Some information is inevitably lost by grouping. A suitable
number of classes is 10-20. Using too few classes masks details of the distribution. Using too many classes defeats the objective of parsimony.
The desirable characteristics of classes are: mutual exclusiveness, equal width of class intervals, and coverage
of all the data. Class limits (also called class boundaries) are of 2 types: true and tabulated. True class limits are
more accurate and may carry decimals. They are however difficult to tabulate. True class limits should conform to the accuracy
of the data (decimal places and rounding off). Tabulated class limits are usually whole numbers and are an approximation. Each class
has a true upper class limit (UCL) and a true lower class limit (LCL). The class mid-points are used in drawing line graphs.
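The class characteristics described above can be illustrated with a minimal Python sketch. The function name make_classes and the scores are hypothetical and chosen only for illustration. Only 3 classes are used to keep the example short; real data would normally use 10-20.

```python
def make_classes(values, n_classes):
    """Group values into n_classes equal-width classes.

    Returns a list of (lower_limit, upper_limit, mid_point, frequency).
    """
    lowest, highest = min(values), max(values)
    width = (highest - lowest) / n_classes
    if width == 0:
        raise ValueError("all values are identical; nothing to group")

    counts = [0] * n_classes
    for v in values:
        # Each class is [lower, upper), so classes are mutually exclusive.
        # The maximum value is placed in the last class so all data are covered.
        index = min(int((v - lowest) / width), n_classes - 1)
        counts[index] += 1

    return [
        (lowest + i * width,          # true lower class limit (LCL)
         lowest + (i + 1) * width,    # true upper class limit (UCL)
         lowest + (i + 0.5) * width,  # class mid-point
         counts[i])
        for i in range(n_classes)
    ]


# Hypothetical scores, for illustration only.
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
classes = make_classes(scores, 3)
for lcl, ucl, mid, freq in classes:
    print(f"{lcl:5.1f} to {ucl:5.1f}  mid-point {mid:5.1f}  frequency {freq}")
```

The frequencies add up to the number of observations, which confirms that every value fell into exactly one class.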
1.2 Dichotomy/trichotomy
Data are sometimes grouped by dividing them into 2 groups (dichotomy), 3 groups (trichotomy), or many groups
(multichotomy).
1.3 Grouping errors
Grouping error is defined as information loss due to grouping. Grouped data give less detail than ungrouped data.
The bigger the class interval, the bigger the grouping error. The parsimony advantage of data grouping must be weighed
against the extent of grouping error. Computations on grouped data are usually based on the class mid-point. Grouping error becomes
serious when the scores within a class are not evenly distributed about the mid-point.
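As a rough illustration of grouping error, the sketch below compares the mean of raw scores with the mean computed from class mid-points. The data, the function name grouped_mean, and the class widths are hypothetical. The scores are deliberately bunched towards the lower end of each wide class, so the mid-point overstates them.

```python
def grouped_mean(values, start, width):
    """Mean computed from class mid-points, as is done with grouped data.

    Classes are [start, start + width), [start + width, start + 2*width), ...
    """
    counts = {}
    for v in values:
        index = int((v - start) // width)
        counts[index] = counts.get(index, 0) + 1
    total = sum(counts.values())
    # Every score in a class is replaced by that class's mid-point.
    return sum((start + (i + 0.5) * width) * n for i, n in counts.items()) / total


# Hypothetical scores, for illustration only.
raw = [10, 11, 12, 19, 20, 21, 22, 29]
true_mean = sum(raw) / len(raw)

wide_error = abs(grouped_mean(raw, start=10, width=10) - true_mean)
narrow_error = abs(grouped_mean(raw, start=10, width=2) - true_mean)

print("true mean:", true_mean)
print("error with class width 10:", wide_error)
print("error with class width 2:", narrow_error)
```

In this example the wider class interval gives the larger error, because more detail about where the scores lie within each class is discarded.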
2.0 DATA TABULATION
2.1 Objective of data tabulation
The objective of tabulation is to present and summarize a large amount of data in logical groupings for 2 or more variables.
It allows visual inspection of the data.
2.2 Type of information presented in tables
A table can show the following summaries of data: cell frequency (cell number), cell number as a percentage of
the overall total, cell number as a row percentage, cell number as a column percentage, cumulative frequency, cumulative frequency %,
relative (proportional) frequency, and relative frequency %.
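The summaries listed above can be computed as in the following Python sketch. The counts and all variable names are hypothetical and serve only to show how each quantity is derived.

```python
from itertools import accumulate

# Hypothetical 2 x 2 table of cell frequencies (rows x columns).
table = [[20, 30],
         [10, 40]]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
grand_total = sum(row_totals)

# Cell number as a percentage of the overall total.
overall_pct = [[100 * cell / grand_total for cell in row] for row in table]

# Cell number as a percentage of its row total.
row_pct = [[100 * cell / row_totals[i] for cell in row]
           for i, row in enumerate(table)]

# Cell number as a percentage of its column total.
col_pct = [[100 * cell / col_totals[j] for j, cell in enumerate(row)]
           for row in table]

# Hypothetical one-way frequency distribution over 3 ordered classes.
frequencies = [2, 5, 3]
n = sum(frequencies)
cumulative = list(accumulate(frequencies))           # cumulative frequency
cumulative_pct = [100 * c / n for c in cumulative]   # cumulative frequency %
relative = [f / n for f in frequencies]              # relative frequency
relative_pct = [100 * r for r in relative]           # relative frequency %

print("row %:", row_pct)
print("column %:", col_pct)
print("cumulative frequency:", cumulative)
```

Row percentages add up to 100 across each row and column percentages add up to 100 down each column, which is a quick check that the right total was used as the denominator.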
2.3 Characteristics of an ideal table
An ideal table is simple, easy to read, and correctly scaled. The layout of the table should make it easy to read
and understand the numerical information. The table must be able to stand on its own, i.e. be understandable without reference
to the text. The table must have a title/heading that indicates its contents. Labeling must be complete and accurate:
title, rows & columns, marginal & grand totals, as well as units of measurement. The field labels are in the margins
of the table. Numerical data are in the cells that make up the body of the table. Footnotes may be used to explain the table.
2.4 Configurations of tables
A contingency table can be presented in several configurations. The most common is the 2 x 2 contingency table. Other
configurations are the 2 x k table and the r x c table (r rows and c columns).
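A contingency table of any configuration can be built by cross-tabulating two categorical variables. The Python sketch below is illustrative only: the function name crosstab, the variable names exposure and disease, and the observations are hypothetical. Two variables with 2 levels each give a 2 x 2 table, and variables with more levels give a 2 x k or r x c table.

```python
from collections import Counter


def crosstab(row_values, col_values):
    """Cross-tabulate two categorical variables of equal length.

    Returns (row_levels, col_levels, table), where table[i][j] is the
    number of observations with row_levels[i] and col_levels[j].
    """
    counts = Counter(zip(row_values, col_values))
    row_levels = sorted(set(row_values))
    col_levels = sorted(set(col_values))
    # A Counter returns 0 for combinations that never occur.
    table = [[counts[(r, c)] for c in col_levels] for r in row_levels]
    return row_levels, col_levels, table


# Hypothetical paired observations, for illustration only.
exposure = ["yes", "yes", "no", "no", "yes"]
disease = ["yes", "no", "no", "no", "yes"]

row_levels, col_levels, table = crosstab(exposure, disease)
print("rows (exposure):", row_levels)
print("columns (disease):", col_levels)
for level, row in zip(row_levels, table):
    print(level, row)
```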