USING PERCENTAGES TO CALCULATE CHI-SQUARE TESTS OF INDEPENDENCE
Abstract
The commonly accepted assumptions for the Chi-square test of independence are that the sample is randomly selected, that observations are independent, and that each expected frequency is at least five. Some sources list as many as six assumptions, and whether or not they explicitly call it an assumption, there is a consensus that the cell data used to calculate Chi-square should not be percentages or any other type of transformed data. Standard practice is to convert percentages to frequencies, which can create problems because the resulting cell values may be impossible fractional counts. For instance, if converting percentage data to frequency data generated cell values of 28.2 or 28.9 people, no such thing as 0.2 or 0.9 of a person exists. Both cell values would be adjusted down to 28, and even though transformed data is frowned upon, rounding down produces an acceptable statistic because it generates a more conservative estimate than the mathematical practice of rounding to the nearest whole number. Standard mathematical rounding, to 28 and 29 respectively, could generate two different solutions, whereas rounding down generates the same solution for both 28.2 and 28.9. Numerical manipulations are ultimately at the discretion of the statistician. The current research used contrived data to illustrate when using percentages to calculate Chi-square tests of independence is acceptable and the cost of doing so.
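The rounding argument above can be made concrete with a small sketch. The code below is illustrative only and is not taken from the article: the function names `to_counts` and `chi_square`, the 2 x 2 percentage table, and the row totals of 100 are all hypothetical, chosen so that the 28.2 and 28.9 values from the abstract appear. It converts row percentages to counts, either by rounding down or by rounding to the nearest whole number, and then computes the Pearson Chi-square statistic from the resulting counts.

```python
import math


def to_counts(percent_table, row_totals, mode="floor"):
    """Convert row percentages to whole-number counts.

    mode="floor" rounds every cell down; any other value rounds half up
    to the nearest whole number. Both the function and its parameters
    are hypothetical helpers for this illustration.
    """
    counts = []
    for percents, total in zip(percent_table, row_totals):
        row = []
        for p in percents:
            raw = p * total / 100.0  # e.g. 28.2% of 100 people -> 28.2
            if mode == "floor":
                row.append(math.floor(raw))
            else:
                row.append(math.floor(raw + 0.5))
        counts.append(row)
    return counts


def chi_square(observed):
    """Pearson Chi-square statistic for a table of observed counts."""
    row_sums = [sum(row) for row in observed]
    col_sums = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_sums)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            expected = row_sums[i] * col_sums[j] / grand_total
            statistic += (obs - expected) ** 2 / expected
    return statistic


# Contrived percentages (an assumption of this sketch, not the article's data).
percent_table = [[28.2, 71.8], [28.9, 71.1]]
row_totals = [100, 100]

floored = to_counts(percent_table, row_totals, mode="floor")
nearest = to_counts(percent_table, row_totals, mode="nearest")

print(floored, chi_square(floored))
print(nearest, chi_square(nearest))
```

With these contrived inputs, rounding down maps both 28.2 and 28.9 to 28, so the two rows become identical and the statistic is zero, while rounding to the nearest whole number yields 28 and 29 and a small nonzero statistic. Rounding every cell down also shrinks each row total from 100 to 99 in this sketch, which is one way to see that the choice of rounding rule changes the data being tested.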
Recommended Citation
Thomas, Mark D. and Brown, Pamela P. (2023) "USING PERCENTAGES TO CALCULATE CHI-SQUARE TESTS OF INDEPENDENCE," Georgia Journal of Science, Vol. 81, No. 1, Article 144.
Available at: https://digitalcommons.gaacademy.org/gjs/vol81/iss1/144