Julian J. Meimban III
How to Cite:
Meimban III, J. J. (2024). Sensitivity of a 2 x 3 ANOVA on normality and homogeneity of variances assumptions. NEU Likha Journal: A Refereed Journal of the New Era University School of Graduate Studies, 1(1), 33-48. https://doi.org/10.64303/n3u-Lailj24Hk0un2o-r-Sea0fAnoah0vA
Abstract
This study compared the results of parametric and nonparametric tests as applied to a 2 x 3 ANOVA. The data were taken from NEU’s 2018 Science Fair Satisfaction Survey, with the mean satisfaction rating (AVG) as the dependent variable for the original (untransformed) data. The AVG for the six cells of the factorial design exhibited varying degrees of negative skewness (-0.562 to -1.293). Five data transformations (second to sixth power) were tried to satisfy the assumptions of normality (skewness close to zero) and homogeneity of variances (by Levene’s test). The fourth power transformation produced the data set that was closest to normality and simultaneously satisfied homogeneity of variances. Alternatively, a nonparametric test was employed. The 2 x 3 ANOVA of untransformed and transformed data gave the same results as the Kruskal Wallis test in terms of the significance of the main effects. ANOVA of untransformed and transformed data gave the same result in the test of significance of the interaction effect, but the results of the post hoc tests differed because of the effect of transformation on the homogeneity of variances. Transformation also affected power and effect size, both of which tended to decrease as the power of the transformation increased.
Keywords: Data Transformation, Skewness, Kruskal Wallis test, Homogeneity of Variances, Power, Effect Size
Introduction
Before applying statistical techniques such as the one-sample t-test and ANOVA, the general rule is to check whether the data satisfy the underlying assumptions, one of which is normality. If the distribution of the data is not normal, the data may be transformed before applying parametric techniques. Otherwise, nonparametric techniques can be used as a remedial option. However, some studies have used parametric tests without testing the assumption of normality. The justification offered is that tests like the one-sample t-test and ANOVA are robust and can be applied to a non-normal sample if the sample size is greater than 30. This study explored the effect of a violation of normality by comparing the results of a 2 x 3 ANOVA on untransformed and transformed actual survey data, and also with the results of nonparametric tests on the untransformed data.
Research Questions
What is the effect of a violation of normality on the results of a 2 x 3 ANOVA? Specifically, are there differences among the results of this parametric test using untransformed data, the same test using transformed data, and the corresponding nonparametric tests?
Method
This study utilized the results of the satisfaction survey of NEU’s 2018 Science Fair. There were 151 respondents, classified into three groups: students – 50, faculty – 51, and exhibitors – 50. Of the 151 respondents, 74 were male and 77 were female. The questionnaire consisted of questions about (1) venue/place (VENUE), (2) inventiveness (INVENT), (3) commercial viability (IMPACT), (4) educational value (EDUC), (5) audiovisual presentation (PRESENT), (6) organization (ORG), and (7) overall experience/satisfaction (OVERALL). Respondents were asked to answer each question using a 5-point Likert scale, where: 5 – Excellent, 4 – Good, 3 – Satisfactory, 2 – Poor, and 1 – Very Poor. Given the high estimate of internal consistency and reliability (Cronbach’s Alpha = 0.898) and the inter-correlation of items (0.24 to 0.71), a summated score (SCORE) was used as an index of the overall rating and was computed as follows: SCORE = VENUE + INVENT + IMPACT + EDUC + PRESENT + ORG + OVERALL. The average SCORE, denoted by AVG, was used in this study as the dependent variable.
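As a minimal sketch of the scoring described above, the following Python computes SCORE, AVG, and Cronbach's alpha for a small made-up response matrix (not the survey data). The function names are illustrative, and AVG is assumed here to be SCORE divided by the seven items, which the text does not state explicitly.

```python
from statistics import variance

ITEMS = ["VENUE", "INVENT", "IMPACT", "EDUC", "PRESENT", "ORG", "OVERALL"]

def summated_score(response):
    """SCORE = sum of the seven item ratings for one respondent."""
    return sum(response[item] for item in ITEMS)

def cronbach_alpha(responses):
    """alpha = k/(k-1) * (1 - sum of item variances / variance of SCORE)."""
    k = len(ITEMS)
    item_vars = [variance([r[item] for r in responses]) for item in ITEMS]
    total_var = variance([summated_score(r) for r in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical ratings on the 5-point scale (illustration only).
rows = [
    [5, 4, 4, 5, 4, 5, 5],
    [4, 4, 3, 4, 4, 4, 4],
    [3, 3, 3, 4, 3, 3, 3],
    [5, 5, 4, 5, 5, 5, 5],
    [2, 3, 2, 3, 2, 3, 2],
]
responses = [dict(zip(ITEMS, row)) for row in rows]

scores = [summated_score(r) for r in responses]
avg = [s / len(ITEMS) for s in scores]  # assumed definition of AVG
alpha = cronbach_alpha(responses)
print(scores, [round(a, 2) for a in avg], round(alpha, 3))
```

The sample variance (n - 1 denominator) is used for both the items and the total; alpha is unchanged if the population variance is used consistently instead.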
The distribution of the average SCORE (AVG) overall and of AVG for female students did not satisfy the assumption of normality, as indicated by skewness values below -1 (Table 1). Different transformations were considered in this study, and the fifth power of AVG had the skewness closest to zero (Table 2). Nonetheless, the transformations from the second power to the sixth power were all used in this study. The untransformed and transformed data were analyzed using factorial ANOVA to test a set of null hypotheses (Table 3). Results from these tests were compared with the results of the nonparametric test in order to determine the effect of violations of normality.
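The screening of power transformations by skewness can be sketched as follows. This is an illustration under stated assumptions: the ratings are made up, and the skewness formula is the simple moment coefficient, which may differ slightly from the bias-adjusted coefficient reported by statistical packages.

```python
def skewness(values):
    """Moment coefficient of skewness g1 = m3 / m2**1.5 (population form)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

def power_transform(values, p):
    """Raise every observation to the power p (p = 1 leaves data unchanged)."""
    return [v ** p for v in values]

# Hypothetical negatively skewed AVG values (illustration only).
avg = [5.0, 4.9, 4.7, 4.6, 4.4, 4.3, 4.0, 3.6, 3.0, 2.0]

# Skewness of AVG raised to the first through sixth power.
skew_by_power = {p: skewness(power_transform(avg, p)) for p in range(1, 7)}

# The power whose skewness is closest to zero.
best_power = min(skew_by_power, key=lambda p: abs(skew_by_power[p]))

for p, g1 in skew_by_power.items():
    print(p, round(g1, 3))
```

Skewness alone does not establish normality, so in practice a candidate transformation would also be checked against the homogeneity-of-variances assumption, as was done in this study with Levene's test.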




Results and Discussion
ANOVA Test (Untransformed vs Transformed) vs Kruskal Wallis Test
Results of the tests for main and interaction effects were the same for transformed and untransformed data. The interaction of GROUP and GENDER was not significant, with p-values ranging from .413 to .619 (Table 4). For the main effects, GROUP was highly significant while GENDER was not significant (Table 4). These results agreed with those of the Kruskal Wallis Test (Table 5). However, observed power and effect size decreased as the power of the transformation applied to the dependent variable increased. The observed power and effect size of ANOVA using untransformed data for the main effect GROUP were .98 and 0.35, respectively, whereas with transformed data observed power ranged from .92 to .98 and effect size ranged from 0.30 to 0.34.
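For readers unfamiliar with the nonparametric counterpart, the Kruskal Wallis H statistic for a three-level factor such as GROUP can be sketched in plain Python. The data below are hypothetical, the function names are illustrative, and the p-value helper relies on the chi-square approximation with 2 degrees of freedom, so it applies only to three groups.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal Wallis H statistic."""
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12 / (n * (n + 1)) * total - 3 * (n + 1)
    # Tie correction: divide by 1 - sum(t^3 - t) / (n^3 - n).
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    correction = 1 - sum(t ** 3 - t for t in counts.values()) / (n ** 3 - n)
    return h / correction

def p_value_df2(h):
    """Chi-square upper-tail probability for df = 2 (three groups only)."""
    return math.exp(-h / 2)

# Hypothetical AVG values per group (illustration only).
students = [4.0, 3.6, 4.1, 3.9, 4.3]
faculty = [4.9, 4.7, 5.0, 4.6, 4.9]
exhibitors = [4.1, 4.4, 3.7, 4.3, 4.0]

h = kruskal_wallis_h([students, faculty, exhibitors])
print(round(h, 3), round(p_value_df2(h), 4))
```

A two-level factor such as GENDER has one degree of freedom, so its p-value would need a general chi-square distribution function from a statistics library rather than the closed form used here.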



Plot of Marginal Means
Further analysis of the plot of marginal means showed that the line for MALE intersects the line for FEMALE between the STUDENT and FAC groups (Exhibit 1). This was true for all data transformations, which produced similar plots of marginal means. Hence, there could be a significant GROUP x GENDER interaction that was not detected by the ANOVA test. To check this, a variable cellcode was introduced to denote the combinations of GROUP and GENDER: 1 – STUDENT, MALE; 2 – STUDENT, FEMALE; 3 – EXHIBITOR, MALE; 4 – EXHIBITOR, FEMALE; 5 – FACULTY, MALE; 6 – FACULTY, FEMALE. An ANOVA was run with cellcode as the factor, and cellcode was found significant for both untransformed and transformed data (Table 6).
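The cellcode approach can be sketched as below: each GROUP and GENDER combination is mapped to the codes listed above, and a one-way F statistic is computed across the six cells. The records are hypothetical and the helper name is illustrative; the p-value is not computed, since it would require an F distribution function from a statistics library.

```python
from statistics import mean

# Cell codes as defined in the text.
CELLCODE = {
    ("STUDENT", "MALE"): 1, ("STUDENT", "FEMALE"): 2,
    ("EXHIBITOR", "MALE"): 3, ("EXHIBITOR", "FEMALE"): 4,
    ("FACULTY", "MALE"): 5, ("FACULTY", "FEMALE"): 6,
}

def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    f = (ss_between / (k - 1)) / (ss_within / (n - k))
    return f, k - 1, n - k

# Hypothetical records (GROUP, GENDER, AVG); not the survey data.
records = [
    ("STUDENT", "MALE", 3.9), ("STUDENT", "MALE", 4.1),
    ("STUDENT", "FEMALE", 4.0), ("STUDENT", "FEMALE", 4.4),
    ("EXHIBITOR", "MALE", 4.2), ("EXHIBITOR", "MALE", 4.5),
    ("EXHIBITOR", "FEMALE", 3.8), ("EXHIBITOR", "FEMALE", 4.1),
    ("FACULTY", "MALE", 4.8), ("FACULTY", "MALE", 4.9),
    ("FACULTY", "FEMALE", 4.6), ("FACULTY", "FEMALE", 4.9),
]

# Group the AVG values by cell code.
by_cell = {}
for group, gender, avg in records:
    by_cell.setdefault(CELLCODE[(group, gender)], []).append(avg)

f_stat, df_between, df_within = one_way_f([by_cell[c] for c in sorted(by_cell)])
print(round(f_stat, 3), df_between, df_within)
```

Treating the six cells as one factor pools the two main effects and the interaction into a single between-cells test, so a significant cellcode effect does not by itself isolate the interaction.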








Post Hoc Test of ANOVA for Untransformed and Transformed versus Mann-Whitney U Test
The square transformation failed to homogenize group variances. On the other hand, the cube through sixth power transformations equalized error variances (Table 6). Where the equal-variances assumption was met, Tukey HSD was used for the post hoc test; otherwise, Games-Howell was employed instead. Games-Howell detected the same pairs of cellcodes with significant differences for AVG and sqAVG (male faculty and male students, male faculty and female exhibitors, and female faculty and female exhibitors). Tukey HSD was more conservative and did not include the pair of female faculty and female exhibitors as significantly different. For sixth_AVG, Tukey HSD detected only one pair (male faculty and male students) with a significant difference (Table 7). For the interpretation of the results of the Mann-Whitney U Test, α was adjusted from .05 to .003 (.05/15) to compensate for the 15 comparisons made. Because of this adjustment, only the pair of male faculty and female exhibitors was found significantly different (Table 8).
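The adjusted significance level follows from a Bonferroni-style division of α by the number of pairwise comparisons among the six cells, which can be checked with a short sketch (variable names are illustrative):

```python
from itertools import combinations

cellcodes = [1, 2, 3, 4, 5, 6]

# Every unordered pair of cells is one Mann-Whitney U comparison.
pairs = list(combinations(cellcodes, 2))

alpha = 0.05
alpha_adjusted = alpha / len(pairs)

print(len(pairs), round(alpha_adjusted, 3))
```

With six cells there are 6 × 5 / 2 = 15 pairs, so the per-comparison level is .05 / 15, or about .003.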


References
Bohrnstedt, G. W., & Knoke, D. (1988). Statistics for social data analysis (2nd ed.). F. E. Peacock Publishers, Inc.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Lawrence Erlbaum Associates, Inc.
Salkind, N. J. (2000). Exploring research (4th ed.). Prentice-Hall, Inc.

