Jachelle Anne D. Terrago
Julian J. Meimban III
Abstract
Satisfaction survey data from a University’s Science Fair were analyzed using nonparametric Mixed ANOVA (with aligned rank transformation). The dependent variable was the satisfaction score and the independent variables (between-subjects factors) were GENDER (male, female) and GROUP (student, exhibitor, and faculty). The variable ITEM (with seven items) was the within-subjects factor. A seven-item, 1-5 Likert scale (1-Very Poor, 2-Poor, 3-Satisfactory, 4-Good, 5-Excellent) questionnaire was administered to 162 respondents, randomly selected through proportionate stratified random sampling, and yielded an 84.57% response rate. The seven items rated were Venue/place (VENUE), Inventiveness (INVENT), Commercial Viability (IMPACT), Educational Value (EDUC), Audio-Visual Presentation (PRESENT), Organization (ORGANIZE), and Overall Experience/Satisfaction (OVERALL). The instrument met validity and reliability standards (Cronbach’s Alpha = 0.92). Aligned rank transformation was first applied to the data using ARTool v2.1.2 by Elkin et al. (2021) before Mixed ANOVA and post hoc analysis were performed using SPSS v27. Results revealed that only the first order interaction of GENDER and ITEM was statistically significant, F(5, 667) = 2.816, p = .016 (with Greenhouse-Geisser correction), small effect size (η2 = 0.02). Post hoc analysis using dependent t test with Bonferroni correction showed that male and female attendees differed in which items they rated less favorably.
Keywords: Nonparametric Mixed ANOVA, Aligned Rank Transformation, ARTool with Contrast
Introduction
This paper illustrates the use of mixed ANOVA (between-subject and
within-subject design) test as an option for analyzing satisfaction rating data
since each respondent (who belonged to one of the three mutually exclusive
groups and was classified as either male or female) rated each of the seven
survey items. A rating data set gathered from an actual satisfaction survey of a
University’s Annual Science Fair was used for this purpose. After the data was
found to have considerably violated the key assumptions of parametric mixed
ANOVA, the nonparametric mixed ANOVA procedure developed by Wobbrock et al. (2011)
was used.
Research Questions
The current analysis addresses the following research questions and
hypotheses.
Research Question 1: Do students, exhibitors, and faculty members rate
the survey items differently?
Ho: There is no significant difference in the satisfaction ratings
among groups.
Research Question 2: Do male and female attendees rate the survey
items differently?
Ho: There is no significant difference in the satisfaction ratings between
male and female attendees.
Research Question 3: Is there a significant GROUP, GENDER, and
ITEM interaction?
Ho: The interaction of the independent variables is not significant.
All alternative hypotheses (Ha) are non-directional.
Method
Sample Size
The sample size of 159 was initially calculated based on a 2 x 3 ANOVA design involving three groups, two genders, .80 power, .25 effect size (medium) for the main effects and interaction effect, and .05 significance level (Meimban et al., 2018). For the purpose of this study, which is based on a mixed ANOVA design involving three groups and seven repeated measures, the sample size of 159 has an achieved power of 96.7% in detecting main effects of medium effect size (f = .25) at the .05 level of significance.
Description of the Sample
Out of the 162 survey questionnaires distributed, 137 forms were returned (84.57% response rate). Students registered the highest response rate (94.44%); exhibitors and faculty registered 92.59% and 66.67%, respectively. Male faculty had the lowest response rate (51.85%) (Table 1).

Survey Questionnaire and Distribution
A 7-item questionnaire was developed to determine the views and assessment of attendees on the Science Fair. The instrument covered (1) venue/place (VENUE), (2) inventiveness (INVENT), (3) commercial viability (IMPACT), (4) educational value (EDUC), (5) audio-visual presentation (PRESENT), (6) organization (ORGANIZE), and (7) overall experience/satisfaction (OVERALL). Respondents were asked to rate each item using a 5-point Likert scale, where: 5 – Excellent, 4 – Good, 3 – Satisfactory, 2 – Poor, and 1 – Very Poor.
Six student-volunteers were trained and tasked to administer the questionnaire. Two student-volunteers were assigned to survey each group: one surveyed male respondents and the other surveyed female respondents. Surveyors were instructed to select respondents randomly from among those who had gone to the Science Fair. They were further instructed not to choose friends or relatives and to tell respondents to fill out the questionnaire independently and truthfully.
Validity and Reliability
Before it was used, the questionnaire underwent validity and reliability
testing. For content validity, the questionnaire was reviewed and revised
by three test construction professionals. For reliability, the instrument’s
internal consistency was estimated using Cronbach’s Alpha. It
yielded .92, well above the .70 recommended minimum acceptable
value for responses to be considered reliable or consistent (Bohrnstedt and
Knoke, 1988). Inter-item correlations ranged from 0.48 to 0.78,
indicating moderate to strong relationships (Salkind, 2000).
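The internal consistency estimate described above can be illustrated with a minimal sketch of the standard Cronbach's Alpha formula, alpha = k/(k - 1) x (1 - sum of item variances / variance of total scores). The function name cronbach_alpha and the ratings below are hypothetical and are not the survey data; the study itself reports only the resulting value of .92.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(responses[0])
    # Sample variance of each item (column) across respondents.
    item_variances = [variance([row[i] for row in responses]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical ratings from five respondents on three items (NOT the survey data).
ratings = [
    [5, 4, 5],
    [4, 4, 4],
    [3, 3, 4],
    [5, 5, 5],
    [2, 3, 3],
]
alpha = cronbach_alpha(ratings)
print(round(alpha, 2))
```

Values closer to 1 indicate that the items vary together, which is the sense in which the .70 threshold cited above is applied.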
Data Analysis
Normality and homogeneity of variance assumptions were checked and found to be markedly violated. Only the ratings of male faculty members for IMPACT were normally distributed (Table 2). The variances of the ratings for VENUE, PRESENT, and OVERALL were not equal across the combinations of GROUP and GENDER (Table 3). Moreover, the assumption of equality of variance-covariance matrices [Box’s M: F(140, 18424.16) = 1.471, p < .001] and the sphericity assumption [Mauchly’s W = 0.511, Approx. Chi-Square (20) = 86.36, p < .001] were also severely violated. Considering these findings, the data was analyzed using the nonparametric ANOVA procedure of Wobbrock et al. (2011). First, aligned rank transformation was applied to the data using the program ARTool v2.1.2, which was developed by Wobbrock et al. (2011) and enhanced for contrast and post hoc tests by Elkin et al. (2021). The data were aligned for each effect (main and interaction): each effect was estimated from marginal means, and all other effects were then stripped from the dependent variable (Wobbrock et al., 2011). The aligned values were then ranked.
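The align-then-rank idea can be sketched in a few lines. The example below is a simplified illustration for a two-factor interaction only (GENDER x ITEM); it ignores GROUP and the repeated-measures structure, and it is not the ARTool implementation. The function names and the ratings are hypothetical. Each response is first reduced to its residual (response minus cell mean), the estimated interaction effect is added back, and the aligned values are ranked with average ranks for ties.

```python
from collections import defaultdict
from statistics import mean


def average_ranks(values):
    """Rank values in ascending order; tied values receive their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        midrank = (start + end) / 2 + 1  # ranks are 1-based
        for pos in range(start, end + 1):
            ranks[order[pos]] = midrank
        start = end + 1
    return ranks


def align_and_rank_interaction(rows):
    """rows: list of (a_level, b_level, y). Returns (aligned, ranks) for A x B."""
    grand = mean(y for _, _, y in rows)
    by_a, by_b, by_cell = defaultdict(list), defaultdict(list), defaultdict(list)
    for a, b, y in rows:
        by_a[a].append(y)
        by_b[b].append(y)
        by_cell[(a, b)].append(y)
    mean_a = {k: mean(v) for k, v in by_a.items()}
    mean_b = {k: mean(v) for k, v in by_b.items()}
    mean_cell = {k: mean(v) for k, v in by_cell.items()}

    aligned = []
    for a, b, y in rows:
        residual = y - mean_cell[(a, b)]  # strip all effects
        effect = mean_cell[(a, b)] - mean_a[a] - mean_b[b] + grand  # A x B effect
        # Rounding keeps floating-point noise from breaking ties (a design choice here).
        aligned.append(round(residual + effect, 9))
    return aligned, average_ranks(aligned)


# Hypothetical ratings (NOT the survey data).
rows = [
    ("male", "VENUE", 5), ("male", "INVENT", 3),
    ("male", "VENUE", 4), ("male", "INVENT", 4),
    ("female", "VENUE", 4), ("female", "INVENT", 5),
    ("female", "VENUE", 3), ("female", "INVENT", 4),
]
aligned, ranks = align_and_rank_interaction(rows)
print(aligned)
print(ranks)
```

A useful sanity check on any alignment is that the aligned values for an effect sum to zero, which holds for this sketch.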
ARTool generated separate values of the response variable (Y) for each effect presented in the research questions: (1) aligned and ranked Y for the first order interaction of GROUP and ITEM; (2) aligned and ranked Y for the first order interaction of GENDER and ITEM; and (3) aligned and ranked Y for the second order interaction of GROUP, GENDER, and ITEM.
The significance of the second order interaction effect was tested first by running the mixed ANOVA in SPSS v27, with the aligned and ranked Y for the second order interaction of GROUP, GENDER, and ITEM as the dependent variable and GROUP and GENDER as the independent (between-subjects) variables. Only the result for the second order interaction was interpreted; the other effects in the output were ignored. The same process was repeated in testing the significance of the first order interactions. Thus, mixed ANOVA was run three times, with different values for the dependent variable each time, to complete the analysis. The SPSS outputs were consolidated and summarized in Table 4. Post hoc analysis was done using dependent t test with Bonferroni correction.
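As an illustration of the post hoc step, the sketch below computes a dependent (paired) t statistic and a Bonferroni-adjusted significance level. It assumes that the correction was applied across all 21 pairwise comparisons of the seven items; the text does not state the number of comparisons used, so this is an assumption. The function name paired_t and the scores are hypothetical, and the actual analysis was run in SPSS.

```python
from math import comb, sqrt
from statistics import mean, stdev


def paired_t(x, y):
    """Return (t, df, d) for paired scores; d is mean difference / SD of differences."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    sd = stdev(diffs)
    t = mean(diffs) / (sd / sqrt(n))
    d = mean(diffs) / sd
    return t, n - 1, d


# Assumption: all pairwise comparisons among the seven items are tested.
n_items = 7
n_pairs = comb(n_items, 2)
bonferroni_alpha = 0.05 / n_pairs

# Hypothetical paired scores for two items from five respondents (NOT the survey data).
item_a = [3, 4, 2, 5, 4]
item_b = [4, 5, 4, 5, 5]
t, df, d = paired_t(item_a, item_b)
print(n_pairs, round(bonferroni_alpha, 5), round(t, 2), df)
```

Under this assumption, a pairwise difference would be declared significant only if its p value falls below the adjusted level rather than .05.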


Results and Discussion
The result of the nonparametric Mixed ANOVA, with Greenhouse-Geisser correction, showed that there was no significant second order interaction, F = 1.42, p = .169. This means that when the satisfaction rating per item was analyzed at the “cell” level, that is, by combination of GENDER and GROUP, no significantly different patterns were found. In other words, when attendees were classified according to both GENDER and GROUP, there were no significant differences in their satisfaction ratings for the survey items. Thus, it can be considered safe to examine the first order interactions and look at the pattern of differences in the ratings of GROUP and GENDER separately.
There was a significant first order interaction between GENDER and ITEM, F = 2.76, p = .018 (with Greenhouse-Geisser correction), small effect size (η2p = 0.02), observed power = .83. Post hoc analysis using dependent t test revealed that male and female attendees differed in which items they rated less favorably. Male attendees appeared to have rated INVENT less favorably than VENUE [t(66) = -3.49, p < .001, small effect size (Cohen’s d = 0.43)] and EDUC [t(66) = -3.70, p < .001, small effect size (Cohen’s d = 0.45)]. On the other hand, female attendees seemed to have rated ORGANIZE and OVERALL less favorably than EDUC (Table 5). Other pairwise comparisons of ITEM were not significant. Both male and female attendees indicated greater satisfaction for items VENUE and EDUC.
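The reported effect sizes can be checked against the reported t statistics. The sketch below assumes that Cohen's d for the dependent t test was computed as the mean difference divided by the standard deviation of the differences, in which case d = |t| / sqrt(n) with n = df + 1. This formula is an assumption about how d was obtained; the function name is hypothetical.

```python
from math import sqrt


def d_from_paired_t(t, df):
    """Cohen's d for paired data recovered from t, assuming d = |t| / sqrt(n)."""
    n = df + 1
    return abs(t) / sqrt(n)


# t statistics reported in the text for male attendees (df = 66, so n = 67).
d_invent_vs_venue = d_from_paired_t(-3.49, 66)
d_invent_vs_educ = d_from_paired_t(-3.70, 66)
print(round(d_invent_vs_venue, 2), round(d_invent_vs_educ, 2))  # prints 0.43 0.45
```

Under this assumption the recomputed values agree with the reported Cohen's d of 0.43 and 0.45.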
There was no significant first order interaction between GROUP and ITEM, F = 0.94, p = .49 (with Greenhouse-Geisser correction). Based on the descriptive statistics, on average, STUDENT, EXHIBITOR, and FACULTY appeared to have expressed greater satisfaction for items EDUC and VENUE and lesser satisfaction for item ORGANIZE.


References
Bohrnstedt, G. W., & Knoke, D. (1988). Statistics for social data analysis (2nd ed.). F. E. Peacock Publishers, Inc.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Lawrence Erlbaum Associates, Inc.
Elkin, L. A., Kay, M., Higgins, J. J., & Wobbrock, J. O. (2021). An aligned rank transform procedure for multifactor contrast tests. In The 34th Annual ACM Symposium on User Interface Software and Technology. Virtual Event, USA: Association for Computing Machinery. doi:10.1145/3472749.3474784
Faul, F., Erdfelder, E., Lang, A.-G., & Buchner, A. (2007). G*Power 3: A flexible statistical power analysis program for the social, behavioral, and biomedical sciences. Behavior Research Methods, 39, 175-191.
Salkind, N. J. (2000). Exploring research (4th ed.). Prentice-Hall, Inc.
Wobbrock, J. O., Findlater, L., Gergle, D., & Higgins, J. J. (2011). The aligned rank transform for nonparametric factorial analyses using only ANOVA procedures. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (pp. 143–146). New York, NY, USA: Association for Computing Machinery. doi:10.1145/1978942.1978963
