Julian J. Meimban III
Abstract
Satisfaction ratings of a University’s Commencement Exercises were analyzed as a within-subject factor design using nonparametric statistics after the assumptions of parametric statistics were found to be markedly violated. Each of the 275 randomly selected respondents rated each of the five items of a survey questionnaire on a 1 to 5 Likert scale (1 = Very Poor, 2 = Poor, 3 = Satisfactory, 4 = Good, 5 = Excellent). The five items pertained to accessibility/cost (ACCESS), staff performance (AIDES), orderliness (ORDER), formality (FORMAL), and discipline/conduct (CONDUC). The survey instrument passed a content validity review and met the reliability standard (Cronbach’s Alpha = 0.84). The response rate was 84.7%. The Friedman test showed statistically significant differences in the mean ranks of the five items rated (p < .001). Follow-up Wilcoxon signed ranks tests with Bonferroni correction revealed statistically significant pairwise mean rank differences. For instance, participants rated AIDES more favorably than ACCESS, but rated CONDUC less favorably than FORMAL. However, their ratings did not differ significantly for the following pairs: ORDER and AIDES, CONDUC and AIDES, and CONDUC and ORDER.
Keywords: satisfaction rating, nonparametric statistics, repeated-measures between-factors ANOVA, Friedman test, Bonferroni correction, Mauchly’s test, sphericity, Wilcoxon signed ranks test, effect size
Proponents and organizers of events such as Commencement Exercises are primarily interested in knowing the satisfaction ratings of attendees and participants. Ratings are feedback from participants, and organizers use this feedback to improve the execution of future events.
Varied forms of feedback gathering and satisfaction assessment are readily available. “Direct-ambush” interviews, drop boxes for comments and suggestions, and survey questionnaires are all in common use. Researchers also commonly employ parametric tests to analyze survey data. However, parametric tests rest on specific statistical assumptions, so their results are credible only when the key assumptions of the test are met. Otherwise, the interpretation of results, the discussion, and any subsequent recommendations should be viewed with suspicion.
This study used a survey questionnaire to obtain satisfaction ratings from participants in a University’s Commencement Exercises. The Commencement Exercises served only as a case to show how a nonparametric test can be used to analyze satisfaction rating data that do not satisfy parametric assumptions.
Research Question and Null Hypothesis
Are there differences in the satisfaction ratings among the five Likert items at α = .05? H0: Satisfaction ratings among the five Likert items are equal.
Method
Sample Size
Sample size was calculated a priori for a repeated-measures, between-factors ANOVA involving 11 groups, 5 measurements, 80% power, a medium effect size (ES) of 0.25, and a 5% significance level. For this specification, a total sample size of 275 (25 cases per group) was needed, as recommended by Cohen (1988) and cited in the G*Power 3.1.9.2 program (March 2017 version).
Survey Questionnaire and Distribution
The questionnaire asked about (1) accessibility of the venue (ACCESS), (2) performance of graduation staff (AIDES), (3) orderliness of the graduation procedure (ORDER), (4) formality/solemnity of the ceremony (FORMAL), (5) discipline/conduct of graduating students (CONDUC), and (6) recommendation (RECOMD); the analysis reported here covers the first five items. The items were rated using a 5-point Likert scale, where 5 = Excellent, 4 = Good, 3 = Satisfactory, 2 = Poor, and 1 = Very Poor.
Respondents were randomly selected from the list of graduating students provided by the University Registrar’s Office, using the random selection of cases command in SPSS v23. Eleven student-volunteers were trained and tasked to distribute the questionnaire to graduating students. Surveyors were instructed to hand out the questionnaire only after the graduating students had gone on stage and returned to their assigned seats. Respondents were told not to confer or communicate with anyone by any means while answering the questionnaire. To ensure strict compliance with the survey instructions, four assistant researchers were also tasked to supervise the distribution and to assist respondents in filling out the questionnaires.
Validity and Reliability
Before the questionnaire was distributed, it was tested for validity and reliability. For content validity, the questionnaire was assessed by three professionals in test construction, and their comments and suggestions were taken into account in finalizing it. For reliability, internal consistency was estimated from the responses of 226 respondents. The Cronbach’s Alpha obtained was 0.84, greater than the 0.70 minimum recommended for responses to be considered reliable or consistent (Bohrnstedt and Knoke, 1988).
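The internal consistency estimate described above can be illustrated with a minimal sketch of the standard Cronbach's Alpha formula, alpha = k/(k - 1) × (1 - sum of item variances / variance of total scores). The function name `cronbach_alpha` and the small `ratings` table are hypothetical and are not the study's data or its SPSS procedure.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for respondents (rows) by items (columns)."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item (column) across respondents.
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical ratings (NOT the study's data): 6 respondents x 5 items.
ratings = [
    [4, 5, 4, 5, 4],
    [3, 4, 4, 4, 3],
    [5, 5, 5, 5, 4],
    [2, 3, 3, 3, 2],
    [4, 4, 5, 4, 4],
    [3, 3, 4, 4, 3],
]
alpha = cronbach_alpha(ratings)
print(f"Cronbach's alpha = {alpha:.2f}")
```

A value above the 0.70 threshold cited in the text would be read as acceptable internal consistency; the toy data here are deliberately consistent and say nothing about the actual instrument.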
Assumptions
The survey rating data set was first tested for compliance with the key parametric assumptions of a repeated-measures design, namely normality, homogeneity of variance, and sphericity. For normality, Kolmogorov-Smirnov tests showed that ratings from the following groups were not normally distributed at α = .05: College of Business Administration (p = .030), College of Computer Studies (p = .013), Pampanga Branch (p = .001), Faculty (p = .024), and Parents (p = .044). For homogeneity of variance, the Levene statistic, F(10, 219) = 3.12, p < .001, was highly significant, suggesting that this assumption was violated. For sphericity, Mauchly’s test (W = 0.840, p < .001) was also highly significant, indicating that this assumption was not met.
Statistical Test
Because these parametric assumptions were not satisfied, and because the scale of measurement was ordinal, nonparametric statistics were deemed suitable for analyzing the data set. Specifically, the Friedman test was used to address the Research Question posed, and the Wilcoxon signed ranks test was used for post hoc pairwise comparisons following a significant Friedman test.
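As an illustration of how the Friedman test works, the sketch below ranks each respondent's ratings across items (with average ranks for ties, which are frequent in Likert data) and computes the tie-corrected chi-square statistic. The function names, the closed-form p-value that is valid only for even degrees of freedom, and the `ratings` table are assumptions made for this example. The study itself used a statistical package, and the data shown are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values 1..k, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks


def friedman(rows):
    """Tie-corrected Friedman chi-square and df for n respondents x k items."""
    n, k = len(rows), len(rows[0])
    rank_sums = [0.0] * k
    tie_term = 0
    for row in rows:
        for j, r in enumerate(average_ranks(row)):
            rank_sums[j] += r
        for value in set(row):
            t = row.count(value)
            tie_term += t ** 3 - t
    correction = 1 - tie_term / (n * k * (k * k - 1))
    if correction == 0:
        raise ValueError("every respondent gave identical ratings to all items")
    stat = 12 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3 * n * (k + 1)
    return stat / correction, k - 1


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid only for even df."""
    if df <= 0 or df % 2:
        raise ValueError("closed form requires a positive even df")
    half = x / 2
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))


# Hypothetical ratings (NOT the study's data): 6 respondents x 5 items.
ratings = [
    [3, 4, 4, 5, 4],
    [2, 4, 3, 4, 3],
    [3, 5, 4, 5, 4],
    [4, 4, 4, 4, 3],
    [2, 3, 4, 5, 4],
    [3, 4, 5, 5, 3],
]
chi2, df = friedman(ratings)
p_value = chi2_sf_even_df(chi2, df)  # five items give df = 4, which is even
print(f"chi-square({df}) = {chi2:.2f}, p = {p_value:.4f}")
```

With five items the degrees of freedom are 4, so the even-df shortcut applies; for an even number of items a general chi-square routine from a statistics library would be needed instead.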
Results and Discussion
Respondents
Of the 275 survey questionnaires distributed and returned, 231 to 233 were usable; the rest were improperly filled out or had missing entries. Thus, this study was based on an 84.0% to 84.7% response rate.
The number of respondents who rated the graduation ceremony Good or Excellent is far greater than the number who rated it Poor or Very Poor (Table 1). The percentage of respondents giving a combined Good or Excellent rating, which ranges from 57.9% to 84.5%, is overwhelmingly greater than the percentage giving a combined Poor or Very Poor rating, which ranges only from 2.6% to 9.0% (Table 2). Thus, the majority of respondents rated the graduation ceremony Good to Excellent, and a far smaller number rated it Poor to Very Poor.
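The response rate range follows directly from the counts reported above, as this short check shows. The variable names are illustrative only.

```python
distributed = 275          # questionnaires distributed (from the text)
usable_range = (231, 233)  # lowest and highest usable counts (from the text)

# Response rate as a percentage of questionnaires distributed.
rates = [round(100 * usable / distributed, 1) for usable in usable_range]
print(rates)  # [84.0, 84.7]
```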

Answer to Research Question
Are there significant differences in the satisfaction ratings among the first five Likert items at α = .05?
The Friedman test yielded a statistically significant result, χ²(4, N = 230) = 86.60, p < .001. The Wilcoxon signed ranks test with Bonferroni correction [α = (.05/10) = .005] revealed significant item mean rank differences, reported with their corresponding ES (r) (Table 2). For instance, the mean rank for AIDES (“How would you rate the performance of the volunteers/aides assisting during the graduation ceremony?”) was significantly greater than that for ACCESS (“How would you rate the accessibility – considering distance, cost, and ease of travel – of the graduation venue?”). This finding suggests that participants in the Commencement Exercises rated AIDES more favorably than ACCESS. On the other hand, the mean rank for CONDUC (“How would you rate the discipline/conduct showed by the graduating class?”) was significantly lower than that for FORMAL (“How would you rate the solemnity/formality of the graduation ceremony?”), indicating that participants rated FORMAL more favorably than CONDUC. Differences in mean ranks were not statistically significant for the following pairs: ORDER and AIDES, CONDUC and AIDES, and CONDUC and ORDER.
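The post hoc procedure can be sketched as follows: five items give 10 unordered pairs, so the Bonferroni-adjusted α is .05/10 = .005, and each pair is compared with a Wilcoxon signed ranks test. This is a minimal illustration under stated assumptions, not the study's SPSS output. It uses the normal approximation without continuity correction, drops zero differences, and computes the effect size as r = |z| / sqrt(N), with N taken here as the number of respondent pairs (the text does not state which N was used). The two rating lists are hypothetical.

```python
import math
from itertools import combinations

ITEMS = ["ACCESS", "AIDES", "ORDER", "FORMAL", "CONDUC"]
pairs = list(combinations(ITEMS, 2))  # all unordered item pairs
alpha_adj = 0.05 / len(pairs)         # .05 / 10 = .005, matching the text


def wilcoxon_signed_rank(x, y):
    """Return (z, two-sided p, r) using the normal approximation."""
    diffs = [a - b for a, b in zip(x, y) if a != b]  # zero differences dropped
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    abs_sorted = sorted(abs(d) for d in diffs)
    rank_of = {}   # average rank for each distinct absolute difference
    tie_term = 0   # sum of t^3 - t over tie groups, for the variance correction
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs_sorted[j + 1] == abs_sorted[i]:
            j += 1
        rank_of[abs_sorted[i]] = (i + j) / 2 + 1
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1
    w_plus = sum(rank_of[abs(d)] for d in diffs if d > 0)
    mean = n * (n + 1) / 4
    var = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
    z = (w_plus - mean) / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2))
    r = abs(z) / math.sqrt(len(x))  # ASSUMPTION: N = number of respondent pairs
    return z, p, r


# Hypothetical ratings (NOT the study's data) for two items, 12 respondents.
aides = [4, 5, 4, 4, 5, 3, 4, 5, 4, 4, 5, 4]
access = [3, 4, 4, 2, 4, 3, 3, 4, 2, 3, 4, 3]
z, p, r = wilcoxon_signed_rank(aides, access)
print(f"{len(pairs)} pairs, adjusted alpha = {alpha_adj:.3f}")
print(f"z = {z:.2f}, p = {p:.4f}, r = {r:.2f}, significant: {p < alpha_adj}")
```

The normal approximation is rough for samples as small as this toy example; with roughly 230 respondents, as in the study, it is the usual basis for the reported test.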


References
Bohrnstedt, George W. and David Knoke. (1988). Statistics for Social Data Analysis (2nd ed.). F. E. Peacock Publishers, Inc.
Cohen, Jacob. (1988). Statistical Power Analysis for the Behavioral Sciences (2nd ed.). Lawrence Erlbaum Associates, Inc.
G*Power 3.1.9.2 program, March 2017 version.
Salkind, Neil J. (2000). Exploring Research (4th ed.). Prentice-Hall, Inc.
