Comparing Two Sets of Data
David Cheng:
Suppose you have ten items in your survey. First you ask people to rate, on
a scale of 1-5, how important each of these ten statements is to them (say, the
items are about the effort made by a college to recruit students). Then you ask
respondents to rate the same ten items again, on a scale of 1-5, for how well the college does.
You will certainly get two sets of data (means, rankings of the ten, etc.). Then comes the time
to analyze the data. What would you do to make your analysis both simple and informative?
For instance, people think the admissions office is the most important in a college's
recruitment (the mean score of that item is the highest), but they later rate your
college's admissions office lower than other items on the list. Would a simple difference
between the rankings of these two means for the same item be too simple-minded? If so, what
other things do you do to make sense of these two sets of data?
Charles Zhao:
I do not think it will help a lot to compare apples
(importance of certain tasks) and oranges (efforts to perform these tasks) even when they
are physically weighed on the same scale (your Likert scale). What your survey data tell
you are scenarios such as:
For the most important task (rated 5), the college has done a
poor job (rated 2); for the least important task (rated 1), the college has done a
wonderful job (rated 5).
So my approach would be a crosstab-type of data interpretation, that is, to evaluate
the college task performance by the importance of the task.
Jing Luan:
Charles' idea is valid in that you can use a two-way table - one of my
favorites. Based on the scale levels you have (let's say 1-5 for the sake of argument),
you can have a total of 25 cells with 5 on the Y axis and 5 on the X. Y represents
Importance (or anything else) and X represents Accomplishment (or anything else). This
type of analysis helps the most in studies where you need to identify two groups: the items
rated most important and most accomplished, and those rated most important and least accomplished.
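A minimal sketch of such a 5x5 table for a single survey item, assuming each respondent supplies one importance rating and one performance rating. The ratings below are made up for illustration, and the 4-5 / 1-2 cutoffs for "important" and "poorly done" are an assumption, not something specified in the thread.

```python
from collections import Counter

# Hypothetical paired ratings for ONE item: (importance, performance),
# one tuple per respondent, both on a 1-5 scale.
ratings = [(5, 2), (5, 3), (4, 2), (5, 2), (3, 4), (4, 4), (5, 1), (2, 5)]

SCALE = range(1, 6)

def crosstab(pairs):
    """Return a 5x5 grid of counts: rows = importance, columns = performance."""
    counts = Counter(pairs)
    return [[counts[(imp, perf)] for perf in SCALE] for imp in SCALE]

table = crosstab(ratings)

# Print with importance 5 at the top so the grid reads like a chart.
print("imp\\perf " + " ".join(f"{p:>3}" for p in SCALE))
for imp in reversed(SCALE):
    print(f"{imp:>8} " + " ".join(f"{n:>3}" for n in table[imp - 1]))

# Respondents who rate the item important (4-5) but its performance low (1-2).
# These cutoffs are an illustrative assumption.
unmet = sum(table[i - 1][p - 1] for i in (4, 5) for p in (1, 2))
print("important but poorly done:", unmet)
```

Running the same function once per item gives ten small tables; the corner cells are the ones that separate "important and well done" from "important and poorly done".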
Bai Kang:
I share Charles's opinion and think that it is a question of the
difference between the importance of the task and the performance of the task, as
perceived by the people who do the ratings. The ratings of importance could
indicate people's expectations. If the differences between importance and
performance were found to be statistically significant, say with ratings of importance higher than
ratings of performance, I would think that their expectations were not met and that they
were surely not satisfied with the performance of the task. On the other hand, if performance got
a significantly higher rating than importance, I would think that people feel good about
how the task was performed to meet their needs, or even that they thought more than enough
effort had been put into the performance of the task. This could be another way to evaluate
satisfaction, and furthermore it also tells us in which areas our efforts should go for
improvement.
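One way to check whether an importance/performance gap on a single item is statistically significant is an exact sign test on the paired ratings. This is only a sketch: the thread does not say which test was intended, the sign test is chosen here because it suits ordinal 1-5 data and needs nothing beyond the standard library, and the ratings are invented.

```python
from math import comb

# Hypothetical paired ratings of one item by the same ten respondents.
importance  = [5, 5, 4, 5, 4, 5, 3, 5, 4, 5]
performance = [3, 4, 4, 2, 3, 4, 4, 3, 2, 4]

def sign_test(a, b):
    """Two-sided exact sign test on paired ordinal scores (ties are dropped)."""
    plus = sum(x > y for x, y in zip(a, b))   # importance rated higher
    minus = sum(x < y for x, y in zip(a, b))  # performance rated higher
    n = plus + minus
    k = min(plus, minus)
    # Chance of a split at least this lopsided if both signs are equally likely.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return plus, minus, min(1.0, 2 * tail)

plus, minus, p_value = sign_test(importance, performance)
print(f"importance higher: {plus}, performance higher: {minus}, p = {p_value:.4f}")
```

A small p-value with most signs in the "importance higher" direction would correspond to the unmet-expectations case described above.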
Hom, Willard:
Your question is an excellent one. This is a common issue in survey
work for marketing research. I would use a graph of the two "scores" showing how
they deviate from a 45-degree diagonal line (a "ray" from the origin or the 0,0
coordinate). The 45-degree reference line on this graph indicates perfect agreement of the
two measures for each respondent.
A very good extension of this graph is called a
quadrant analysis or "importance/performance" chart. In using this approach, you
need to have questions with endpoint anchors (the "polar labels" for the scale)
that are like "very unimportant-----very important" and "very good
performance----very poor performance." With this kind of data, the two measurements,
the importance dimension question and the performance dimension question, form the same
two axes of the graph. But with this scaling you have the midpoint of the two questions
(the value of 3 on the 5-point scale) as the origin. The resulting graph should have four
squares delimited by the axes (hence, the label "quadrant analysis").
If you assign the higher numerical scores to more importance and better performance,
you will consider any scores that appear in the upper right quadrant as excellent
performance feedback from respondents. The lower left quadrant contains scores that
indicate low performance on factors that don't matter that much. You can interpret these as
less urgent for your administrators. The other two quadrants have the following
interpretation. If an abundance of scores lands in the quadrant indicating low performance
on a high importance factor, then administrators need to address this as urgent. If an
abundance of scores lands in the quadrant indicating high performance on a low importance
factor, then administrators may need to revisit the expenditure of effort on this factor;
it may be overkill (i.e., wasting resources). You're doing a good job on something
respondents don't value that much.
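The quadrant reading above can be sketched in a few lines, assuming higher numbers mean more importance and better performance and that the chart origin is the scale midpoint of 3. The item names and ratings are hypothetical, and treating a score exactly at the midpoint as "low" is an arbitrary choice made for this sketch.

```python
from statistics import mean

# Hypothetical data: item -> (importance ratings, performance ratings)
survey = {
    "admissions office": ([5, 5, 4, 5], [2, 3, 2, 2]),
    "campus tours":      ([4, 5, 4, 4], [4, 5, 4, 5]),
    "mailed brochures":  ([2, 1, 2, 2], [5, 4, 4, 5]),
    "phone follow-up":   ([2, 2, 1, 2], [2, 1, 2, 2]),
}

MIDPOINT = 3  # origin of the chart on a 1-5 scale

def quadrant(importance, performance, origin=MIDPOINT):
    """Name the quadrant a score pair falls in (scores at the origin count as low)."""
    high_imp = importance > origin
    high_perf = performance > origin
    if high_imp and high_perf:
        return "excellent"          # upper right
    if high_imp:
        return "urgent"             # important, poorly done
    if high_perf:
        return "possible overkill"  # well done, not valued much
    return "less urgent"            # lower left

results = {
    item: quadrant(mean(imp), mean(perf))
    for item, (imp, perf) in survey.items()
}
for item, label in results.items():
    print(f"{item:<18} {label}")
```

The same function works on individual respondent scores instead of item means; only the inputs change.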
One problem that I have encountered is that leniency bias or lack of respondent
discrimination results in all the scores falling in the upper right quadrant. In this
case, you might try standardizing the scores so that scores fall into all four quadrants.
Of course, the interpretation of the results is "relativistic" rather than
absolute in this modification because this plot has "exploded" the upper right
quadrant to form all four quadrants.
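A sketch of the standardizing step, assuming it means converting the item means on each dimension to z-scores across items, so that the new origin is the average item rather than the scale midpoint. The item means are invented, and all of them sit above 3 on purpose.

```python
from statistics import mean, pstdev

# Hypothetical item means (importance, performance); every value is above the
# midpoint of 3, so the unstandardized chart would use only one quadrant.
item_means = {
    "admissions office": (4.8, 3.6),
    "campus tours":      (4.2, 4.5),
    "mailed brochures":  (3.4, 4.4),
    "phone follow-up":   (3.5, 3.3),
}

def standardize(values):
    """Convert values to z-scores (mean 0, standard deviation 1)."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

def relative_quadrant(z_importance, z_performance):
    """Quadrant relative to the average item, not to the scale midpoint."""
    if z_importance > 0 and z_performance > 0:
        return "excellent"
    if z_importance > 0:
        return "urgent"
    if z_performance > 0:
        return "possible overkill"
    return "less urgent"

items = list(item_means)
z_imp = standardize([item_means[i][0] for i in items])
z_perf = standardize([item_means[i][1] for i in items])

relative = {
    item: relative_quadrant(zi, zp)
    for item, zi, zp in zip(items, z_imp, z_perf)
}
for item, label in relative.items():
    print(f"{item:<18} {label}")
```

Here "urgent" only means urgent compared with the other items: the admissions office still averages 3.6 on performance in absolute terms.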
The quadrant analysis is flexible enough to use with individual respondent scores or
group scores. With group scores, you can get a very good picture of how groups, defined by
some relevant variable such as gender or age, differ, and you can discover segments of your school's
"market" that may need a select approach, rather than a general approach. This
is the idea behind the practice of "market segmentation."
You can enhance the group chart's information with a little effort. Some software
packages let you indicate the size of the group via the relative size of the chart marker
or symbol for the chart point. This allows you to show population size weights on a
two-dimensional medium.
In closing, I emphasize the graphical approach because of its simplicity. I also
believe that subtracting an importance score from a performance score can mislead some
people. Importance and performance are two different constructs that perhaps should remain
distinctly separate. Subtracting one from the other may imply to some that they are
equivalent constructs.
I hope I did not overdo this reply. It was a fine question. I just have a bit of
background in this particular topic.
Brian Hu:
A good question for this group. I am glad a few of us have already
provided some answers for your reference. I have a concern from another aspect. When
similar questions get different answers on the same scale, it may point to a structural
problem in the instrument. So use a reliability test to check for such problems in your
instrument. This could be done in your pilot survey or after the real survey. Both SPSS and
SAS have detailed explanations of how to do that. The test will show how correlations
between items affect the interpretation of the results of your survey. I did it once for
my dissertation, with very strong results to support my survey instrument. Now I
just can't remember the details of the reliability measurement.
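One widely used reliability statistic is Cronbach's alpha; whether that is the test used in the dissertation mentioned above is not stated, so treat this as an illustration only. The sketch assumes one row of item scores per respondent, all items on the same scale, and uses invented data.

```python
from statistics import variance

# Hypothetical responses: one row per respondent, one column per item.
responses = [
    [4, 5, 4],
    [3, 3, 2],
    [5, 5, 5],
    [2, 3, 2],
    [4, 4, 3],
]

def cronbach_alpha(rows):
    """Cronbach's alpha: k/(k-1) * (1 - sum of item variances / variance of totals)."""
    k = len(rows[0])
    item_columns = zip(*rows)  # transpose: one tuple of scores per item
    item_variance_sum = sum(variance(col) for col in item_columns)
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)

alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha = {alpha:.3f}")
```

Values closer to 1 indicate that the items move together across respondents. The importance items and the performance items would normally be checked as two separate sets, since they measure different things.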
Jeffrey Chen:
Also, David, if you plan to use the same instrument many times, you will
have no trouble setting your benchmark (or using it for presentation purposes). Has the survey
questionnaire you are using also been used in other places, so that you can compare?
Shuqin Guo:
David's results don't have anything to do with the reliability of the
instrument, because the same instrument was used for two totally different purposes. I
would take the purpose of the survey into consideration when I analyze the data. I would
rank the items by the responses for both importance and performance. If the university has
its own list of importance, I would compare this list with the list given by the students in
the survey results and see the differences there. The difference between these two lists
will tell us how differently students see things from the administration, and will also
provoke some discussion within the administration about whether they would change the list
according to the students' opinion. As for the performance indicators, I would create two
tabulation tables: 1. the university list with performance evaluation. This will tell us
how the university did in realizing its own goals. 2. the students' list with performance
evaluation. This will tell us how the university did in meeting students' expectations. I
would also look at the ranking of the items by performance to find out what students are
most satisfied with and what least. For example, suppose admissions is ranked number 1 on the
students' importance list, but number 10 on the university list, and students rated its
performance as poor. That means that in the students' opinion the university did a poor job on
an important task. But the result shouldn't be a big surprise to the university, because it
didn't put that much effort into admissions. Does it make sense? What I want to say is that
the original purpose of conducting the survey should also be considered at the analysis
stage.
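The ranking comparison can be sketched as follows, assuming item means are already computed. The item names and means are hypothetical, and ties are broken by item name purely to keep the example deterministic.

```python
# Hypothetical item means from the students' ratings.
importance = {
    "admissions office": 4.8,
    "financial aid info": 4.5,
    "campus tours": 4.2,
    "mailed brochures": 3.1,
}
performance = {
    "admissions office": 3.0,
    "financial aid info": 3.8,
    "campus tours": 4.4,
    "mailed brochures": 4.1,
}

def rank(scores):
    """Map item -> rank, where 1 is the highest score (ties broken by name)."""
    ordered = sorted(scores, key=lambda item: (-scores[item], item))
    return {item: position for position, item in enumerate(ordered, start=1)}

imp_rank = rank(importance)
perf_rank = rank(performance)

# Positive gap: the item ranks higher on importance than on performance.
gap = {item: perf_rank[item] - imp_rank[item] for item in importance}

print(f"{'item':<20}{'imp':>4}{'perf':>6}{'gap':>5}")
for item in sorted(gap, key=gap.get, reverse=True):
    print(f"{item:<20}{imp_rank[item]:>4}{perf_rank[item]:>6}{gap[item]:>5}")
```

Passing the university's own importance scores to the same `rank` function would give the second list to set beside the students' list.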
Huiming Wang:
The Noel-Levitz group has a "Student Satisfaction Inventory"
(SSI) program that measures two sets of perceptions, very similar to your design.
They report the results using the method Willard just introduced. If you like, you can
check their webpage "www.neollevtiz.com". They will send you a sample instrument
with the report upon your request.