OCAIR

Overseas Chinese Association for Institutional Research
An AIR Affiliate That Supports IR Professionals Since 1996

Comparing Two Sets of Data

David Cheng: Suppose you have ten items in your survey. First you ask people to rate, on a scale of 1-5, how important they feel each of these ten statements is (say, the items are about the efforts a college makes to recruit students). Then you ask respondents to rate the same ten items again, on a scale of 1-5, for how well the college does. You will certainly get two sets of data (means, rankings of the ten, etc.). Now comes the time to analyze the data. What would you do to make your analysis both simple and informative? For instance, people think the admissions office is the most important in a college's recruitment (the mean score of that item is the highest), but they later rate your college's admissions office lower than other items on the list. Would a simple difference between the rankings of these two means of the same item be too simple-minded? If so, what else would you do to make sense of these two sets of data?

Charles Zhao: I do not think it will help a lot to compare apples (importance of certain tasks) and oranges (efforts to perform these tasks) even when they are physically weighed on the same scale (your Likert scale). What your survey data tell you are scenarios such as:

For the most important task (rated 5), the college has done a poor job (rated 2); for the least important task (rated 1), the college has done a wonderful job (rated 5).

So my approach would be a crosstab-type of data interpretation, that is, to evaluate the college task performance by the importance of the task.

Jing Luan: Charles' idea is valid in that you can use a dichotomous table - one of my favorites. Based on the scale levels you have (let's say 1-5 for the sake of argument), you can have a total of 25 cells with 5 on the Y axis and 5 on the X. Y represents Importance (or anything else) and X represents Accomplishment (or anything else). This type of analysis helps the most in studies where you need to identify two groups: items that are most important and most accomplished, and items that are most important and least accomplished.
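The 5 x 5 table Charles and Jing describe can be sketched in a few lines. This is only a minimal illustration in plain Python; the function name `crosstab` and the six paired ratings are made up for the example and do not come from any real survey.

```python
from collections import Counter

def crosstab(pairs, levels=range(1, 6)):
    """Count respondents in each (importance, performance) cell of a 5x5 table."""
    counts = Counter(pairs)
    # Rows are importance (Y); columns are performance (X).
    return [[counts[(imp, perf)] for perf in levels] for imp in levels]

# Hypothetical (importance, performance) ratings from six respondents for one item.
ratings = [(5, 2), (5, 2), (5, 4), (4, 3), (1, 5), (2, 5)]
table = crosstab(ratings)

for imp, row in zip(range(1, 6), table):
    print(f"importance={imp}: {row}")
```

Cells in the bottom-left of the printed table (high importance, low performance) are the ones that would probably deserve the closest look.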

Bai Kang: I share Charles's opinion and think that it is a question of the difference between the importance of the task and the performance of the task, as perceived by the people who do the ratings. The ratings of importance could indicate people's expectations. If the differences between importance and performance were found to be statistically significant, say with ratings of importance higher than ratings of performance, I would think that their expectations were not met and that surely they were not satisfied with the performance of the task. On the other hand, if performance got a significantly higher rating than importance, I would think that people feel good about how the task was performed to meet their needs, or even that they thought more than enough effort had been put into the task. This could be another way to evaluate satisfaction, and furthermore it also tells us in which areas our efforts should go for improvement.
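One way to check whether the gap Bai Kang describes is statistically significant is a paired t statistic on each respondent's two ratings of the same item. The sketch below is an assumption about how that could be done, not a method named in this thread; the function name `paired_t` and the ratings are hypothetical.

```python
import math
import statistics

def paired_t(importance, performance):
    """Return (mean difference, t statistic) for paired ratings of one item."""
    diffs = [i - p for i, p in zip(importance, performance)]
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean difference: sample standard deviation / sqrt(n).
    # Note: this is zero (and the division fails) if every difference is identical.
    std_err = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return mean_diff, mean_diff / std_err

# Hypothetical ratings from eight respondents for a single item.
importance = [5, 5, 4, 5, 4, 5, 3, 5]
performance = [3, 2, 4, 3, 2, 4, 3, 3]

mean_diff, t_stat = paired_t(importance, performance)
print(f"mean gap = {mean_diff:.2f}, t = {t_stat:.2f} (df = {len(importance) - 1})")
```

A positive gap means importance was rated above performance. The t statistic would then be compared with a critical value from the t distribution with n - 1 degrees of freedom. Because Likert ratings are ordinal, a nonparametric test such as the Wilcoxon signed-rank test may be a safer choice.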

Hom, Willard: Your question is an excellent one. This is a common issue in the survey work for marketing research. I would use a graph of the two "scores" showing how they deviate from a 45-degree diagonal line (a "ray" from the origin or the 0,0 coordinate). The 45-degree reference line on this graph indicates perfect agreement of the two measures for each respondent.

A very good extension of this graph is called a quadrant analysis or "importance/performance" chart. In using this approach, you need to have questions with endpoint anchors (the "polar labels" for the scale) that are like "very unimportant-----very important" and "very good performance----very poor performance." With this kind of data, the two measurements, the importance dimension question and the performance dimension question, form the same two axes of the graph. But with this scaling you have the midpoint of the two questions (the value of 3 on the 5-point scale) as the origin. The resulting graph should have four squares delimited by the axes (hence, the label "quadrant analysis").

If you assign the higher numerical scores to more importance and better performance, you will consider any scores that appear in the upper right quadrant as excellent performance feedback from respondents. The lower left quadrant contains scores that indicate low performance on factors that don't matter that much. You interpret these to have less urgency to your administrators. The other two quadrants have the following interpretation. If an abundance of scores land in the quadrant indicating low performance on a high importance factor, then administrators need to address this as urgent. If an abundance of scores land in the quadrant indicating high performance in a low importance factor, then administrators may need to revisit the expenditure of effort to this factor; it may be overkill (i.e., wasting resources). You're doing a good job on something respondents don't value that much.
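The four interpretations above can be written down as a small classifier. This is a sketch under stated assumptions: it works on item means rather than individual scores, it treats a score exactly at the midpoint as "low", and the item names and numbers are invented for illustration.

```python
def quadrant(importance, performance, midpoint=3.0):
    """Label one item by where its mean scores fall relative to the scale midpoint."""
    high_imp = importance > midpoint
    high_perf = performance > midpoint
    if high_imp and high_perf:
        return "high importance, high performance: excellent feedback"
    if high_imp:
        return "high importance, low performance: urgent"
    if high_perf:
        return "low importance, high performance: possible overkill"
    return "low importance, low performance: less urgent"

# Hypothetical item means: (mean importance, mean performance).
item_means = {
    "admissions office": (4.6, 2.4),
    "campus tours": (3.8, 4.1),
    "printed brochures": (2.2, 4.3),
    "radio ads": (2.0, 2.1),
}
labels = {item: quadrant(imp, perf) for item, (imp, perf) in item_means.items()}
for item, label in labels.items():
    print(f"{item}: {label}")
```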

One problem that I have encountered is that leniency bias or lack of respondent discrimination results in all the scores falling in the upper right quadrant. In this case, you might try standardizing the scores so that scores fall into all four quadrants. Of course, the interpretation of the results is "relativistic" rather than absolute in this modification because this plot has "exploded" the upper right quadrant to form all four quadrants.
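The standardizing step can be sketched as a z-score transformation, which moves the origin of the chart from the scale midpoint to the average score. The post does not say exactly which scores to standardize; this example assumes item means, and all names and numbers are hypothetical.

```python
import statistics

def standardize(values):
    """Convert raw scores to z-scores (mean 0, population standard deviation 1)."""
    mean = statistics.mean(values)
    spread = statistics.pstdev(values)
    return [(v - mean) / spread for v in values]

# Hypothetical item means that all sit above the midpoint of 3.
items = ["admissions office", "campus tours", "printed brochures", "radio ads"]
importance = [4.8, 4.2, 3.6, 3.4]
performance = [3.5, 4.4, 4.1, 3.2]

z_imp = standardize(importance)
z_perf = standardize(performance)

# After standardizing, zero plays the role of the origin for the quadrants.
for item, zi, zp in zip(items, z_imp, z_perf):
    print(f"{item}: importance z = {zi:+.2f}, performance z = {zp:+.2f}")
```

With these made-up numbers every raw score lies in the upper right quadrant, yet the signs of the z-scores spread the four items across all four quadrants. As noted above, the result only says how items compare with one another.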

The quadrant analysis is flexible enough to use with individual respondent scores or group scores. With group scores, you can get a very good picture of how groups, defined by some relevant variable such as gender or age, differ from one another. You can discover segments of your school's "market" that may need a select approach, rather than a general approach. This is the idea behind the practice of "market segmentation."

You can enhance the group chart's information with a little effort. Some software packages let you indicate the size of the group via the relative size of the chart marker or symbol for the chart point. This allows you to show population size weights on a two-dimensional medium.
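Preparing the points for such a group chart might look like the sketch below. It only computes each group's mean importance, mean performance, and size; the group labels, ratings, and the function name `group_points` are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def group_points(responses):
    """Return {group: (mean importance, mean performance, group size)}."""
    by_group = defaultdict(list)
    for group, imp, perf in responses:
        by_group[group].append((imp, perf))
    return {
        group: (mean(i for i, _ in rows), mean(p for _, p in rows), len(rows))
        for group, rows in by_group.items()
    }

# Hypothetical (group, importance, performance) ratings for one item.
responses = [
    ("under 25", 5, 2), ("under 25", 4, 3), ("under 25", 5, 3), ("under 25", 4, 2),
    ("25 and over", 3, 4), ("25 and over", 4, 4),
]
points = group_points(responses)
for group, (imp, perf, n) in points.items():
    print(f"{group}: importance {imp:.2f}, performance {perf:.2f}, n = {n}")
```

The third value in each tuple is the group size, which a charting tool could map to the marker size.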

In closing, I emphasize the graphical approach because of its simplicity. I also believe that subtracting an importance score from a performance score can mislead some people. Importance and performance are two different constructs that perhaps should remain distinctly separate. Subtracting one from the other may imply to some that they are equivalent constructs.

I hope I did not overdo this reply. It was a fine question. I just have a bit of background in this particular topic.

Brian Hu: A good question for this group. I am glad a few of us have already provided some answers for your reference. I have a concern from another angle. When similar questions get different answers on the same scale, it may point to a structural problem in the instrument. So use a reliability test to check for such problems in your instrument. This could be done in your pilot survey or after the real survey. Both SPSS and SAS have detailed explanations of how to do that. The test will show how correlations between items affect the interpretation of your survey results. I did it once for my dissertation, and the results strongly supported my survey instrument. Now I just can't remember the details of the reliability measurement.
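Brian does not say which reliability statistic he used. One widely reported measure of internal consistency is Cronbach's alpha, so the sketch below uses it purely as an assumed example. The ratings and the function name `cronbach_alpha` are hypothetical.

```python
from statistics import variance

def cronbach_alpha(item_scores):
    """item_scores: one list per item, each holding every respondent's rating."""
    k = len(item_scores)
    # Each respondent's total score across all items.
    totals = [sum(ratings) for ratings in zip(*item_scores)]
    item_variance_sum = sum(variance(scores) for scores in item_scores)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))

# Hypothetical ratings: three items answered by five respondents.
items = [
    [4, 5, 3, 4, 2],
    [4, 4, 3, 5, 2],
    [5, 5, 2, 4, 3],
]
alpha = cronbach_alpha(items)
print(f"Cronbach's alpha = {alpha:.2f}")
```

Values closer to 1 indicate that the items move together across respondents. The statistic would be computed separately for the importance ratings and the performance ratings, since they measure different things.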

Jeffrey Chen: Also, David, if you plan to use the same instrument many times, you will have no trouble setting your benchmark (or using it for presentation purposes). Has the survey questionnaire you are using also been used in other places, so you can compare?

Shuqin Guo: David's results don't have anything to do with the reliability of the instrument, because the same instrument was used for two totally different purposes. I would take the purpose of the survey into consideration when analyzing the data. I would rank the items by the responses for both importance and performance. If the university has its own list of importance, I would compare that list with the one given by the students in the survey and look at the differences. The differences between these two lists will tell us how differently students see things from the administration, and will also provoke some discussion within the administration about whether to change its list according to the students' opinions. As for the performance indicators, I would create two tabulation tables: 1. the university list with performance evaluation, which will tell us how the university did in realizing its own goals; 2. the students' list with performance evaluation, which will tell us how the university did in meeting students' expectations. I would also look at the ranking of the items by performance to find out what students are most and least satisfied with. For example, suppose admissions is ranked number 1 on the students' importance list but number 10 on the university list, and students rated its performance as poor. That means that, in the students' opinion, the university did a poor job on an important task. But the result shouldn't be a big surprise to the university, because it didn't put that much effort into admissions. Does that make sense? What I want to say is that the original purpose of conducting the survey should also be considered at the analysis stage.
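The ranking comparison Shuqin describes can be sketched as follows. The item names, mean scores, and the function name `rank_items` are hypothetical, and tied items simply keep their input order.

```python
def rank_items(scores):
    """Map each item to its rank (1 = highest score); ties keep input order."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {item: position for position, item in enumerate(ordered, start=1)}

# Hypothetical mean scores from the student survey.
importance = {"admissions": 4.7, "financial aid": 4.5, "campus tours": 3.9, "brochures": 3.1}
performance = {"admissions": 2.8, "financial aid": 3.6, "campus tours": 4.2, "brochures": 3.9}

importance_rank = rank_items(importance)
performance_rank = rank_items(performance)

for item in importance:
    print(f"{item}: importance rank {importance_rank[item]}, "
          f"performance rank {performance_rank[item]}")
```

The same function could be applied to the university's own importance scores, if it has them, so that the two lists can be put side by side.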

Huiming Wang: The Noel-Levitz group has a "Student Satisfaction Inventory" (SSI) program that measures two sets of perceptions in a design very similar to yours. They report the results using the method Willard just introduced. If you like, you can check their webpage, "www.noellevitz.com". They will send you a sample instrument with the report upon request.