OCAIR

Overseas Chinese Association for Institutional Research
An AIR Affiliate That Supports IR Professionals Since 1996

Which Test or Analysis for Finding Factors That Relate to Retention?

6/8/2006

Lan Hao:

My initial question

I have a project that I am working on and would really appreciate your help.

At the community college that I am temporarily working for, we had a problem of low enrollment for Spring 06. We want to find out why.

I have now pulled all the students who enrolled in Fall 05 (about 14,000). About 8,000 of them continued into Spring 06, and the other 6,000 did not enroll in Spring 06. (There are another 5,000 new students in Spring 06, but they are not of interest here.)

For both groups of students (continued and not continued), I have Fall 05 grades, units taken, financial aid info, etc.

The project question I am trying to answer is: Is there any relationship between grades, units, financial aid etc (all these independent variables) and the decision to continue to enroll in Spring 06 or not (the yes and no dependent variable)?

Having taken my stats classes many years ago, my questions to our fellow statisticians and SPSS gurus are:

1. Which test should I use to do the analysis?
2. The practical way of doing it in SPSS.

Thank you so much.

Sincerely, Lan


Liu Xia:

First, make sure what you pulled is a representative sample of the population.

Try logistic regression to see whether the independent variables can
explain the outcome. Since the response variable is binary, use SPSS: Analyze -> Regression -> Binary Logistic.

Good luck.
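
For readers without SPSS at hand, here is a minimal Python sketch of what a binary logistic regression does, assuming a single predictor (units taken) and a 0/1 outcome (continued or not). The data values, the variable names, and the function fit_logistic are made up for illustration and are not Lan's data or an SPSS feature; a real analysis would use a statistics package and several predictors.

```python
import math

def fit_logistic(x, y, iterations=25):
    """Fit P(y=1) = 1 / (1 + exp(-(b0 + b1*x))) by Newton-Raphson.

    Handles one predictor plus an intercept, so the 2x2 information
    matrix can be inverted in closed form.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p              # gradient w.r.t. intercept
            g1 += (yi - p) * xi       # gradient w.r.t. slope
            h00 += w                  # information matrix entries
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: beta <- beta + H^{-1} g
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Hypothetical illustration data: Fall units taken and whether the
# student continued into Spring (1) or not (0).
units = [3, 3, 6, 6, 6, 9, 9, 12, 12, 12, 15, 15]
continued = [0, 0, 0, 1, 0, 1, 0, 1, 1, 0, 1, 1]

b0, b1 = fit_logistic(units, continued)
odds_ratio = math.exp(b1)  # change in the odds of continuing per extra unit
predicted = [1.0 / (1.0 + math.exp(-(b0 + b1 * u))) for u in units]
print(f"intercept={b0:.3f}, slope={b1:.3f}, odds ratio per unit={odds_ratio:.3f}")
```

An odds ratio above 1 would suggest that students taking more units are more likely to continue; whether that holds in the real data is exactly what the analysis has to test.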


Andrew Som:

Do you have access to the National Student Clearinghouse? It can quickly tell
you if your students are transferring out and where.

http://www.studentclearinghouse.org/colleges/Tracker/default.htm

Good Luck


Allen Joseph Medwick:

Are you thinking of using logistic regression?


Ava Lee-Pang:

I worked in the Instructional Office for over 16 years as a specialist. What I found was that enrollment is most likely related more to course offerings (transferable courses, courses needed to complete certificates and degrees), course offering patterns (e.g., can a student take 12 units each term and complete the certificate and/or degree within one or two years?), and class time patterns (e.g., 9-10 MTWRF/MWF, 10-11 MTWRF/MWF, 11-12 MTWRF/MWF).

Of course, financial aid will also be a factor.

Then the next factor will be student services (i.e. outreach, retention, tutoring, etc.)

Hope these will help.


Brian Hu:

Normally you can combine the two groups and create a flag called "continue enroll" with the value "1" = continued and "0" = dropped. Use logistic regression with this dichotomous variable as the dependent variable and get the results for all significant independent variables. Read some text about logistic regression, if necessary. Hope this helps.
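
The flag described above can be sketched in Python as follows. The student IDs and the names fall_ids, spring_ids, and records are hypothetical; the point is only that the Fall cohort defines the rows and Spring enrollment defines the 0/1 value, so new Spring students never enter the file.

```python
# Hypothetical student IDs enrolled in each term.
fall_ids = {"S001", "S002", "S003", "S004", "S005"}
spring_ids = {"S001", "S003", "S004", "S900"}  # S900 is a new Spring student

# One record per Fall student; the flag is 1 if the student also
# appears in Spring, 0 otherwise. Students who are new in Spring are
# excluded because only Fall IDs are iterated over.
records = [
    {"student_id": sid, "continue_enroll": 1 if sid in spring_ids else 0}
    for sid in sorted(fall_ids)
]

n_continued = sum(r["continue_enroll"] for r in records)
n_dropped = len(records) - n_continued
print(f"continued={n_continued}, dropped={n_dropped}")
```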


Hongmei Zhu:

Last year I did a similar project to find the factors behind student retention. Besides grades and financial aid information, I also used many student characteristic variables, such as gender, ethnicity, program of study, age, and engagement in campus activities... any factor at your institution that you think might affect the decision to drop out.
The method I used was to first run a chi-square test on each variable to see whether there was a significant difference between students who dropped and students who continued, then put all the significant variables into a logistic regression model.
I don't have SPSS on my computer, so I did all the analysis in SAS, but I think you can try it in SPSS too.
 
Good luck!
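
As an illustration of the chi-square screening step, here is a small Python sketch for one 2x2 table (received financial aid or not, by continued or dropped). The counts are invented, the function name chi_square_2x2 is hypothetical, and no continuity correction is applied; with one degree of freedom the p-value can be computed from the complementary error function.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test for the 2x2 table [[a, b], [c, d]].

    Returns (chi2, p_value). With 1 degree of freedom,
    P(X > chi2) = erfc(sqrt(chi2 / 2)).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    observed = [a, b, c, d]
    expected = [row1 * col1 / n, row1 * col2 / n,
                row2 * col1 / n, row2 * col2 / n]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical counts:        continued  dropped
#   received financial aid        60        40
#   no financial aid              45        55
chi2, p_value = chi_square_2x2(60, 40, 45, 55)
print(f"chi-square={chi2:.3f}, p={p_value:.4f}")
```

Variables whose test falls below the chosen significance level would then be candidates for the logistic regression model.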


Willard Hom:

As other people have suggested, the list of predictors (causal factors?) for continued enrollment goes beyond the few student-level variables you initially listed in your question. I would argue that some factors are at the institution level and even higher if you wish to test for a "causal model." Things like cuts in the number of sections, the upturn in the local employment picture, new enrollment procedures/rules, tuition changes, high school closures, exit exams, et cetera may contribute to a decline in enrollment (especially fairly abrupt declines). This may justify the analysis of a broader data set (that is, the use of enrollments at other colleges).

Another issue is the need to test for lagged effects for influential events (such as the increase in tuition and changes in employment prospects). So recently observed declines may reflect the influence of events in prior months (or prior terms). Yes, these ideas will make your quest more complex; it's just some food for thought if you really had to pinpoint causal factors.

Another statistical method to consider would be an event history analysis (or survival analysis). At the AIR Forum in Chicago, I saw a nice study by Xiao Ying Zhang, Bill Grimes, Lijuan Zhai, and Cindy Wijma (all of San Diego CCD). Although it focused on the issue of academic probation (a binary dependent variable), it illustrated another approach (specifically the Cox regression flavor of event history analysis) to analyzing covariates (possible moderators and causal factors) to understand some phenomenon. Of course, their use of event history analysis made more sense to me because I had just finished taking a really good workshop on event history analysis taught by another OCAIR colleague (Chau-Kuang Chen). In your situation, event history analysis may help because you probably have "censored" data (and this is the statistical term, not the political one).

Anyway, good luck on the analysis...


Michael Tamada:

The responses have been very good; here's a follow-up to Willard's response.
 
One piece of good news is that SPSS does survival analysis, including Cox regression (I believe it's in the Advanced module). 

One piece of potential bad news is that it seems to me that attrition is not a permanent condition -- students can and do return.  And leave again.  So probably one should use a "multiple spells" model; these are more complicated and I don't think SPSS can run them.

However, Lan was chiefly interested in just Spring 2006 enrollment or attrition. With only one term to look at, survival analysis is pretty much moot: a student is either here or gone.
 


Chau-Kuang Chen:

To: Tamada, Michael; Hom, Willard; Hongmei Zhu; Lan Hao

Your responses are extremely valuable to our knowledge base.

From what I have learned, survival analysis allows researchers to study the timing and duration of a critical event: the time (continuous or discrete) until the event (binary) occurs.

For a continuous time variable, researchers can use standard Cox regression, stratified Cox regression, extended Cox regression, and so forth.

For a discrete time variable, researchers can use binary logistic regression with dummy time variables. The multiple spells model is an example of discrete time survival analysis. Here is an excellent paper: Singer, J.D. & Willett, J.B. (1993). It's about time: Using discrete time survival analysis to study the duration and the timing of events. Journal of Educational Statistics, 18, 155-195.
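
To make the "dummy time variable" idea concrete, here is a minimal Python sketch of the person-period data layout that discrete time survival analysis typically uses: one row per student per period at risk, a 0/1 event indicator, and one dummy per period. The field names, the function to_person_period, and the three example students are hypothetical.

```python
def to_person_period(students, n_periods):
    """Expand one-row-per-student data into one row per period at risk.

    Each student has 'last_period' (the last period observed) and
    'left' (True if the student left in that period, False if the
    student was still enrolled when observation ended, i.e. censored).
    """
    rows = []
    for s in students:
        last = s["last_period"]
        for t in range(1, last + 1):
            row = {
                "student_id": s["student_id"],
                "period": t,
                # The event is recorded only in the period it happens.
                "event": 1 if (s["left"] and t == last) else 0,
            }
            # Dummy time variables t1..tk: exactly one equals 1 per row.
            for k in range(1, n_periods + 1):
                row[f"t{k}"] = 1 if k == t else 0
            rows.append(row)
    return rows

# Hypothetical students observed over up to 3 periods.
students = [
    {"student_id": "A", "last_period": 3, "left": False},  # censored
    {"student_id": "B", "last_period": 2, "left": True},
    {"student_id": "C", "last_period": 1, "left": True},
]

person_period = to_person_period(students, n_periods=3)
for row in person_period:
    print(row)
```

A binary logistic regression of event on the period dummies (plus any covariates) fitted to rows like these gives the discrete time hazard model; censored students simply contribute rows with event = 0.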

For competing and/or recurrent event survival data, there is an excellent book by Paul D. Allison titled Survival Analysis Using the SAS System: A Practical Guide.

For Lan's project, I wonder if we can break the time variable, one term, down into multiple months (time origin = beginning of the term; endpoint = the critical event of interest (enrolled/attrited) or other events unrelated to enrolled/attrited (censored); end of the study period = end of the term). Then we can detect in which month students are most likely to experience the critical event (enrolled/attrited).

I like Willard's idea of examining the lagged effects of independent variables on enrollment because it is theoretically justified. For instance, the economy (e.g., changes in employment prospects) rarely changes its patterns drastically within one or two years unless wars or political violence occur.