Which Test or Analysis for Finding Factors That Relate to Retention?
6/8/2006
Lan Hao:
My initial question
I have a project that I am working on and would really appreciate your help.
At the community college that I am temporarily working for, we had a problem of low
enrollment for Spring 06. We want to find out why.
Now I have pulled all the students who enrolled in Fall 05 (about 14,000). About
8,000 continued into Spring 06, and the other 6,000 did not enroll in Spring 06. (There
are another 5,000 new students in Spring 06, but they are not of interest here.)
I have grades, units taken, financial aid info, etc. in Fall 05 for both groups of students
(continued and not continued).
The project question I am trying to answer is: Is there any relationship between grades,
units, financial aid, etc. (all these independent variables) and the decision to continue
to enroll in Spring 06 or not (the yes and no dependent variable)?
Having taken my stats classes many years ago, my questions to our fellow statisticians
and SPSS gurus are:
1. Which test should I use to do the analysis?
2. The practical way of doing it in SPSS.
Thank you so much.
Sincerely, Lan
Liu Xia:
First, make sure what you pulled is a representative sample from the population.
Try logistic regression to see whether the independent variables can
explain the outcome. Since the response variable is binary, use SPSS: Analyze -> Regression -> Binary
Logistic.
Good luck.
Andrew Som:
Do you have access to the National Student Clearinghouse? It can quickly tell
you whether your students are transferring out and where.
http://www.studentclearinghouse.org/colleges/Tracker/default.htm
Good Luck
Allen Joseph Medwick:
Are you thinking of using logistic regression?
Ava Lee-Pang:
I worked in the Instructional Office for over 16 years as a specialist.
What I found was that enrollment is most likely more related to course
offerings (transferable courses, courses to complete certificates and degrees), course
offering patterns (i.e., can a student take 12 units each term and complete the certificate
and/or degree within one or two years?), and class time patterns (e.g., 9-10 MTWRF/MWF,
10-11 MTWRF/MWF, 11-12 MTWRF/MWF).
Of course, financial aid will also be a factor.
Then the next factor will be student services (i.e. outreach, retention, tutoring,
etc.)
Hope these will help.
Brian Hu:
Normally you can combine the two groups together and create a flag called "continue
enroll" with the value "1" = continued and "0" = dropped. Use logistic regression
with this dichotomous variable as the dependent variable and see which independent
variables come out significant. Read some text about logistic regression, if necessary.
Hope this helps.
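
To make the flag-and-regress idea concrete, here is a minimal Python sketch. The GPA values and outcomes are made-up illustration data, and the single-predictor Newton-Raphson fitter (fit_logistic is a hypothetical helper name) is only a stand-in for what the SPSS Binary Logistic procedure does with many predictors at once.

```python
import math

# Hypothetical illustration data: Fall 05 GPA and the "continue enroll" flag
# (1 = continued into Spring 06, 0 = dropped). Not real student records.
gpa = [1.0, 1.5, 2.0, 2.0, 2.5, 2.5, 3.0, 3.0, 3.5, 4.0]
continued = [0, 0, 0, 1, 0, 1, 1, 0, 1, 1]

def fit_logistic(x, y, iterations=25):
    """Fit P(y=1) = 1 / (1 + exp(-(b0 + b1*x))) by Newton-Raphson."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = 0.0            # gradient of the log-likelihood
        h00 = h01 = h11 = 0.0    # entries of the (negated) Hessian
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system by hand.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

b0, b1 = fit_logistic(gpa, continued)
odds_ratio = math.exp(b1)  # multiplicative change in odds per GPA point
print(f"intercept={b0:.3f}, slope={b1:.3f}, odds ratio={odds_ratio:.3f}")
```

A positive slope (odds ratio above 1) would mean that higher Fall GPA goes with higher odds of continuing. Significance tests and multiple predictors are what a statistics package adds on top of this.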
Hongmei Zhu:
I did a similar project last year to find factors related to student retention rate.
Besides grade and financial aid information, I also used a lot of student characteristic
variables like gender, ethnicity, program of study, age, and engagement
in campus activities... all the factors at your institution that you think might affect their
decision to drop out.
The method I used for the analysis was to first do a chi-square test for each variable to see if
there is a significant difference between those who dropped and those who continued, then put all the
significant variables into a logistic regression model.
I don't have SPSS on my computer; I did all the analysis in SAS, but I think you
can try it in SPSS too.
Good luck!
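
The screening step described above can be sketched in a few lines of Python. This is only an illustration: the split of the 14,000 Fall 05 students by financial aid status is invented, chi_square_2x2 is a hypothetical helper, and the p-value shortcut applies only to a 2x2 table (1 degree of freedom, no continuity correction).

```python
import math

def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table of counts.

    Returns (statistic, p_value). The p-value formula is valid only for
    1 degree of freedom, which is what a 2x2 table has.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (table[i][j] - expected) ** 2 / expected
    # Upper tail of chi-square with 1 df: P(X > s) = erfc(sqrt(s / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value

# Hypothetical counts. Rows: received aid / no aid.
# Columns: continued into Spring 06 / did not continue.
aid_by_outcome = [[3200, 1800],
                  [4800, 4200]]

statistic, p_value = chi_square_2x2(aid_by_outcome)
keep_for_model = p_value < 0.05  # screening rule: carry into the logistic model
print(f"chi-square={statistic:.2f}, p={p_value:.3g}, keep={keep_for_model}")
```

With samples this large, almost any difference will pass a 0.05 screen, so the size of the difference deserves a look as well as the p-value.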
Willard Hom:
As other people have suggested, the list of predictors (causal factors?) for continued
enrollment goes beyond the few student-level variables you initially listed in your
question. I would argue that some factors are at the institution level and even
higher if you wish to test for a "causal model." Things like cuts in the number
of sections, the upturn in the local employment picture, new enrollment procedures/rules,
tuition changes, high school closures, exit exams, etcetera may contribute to a decline
in enrollment (especially fairly abrupt declines). This may justify the analysis
of a broader data set (that is, the use of enrollments at other colleges).
Another issue is the need to test for lagged effects for influential events
(such as the increase in tuition and changes in employment prospects). So recently
observed declines may reflect the influence of events in prior months (or prior terms).
Yes, these ideas will make your quest more complex; it's just some food for thought
if you really had to pinpoint causal factors.
Another statistical method to consider would be an event history analysis (or
survival analysis). At the AIR Forum in Chicago, I saw a nice study by Xiao
Ying Zhang, Bill Grimes, Lijuan Zhai, and Cindy Wijma (all of San Diego CCD).
Although it focused on the issue of academic probation (a binary dependent variable),
it illustrated another approach (specifically the Cox regression flavor of event history
analysis) to analyzing covariates (possible moderators and causal factors) to understand
some phenomenon. Of course, their use of event history analysis made more sense
to me because I had just finished taking a really good workshop on event history analysis
taught by another OCAIR colleague (Chau-Kuang Chen). In your situation, event
history analysis may help because you probably have "censored" data (and this is
the statistical term, not the political one).
Anyway, good luck on the analysis...
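
For readers new to the term, "censored" means a student was still enrolled when observation stopped, so the eventual departure time is unknown. A minimal Kaplan-Meier sketch in Python shows how such records still count toward the at-risk group. The data are invented, and this is the simplest event history estimator, not the Cox regression used in the study mentioned above.

```python
def kaplan_meier(durations, events):
    """Kaplan-Meier survival estimate.

    durations: terms each student was observed.
    events: 1 if the student left in their last observed term,
            0 if still enrolled when observation ended (censored).
    Returns a list of (term, estimated probability of still being enrolled).
    """
    survival = 1.0
    curve = []
    event_terms = sorted({d for d, e in zip(durations, events) if e == 1})
    for t in event_terms:
        # Censored students count as at risk up to their last observed term.
        at_risk = sum(1 for d in durations if d >= t)
        left = sum(1 for d, e in zip(durations, events) if d == t and e == 1)
        survival *= 1.0 - left / at_risk
        curve.append((t, survival))
    return curve

# Hypothetical records for six students.
durations = [1, 2, 2, 3, 4, 4]
events = [1, 0, 1, 1, 0, 0]

curve = kaplan_meier(durations, events)
for term, prob in curve:
    print(f"after term {term}: estimated retention {prob:.3f}")
```

Dropping the censored students instead would understate retention, which is the main reason to prefer an event history method when follow-up is incomplete.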
Michael Tamada:
The responses have been very good; here's a follow-up to Willard's response.
One piece of good news is that SPSS does survival analysis, including Cox regression
(I believe it's in the Advanced module).
One piece of potential bad news is that it seems to me that attrition is not a permanent
condition -- students can and do return. And leave again. So probably
one should use a "multiple spells" model; these are more complicated and I don't think
SPSS can run them.
However, Lan was chiefly interested in looking just at Spring 2006 enrollment or attrition;
with just one term to look at, survival analysis is pretty much moot, since a student is
either here or gone.
Chau-Kuang Chen:
To: Tamada, Michael; Hom, Willard; Hongmei Zhu; Lan Hao
Your responses are extremely valuable to our knowledge base.
From what I have learned, survival analysis allows researchers to study the timing
and duration of the critical event, that is, the time (continuous or discrete) until a
(binary) event occurs.
For a continuous time variable, researchers can use standard Cox regression, stratified
Cox regression, extended Cox regression, and so forth.
For a discrete time variable, researchers
can use binary logistic regression with dummy time variables. A multiple-spells model
is an example of discrete-time survival analysis. Here is an excellent paper: Singer,
J.D. & Willett, J.B. (1993). It's about time: Using discrete-time survival analysis
to study the duration and the timing of events. Journal of Educational Statistics,
18, 155-195.
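
The discrete-time approach rests on reshaping the data to one row per student per term (a "person-period" file) and then running an ordinary binary logistic regression with term dummies on those rows. A minimal Python sketch of the reshaping step follows. The four student records are invented, and to_person_period and hazard_by_term are hypothetical helper names.

```python
from collections import defaultdict

# Hypothetical records: (student id, terms observed, left).
# left = 1 if the student dropped in the last observed term, 0 if censored.
students = [("A", 3, 1), ("B", 2, 0), ("C", 1, 1), ("D", 3, 0)]

def to_person_period(records):
    """Expand one row per student into one row per student per term."""
    rows = []
    for student_id, terms_observed, left in records:
        for term in range(1, terms_observed + 1):
            # The event indicator is 1 only in the term the student left.
            event = 1 if (left == 1 and term == terms_observed) else 0
            rows.append((student_id, term, event))
    return rows

def hazard_by_term(rows):
    """Share of at-risk students who left in each term."""
    at_risk = defaultdict(int)
    events = defaultdict(int)
    for _, term, event in rows:
        at_risk[term] += 1
        events[term] += event
    return {term: events[term] / at_risk[term] for term in sorted(at_risk)}

rows = to_person_period(students)
hazards = hazard_by_term(rows)
print(rows)
print(hazards)
```

A logistic regression on these rows with only the term dummies as predictors reproduces the per-term proportions computed here; adding grades, units, or aid status as further predictors then shows how each shifts the odds of leaving in a given term.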
For competing and/or recurrent event survival
data, there is an excellent book by Paul D. Allison titled Survival Analysis
Using the SAS System: A Practical Guide.
For Lan's project, I wonder if we can break
the time variable, one term, down into multiple months (time origin = beginning of the
term; endpoint = the critical event of interest (enrolled/attrited) or other events
unrelated to enrolled/attrited (censored); and end of the study period = end of the
term). Thus, we can detect in which month students are likely to experience the critical
event (enrolled/attrited).
I like Willard's idea to examine the lagged
effects of independent variables on enrollment because it has a theoretical justification.
For instance, the economy (e.g., changes in employment prospects) rarely changes its
patterns drastically within one or two years unless wars or political violence occur.