Retention and Outcome Analysis

I analyzed 10+ years of student retention data for a university program to understand what drives drop-off.

The tools: R, Tableau, and logistic regression.

The takeaway? You can't improve retention if you don't segment your audience.

Project Snapshot

Client: A university Digital Humanities program
Goal: Understand which student groups were most and least likely to complete a competitive minor, and identify opportunities to improve outcomes
Full Report
Code Repository

The Challenge

The client suspected some students were struggling to complete the program but lacked the data-driven insights to pinpoint which groups were most affected or why. With limited resources, they needed to identify high-impact opportunities for improvement.

Our Approach

We conducted an end-to-end analytics project, including:

  • Data cleaning and integration across multiple sources
  • Exploratory analysis of demographic trends
  • Statistical modeling (logistic regression) to evaluate predictors of program completion
  • Visualization of both aggregate outcomes and predictive insights to support decision-making

Importantly, we analyzed not just individual demographic factors but also how they interact, surfacing intersectional risks and successes often missed in surface-level analysis.
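To make the interaction idea concrete, here is a minimal sketch in Python (the project itself used R). Everything in it is illustrative: the counts are made up, and the factor names and the helpers saturated_logit and predict are hypothetical, not taken from the project's code. With two yes/no factors, a logistic model that includes their interaction is saturated, so its coefficients can be read directly off the subgroup log-odds. A real fit with more predictors would use an iterative solver such as R's glm.

```python
import math

def logit(p):
    """Log-odds of a probability p (0 < p < 1)."""
    return math.log(p / (1 - p))

def inv_logit(x):
    """Map log-odds back to a probability."""
    return 1 / (1 + math.exp(-x))

def saturated_logit(cells):
    """Coefficients of logit(p) = b0 + b_a*a + b_b*b + b_ab*a*b.

    `cells` maps (a, b) -> (completed, total) for binary factors a and b.
    With all four cells present, the model reproduces each cell's rate exactly.
    """
    lo = {k: logit(done / n) for k, (done, n) in cells.items()}
    b0 = lo[(0, 0)]
    b_a = lo[(1, 0)] - b0
    b_b = lo[(0, 1)] - b0
    # The interaction is whatever the two main effects cannot explain.
    b_ab = lo[(1, 1)] - lo[(1, 0)] - lo[(0, 1)] + b0
    return b0, b_a, b_b, b_ab

def predict(coefs, a, b, use_interaction=True):
    """Predicted completion probability for factor levels a and b."""
    b0, b_a, b_b, b_ab = coefs
    x = b0 + b_a * a + b_b * b
    if use_interaction:
        x += b_ab * a * b
    return inv_logit(x)

# Made-up counts, NOT the study's data: (factor_a, factor_b) -> (completed, total)
cells = {
    (0, 0): (80, 100),
    (1, 0): (30, 50),
    (0, 1): (70, 100),
    (1, 1): (45, 50),
}
coefs = saturated_logit(cells)

# Compare what main effects alone would predict for the (1, 1) group
# with what the full model predicts once the interaction is included.
main_effects_only = predict(coefs, 1, 1, use_interaction=False)
with_interaction = predict(coefs, 1, 1)
print(f"main effects only: {main_effects_only:.2f}")
print(f"with interaction:  {with_interaction:.2f}")
```

In this toy data, each factor on its own is associated with a lower completion rate, yet the group that has both completes at the highest rate. A model without the interaction term would miss that pattern entirely.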

The dashboard below is interactive: slice the data by different dimensions, and use the slider in the lower right corner to view different time periods.

Key Findings

Once I grouped by demographics and time constraints, hidden risks appeared, just like in customer journeys.

  • The overall completion rate was strong (77%), but certain subgroups lagged behind, particularly students who were URM (underrepresented minority), female, and not first-gen.
  • First-generation students completed at higher rates than their continuing-gen peers.
  • A statistically significant interaction revealed that URM male students, despite initial assumptions, had some of the highest predicted completion probabilities, suggesting effective, if informal, support mechanisms.
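The kind of segmentation behind findings like these can be sketched in a few lines of Python. This is an illustration only: the records are invented (the counts were chosen so the overall rate lands on 77%, and say nothing about the real subgroups), and the field names and the helpers make_records, completion_rates, and lagging_groups are hypothetical.

```python
from collections import defaultdict

def make_records(urm, gender, first_gen, completed, total):
    """Build `total` synthetic student records, `completed` of them finishing."""
    return [
        {"urm": urm, "gender": gender, "first_gen": first_gen,
         "completed": 1 if i < completed else 0}
        for i in range(total)
    ]

def completion_rates(records, keys):
    """Return {subgroup: (completion rate, group size)} for the grouping keys."""
    counts = defaultdict(lambda: [0, 0])  # subgroup -> [completed, total]
    for r in records:
        group = tuple(r[k] for k in keys)
        counts[group][0] += r["completed"]
        counts[group][1] += 1
    return {g: (done / n, n) for g, (done, n) in counts.items()}

def lagging_groups(records, keys, min_n=5, gap=0.10):
    """Subgroups whose rate trails the overall rate by at least `gap`.

    Groups smaller than `min_n` are skipped because their rates are too noisy.
    """
    overall = sum(r["completed"] for r in records) / len(records)
    flagged = [
        (group, rate, n)
        for group, (rate, n) in completion_rates(records, keys).items()
        if n >= min_n and overall - rate >= gap
    ]
    return sorted(flagged, key=lambda item: item[1])

# Invented data, NOT the study's: (urm, gender, first_gen, completed, total)
records = (
    make_records(True, "F", False, 6, 12)
    + make_records(True, "M", False, 9, 10)
    + make_records(False, "F", False, 32, 40)
    + make_records(False, "M", True, 30, 38)
)

overall_rate = sum(r["completed"] for r in records) / len(records)
lagging = lagging_groups(records, ["urm", "gender", "first_gen"])

print(f"overall: {overall_rate:.0%}")
for group, rate, n in lagging:
    print(group, f"{rate:.0%}", f"n={n}")
```

The min_n guard matters in practice: intersectional subgroups get small quickly, and a rate computed on a handful of students can swing widely from year to year.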

The same techniques apply to business:

➡️ Want to reduce customer churn? → Model drop-off risk.
➡️ Want to grow Customer Lifetime Value (CLV)? → Segment by behavior, not just profile.

Impact & Next Steps

The program now has clear, data-driven direction for improving student success. Based on our recommendations, they are:

  • Prioritizing early outreach to at-risk groups
  • Investigating what's working for URM male students to scale support more broadly
  • Monitoring program equity with newly developed KPIs

Why It Matters

This project demonstrates how looking beyond averages and digging into overlapping identities can unlock powerful insights. It also shows that predictive analytics can reveal unexpected strengths and help programs (or businesses) focus resources where they matter most.

Services Provided

  • Data strategy and wrangling
  • Predictive analytics (logistic regression, interaction effects)
  • Custom visualization and reporting
  • Strategic recommendations based on quantitative findings

Predictive analytics isn't just for academia.

We'll help you find what's working, fix what's not, and move forward with confidence.