Employee Churn Prediction & Risk Analysis

This project predicts employee churn: the likelihood that an employee will leave the company. Attrition leads to costly recruitment, lost productivity, and weakened team morale. Using HR data on roughly 15,000 employees, I developed a machine learning model that identifies at-risk employees and provides actionable insights to guide retention strategies.

The goal: enable HR teams to act proactively by predicting churn before it happens.

Importance of Employee Churn Prediction

High employee turnover negatively impacts both financial and cultural aspects of a company. By predicting churn, organizations can:

  • Reduce costs: Save on recruitment, onboarding, and training expenses.

  • Improve productivity: Retain skilled employees to ensure continuity.

  • Target interventions: Focus retention efforts on at-risk employees.

  • Strengthen morale: Improve engagement by addressing dissatisfaction drivers.

Project Objectives

The primary objective of this project is to build a machine learning model that predicts whether an employee will leave the organization. Specifically, the project aims to:

  • Predict employee churn based on HR features (satisfaction, tenure, workload, salary, promotions, etc.).

  • Identify key factors driving churn to inform HR policies.

  • Provide department-level insights to highlight churn hotspots.

  • Deliver dashboards and reports for proactive HR decision-making.

Methodology

Dataset: HR data of ~15,000 employees, including:

  • Satisfaction level, last evaluation, number of projects, average monthly hours, tenure.

  • Salary level, promotions, work accidents, and departmental assignment.

Steps:

  1. Data Cleaning & Exploration: Removed duplicates, explored churn trends by department and tenure.

  2. Feature Engineering: Encoded categorical variables (salary, department).

  3. Model Development: Evaluated multiple algorithms: Logistic Regression, Decision Trees, Random Forest, and XGBoost.

  4. Evaluation: Compared models on accuracy, AUC, precision, recall, and F1-score.

  5. Deployment: Exported predictions to BigQuery for integration with HR dashboards.
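Steps 1 to 4 above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the data is synthetic, all column names (for example `satisfaction_level`, `tenure_years`, `left`) are hypothetical, the ordinal mapping for salary and one-hot encoding for department are assumed encoding choices, and XGBoost is left out to keep the example dependent on scikit-learn only.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def make_synthetic_hr_data(n_rows=2000, seed=0):
    """Synthetic stand-in for the HR table; every column name is hypothetical."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({
        "satisfaction_level": rng.uniform(0, 1, n_rows),
        "last_evaluation": rng.uniform(0.3, 1, n_rows),
        "number_project": rng.integers(2, 8, n_rows),
        "average_monthly_hours": rng.integers(90, 310, n_rows),
        "tenure_years": rng.integers(1, 11, n_rows),
        "work_accident": rng.integers(0, 2, n_rows),
        "promotion_last_5years": rng.integers(0, 2, n_rows),
        "department": rng.choice(["sales", "technical", "support", "hr"], n_rows),
        "salary": rng.choice(["low", "medium", "high"], n_rows),
    })
    # Assumed relationship, for the demo only: low satisfaction and long
    # hours raise the odds of leaving.
    logit = (3.0 * (0.5 - df["satisfaction_level"])
             + 0.01 * (df["average_monthly_hours"] - 200) - 1.0)
    df["left"] = (rng.uniform(0, 1, n_rows) < 1 / (1 + np.exp(-logit))).astype(int)
    return df


def prepare_features(df):
    """Drop duplicate rows, then encode the two categorical columns."""
    df = df.drop_duplicates().copy()
    # Assumption: salary is treated as ordinal and mapped to ordered integers.
    df["salary"] = df["salary"].map({"low": 0, "medium": 1, "high": 2})
    # Assumption: department has no natural order, so it is one-hot encoded.
    df = pd.get_dummies(df, columns=["department"], dtype=int)
    return df.drop(columns=["left"]), df["left"]


def evaluate_models(X, y, seed=0):
    """Fit each candidate model and collect the five comparison metrics."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=seed)
    models = {
        "logistic_regression": LogisticRegression(max_iter=1000),
        "decision_tree": DecisionTreeClassifier(random_state=seed),
        "random_forest": RandomForestClassifier(n_estimators=200, random_state=seed),
    }
    rows = []
    for name, model in models.items():
        model.fit(X_train, y_train)
        pred = model.predict(X_test)
        proba = model.predict_proba(X_test)[:, 1]  # probability of leaving
        rows.append({
            "model": name,
            "accuracy": accuracy_score(y_test, pred),
            "auc": roc_auc_score(y_test, proba),
            "precision": precision_score(y_test, pred, zero_division=0),
            "recall": recall_score(y_test, pred, zero_division=0),
            "f1": f1_score(y_test, pred, zero_division=0),
        })
    return pd.DataFrame(rows).set_index("model")


X, y = prepare_features(make_synthetic_hr_data())
print(evaluate_models(X, y).round(3))
```

The printed table has one row per model and one column per metric, which makes the side-by-side comparison in step 4 straightforward. Scores on this synthetic data say nothing about the results reported for the real dataset.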

Best Performing Model:

  • Random Forest Classifier

  • Accuracy: 93% on test data

  • AUC: 0.90

  • Key limitation: Recall was low on the imbalanced pilot data, meaning the model missed a share of actual leavers despite its high accuracy (further tuning needed).
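To see how high accuracy and low recall can coexist on imbalanced data, consider a small worked example. The confusion-matrix counts below are hypothetical, chosen only to illustrate the effect; they are not the project's actual results.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical pilot: 1,000 employees, of whom 100 actually left.
# The model flags 50 people; 40 of those flags are correct.
metrics = classification_metrics(tp=40, fp=10, fn=60, tn=890)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
# accuracy: 0.93
# precision: 0.80
# recall: 0.40
# f1: 0.53
```

In this illustration the model is right 93% of the time yet finds only 40 of the 100 leavers, because the 890 correctly predicted stayers dominate the accuracy figure. Common ways to trade some precision for recall include lowering the decision threshold or weighting the minority class (for example `class_weight="balanced"` in scikit-learn's `RandomForestClassifier`); which of these, if any, suits this project is an open question.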

Impact and Applications

The churn prediction model enables HR to:

  • Identify at-risk employees before they resign.

  • Pinpoint drivers of churn, such as low satisfaction, long tenure without promotion, and workload imbalance.

  • Target interventions like training, promotions, workload redistribution, or pay adjustments.

  • Guide strategy: Direct retention resources to high-risk departments (e.g., Sales, Technical, Support).
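One way to turn per-employee predictions into the department-level view mentioned above is a simple aggregation. The sketch below assumes a table of predictions with hypothetical columns `employee_id`, `department`, and `churn_probability`, and an arbitrary 0.5 cut-off for "high risk"; the values are made up.

```python
import pandas as pd


def department_risk(scores, threshold=0.5):
    """Summarise predicted churn risk per department, riskiest first."""
    flagged = scores.assign(high_risk=scores["churn_probability"] >= threshold)
    summary = flagged.groupby("department").agg(
        headcount=("employee_id", "count"),
        mean_risk=("churn_probability", "mean"),
        high_risk_count=("high_risk", "sum"),
    )
    # Share of each department's staff that is above the risk threshold.
    summary["high_risk_share"] = summary["high_risk_count"] / summary["headcount"]
    return summary.sort_values("mean_risk", ascending=False)


# Made-up predictions for six employees.
scores = pd.DataFrame({
    "employee_id": [1, 2, 3, 4, 5, 6],
    "department": ["sales", "sales", "sales", "hr", "hr", "support"],
    "churn_probability": [0.9, 0.7, 0.2, 0.1, 0.3, 0.8],
})
print(department_risk(scores))
```

Sorting by mean risk puts the departments that most need attention at the top; reporting headcount alongside it helps avoid over-reading a high average in a very small team.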

Business Value: If HR acts on these predictions early, attrition could fall by an estimated 15–20%, lowering recruitment costs and improving overall workforce morale.

Key Insights

  • Top churn driver: Job satisfaction.

  • Other strong predictors: Tenure, number of projects, and average monthly hours.

  • Lower impact: Workplace accidents, salary extremes, and some departmental assignments.
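A ranking like the one above can be read from a fitted Random Forest's impurity-based feature importances. The sketch below is illustrative only: the data is synthetic, the column names are hypothetical, and the target is deliberately constructed so that satisfaction dominates, so it demonstrates the mechanism rather than reproducing the project's findings.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier


def rank_feature_importance(X, y, seed=0):
    """Fit a Random Forest and return its feature importances, highest first."""
    model = RandomForestClassifier(n_estimators=200, random_state=seed)
    model.fit(X, y)
    importances = pd.Series(model.feature_importances_, index=X.columns)
    return importances.sort_values(ascending=False)


rng = np.random.default_rng(0)
n = 1000
X_demo = pd.DataFrame({
    "satisfaction_level": rng.uniform(0, 1, n),
    "tenure_years": rng.integers(1, 11, n),
    "number_project": rng.integers(2, 8, n),
    "work_accident": rng.integers(0, 2, n),
})
# Assumption for the demo only: employees with satisfaction below 0.4 leave,
# with 5% of labels flipped at random as noise.
noise = rng.uniform(0, 1, n) < 0.05
y_demo = ((X_demo["satisfaction_level"] < 0.4) ^ noise).astype(int)

ranking = rank_feature_importance(X_demo, y_demo)
print(ranking.round(3))
```

Impurity-based importances sum to 1 and can favour features with many distinct values, so a permutation-based check (scikit-learn's `sklearn.inspection.permutation_importance`) is a reasonable cross-check before drawing policy conclusions.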

This project demonstrates how data science can transform HR strategy from reactive to proactive. Predictive churn analytics empowers companies to retain top talent, lower costs, and maintain a strong organizational culture.

Description

  • Personal Project

  • 19.07.2025

Employee attrition is costly: it increases recruitment expenses, reduces productivity, and lowers team morale. This project uses AutoML with PyCaret to predict employee churn and identify the key drivers of attrition. By analyzing HR data, the model highlights risk factors such as satisfaction, tenure, workload, and promotions, and surfaces them in Looker Studio so that HR can proactively address retention challenges and improve employee engagement.