Project: HR Data Automation & Attrition Risk Modeling - using Excel

Role: Data Analyst

Tools Used: Advanced Microsoft Excel

Functions Demonstrated: XLOOKUP, TEXTBEFORE, TEXTAFTER, Nested IF/AND/OR, Data Validation, Transpose, Absolute Referencing


The Objective

To clean, consolidate, and analyze fragmented HR datasets and Training Needs Analysis (TNA) survey responses for over 1,100 employees. The goal was to eliminate manual cross-referencing by building an automated, formula-driven dashboard in Excel capable of flagging employees as "Flight Risks" based on multi-variable business constraints.

The Challenge and My Approach

The raw data was unstructured, featuring unextracted text (emails) and disconnected demographic tables. I utilized advanced Excel text manipulation to clean the data, dynamic lookup functions to integrate multiple tables, and complex conditional logic to translate raw survey scores into a functional HR dashboard.


Key Contribution and Technical Execution

1. Text Manipulation & Feature Extraction (Email List)

String Parsing: To determine which branch an employee worked for, I engineered a modern text extraction formula using TEXTBEFORE and TEXTAFTER. This dynamically pulled the "Company" name directly out of the raw email addresses (e.g., instantly extracting "gamma" from [email protected].

2. Advanced Lookups & Data Integration

Automated Retrieval: Rather than manually matching employee records, I utilized XLOOKUP to automatically pull Age and Salary data from the consolidated TNA sheet into the email list, using the Employee ID as the lookup value. This ensured data integrity across 1,100+ rows.

Relational Data Bridging: To retrieve the "Join Date" for the dashboard, I encountered a schema mismatch: the search input was an Email Address, but the target TNA table only contained Employee IDs. I engineered a nested XLOOKUP to dynamically fetch the ID from the Email List and feed it directly into the outer lookup, seamlessly bridging the two datasets without creating redundant helper columns.

3. Data Reshaping & Dynamic Matrix Building (Transposed Table)