Analyzing Employee Database Part 1: Build and analyze database with PostgreSQL

Project Overview and Purpose

The company was worried about its employees who were going to go to the retirement and needed to know:

  1. Who will retire in the next few years and who will be eligible for the retirement package?

  2. How many positions will the company need to fill?

The company used Excel to work with data and decided to upgrade methods and use SQL instead. So, we had 6 CSV files with the data at the beginning, and we needed to build a new employee database with SQL.

Projects Goals:

Create ERD, perform data modeling, and complete analysis on an employee database using SQL techniques.

Projects Tasks:

  • Design entity relationship diagrams (ERDs) that will apply to the data.

  • Create an SQL database in PgAdmin.

  • Import data into a database: Import and export CSV datasets into PgAdmin.

  • Create new tables, using different joins.

  • Cleaning the data.

  • Create queries that use data to answer questions.

  • Perform analysis of the employee database.

Results

Part 1: Create an ERD Diagram

An entity relationship diagram was created using quick database diagrams to gather and organize the key elements from various data tables. The data was imported using Postgres and the pgAdmin interface. SQL queries were written to create data tables by joining primary keys from different data sets together and establishing foreign keys. SQL queries were created to generate a list of retiring employees by title and a list of employees eligible for mentorship.

Part 2: The Number of Retiring Employees by Title

Create a Retirement Titles table

A Retirement Titles table was made that holds all the titles of employees who were born between January 1, 1952, and December 31, 1955.

Code:

Retirement Table:

The Retirement Titles table was exported as retirement_titles.csv

Remove duplicates

There were many duplicate entries in the Retirement Titles table because some employees had multiple titles in the database.

Using the distinct on SQL script, the duplicates were removed leaving the most recent job title to create a unique list of retiring employees.

Code:

Unique Retirement Table:

The Unique Titles table was exported as unique_titles.csv

Create a Retiring Titles table

A query was written and executed to create a Retiring Titles table that contains the number of titles filled by employees who are retiring.

Then a Retiring Titles table was created.

Code:

Retiring Titles Table:

The Retiring Titles table is exported as retiring_titles.csv

Part 3: The Employees Eligible for the Mentorship Program

Using the ERD as a reference, a mentorship-eligibility table was created that holds the current employees who were born between January 1, 1965 and December 31, 1965.

Create a Mentorship-eligibility table

Code:

Table:

The Mentorship Eligibility table is exported and saved as mentorship_eligibilty.csv

Part 4: A written report on the employee database analysis

The purpose of this analysis was to determine the number of retiring employees per title and identify employees who are eligible to participate in a mentorship program. The report helped management prepare for the “silver tsunami” as many current employees reach retirement age.

Results:

  1. From the finding of the eligible retirees, a high percentage of the workforce could retire at any given time.

  2. From the job titles of the eligible retirees, the breakdown is below:

29,414 Senior Engineer 28,254 Senior Staff 14,222 Engineer 12,243 Staff 4,502 Technique Leader 1,761 Assistant Engineer 2 Manager

Summary:

How many roles will need to be filled as the "silver tsunami" begins to make an impact?

90,398 vacancies are in urgent need to be filled out as soon as the workforce starts retiring at any given time.

Are there enough qualified, retirement-ready employees in the departments to mentor the next generation of employees?

There are 1,549 employees eligible for the mentorship program. 90,398 vacancies will need to be filled. So the gap is significant. It looks like the company does not have enough retirees to mentor the next generation of employees. As a next step, it would be useful to have a query that indicates a list of employees who will be retired for each following year. It could help the company to prepare better for the "silver tsunami".