Written By : Henry Wong

Edited By : Patrick Dillion

Every school year, the University of Missouri (UM) System publishes the wages of almost everyone on their payroll as required due to the data being Public Record. Since our blog is based in Missouri S&T, I decided to take a look into what tax payer money and our tuition is being put towards.

The data I am using can be found here.

Columns

BUSINESS_UNIT: The name of the specific business the employee belongs to, in this case it’s usually which school. There will be instances where the report groups different locations. For example, the Vice Chancellor and the Financial Operations Department is merged into one catagory.

Name: Simply the name of the person employed. Follows the format of “last_name”,”first_name” “middle_initial”. For example: “Wong, Henry C”.

DEPTNAME: The name of the specific department or role of the employee.

JOBTITLE: The specific job title of the employee. For example a professor will be under “PROFESSOR”, “PROF, AST” (Assistant Professor), “PROF, AST ADJ” (Assistant Adjunct Professor), etc.

RATE1: The 2017-2018 School Year Salary for that employee. One of the biggest flaws of this data set is that it doesn’t specify whether the number listed yearly or hourly wages. Thus, the W2 record is better guage to see how much the employee is actually making.

W2AMT-20172: What was filed in the W2 form for that specfic employee based on their earnings in 2017. The W2 form details almost all the wages paid to the employee.

SCHOOL: The school that the employee is currently located. For this analysis it’s all going to be “MST”.

Notes

There is a trend on this notebook that when looking at monetary values I will use “RATE1” first and then “W2AMT-20172”. I will try to label them throughout the chart but please keep in my this notebook is structured like so.

%matplotlib inline
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.plotly as py
dataFrame = pd.read_csv("data/UM_Salaries_2017_2018_clean.csv")

Now Let’s Limit to just Missouri S&T Data

rollaData = dataFrame.loc[dataFrame['SCHOOL'].isin(['MST'])]

Highest Salaried Employee at Missouri S&T for the 2017-2018 school year

rollaData['RATE1'].idxmax()
21741
dataFrame.iloc[[21741]]
BUSINESS_UNIT NAME DEPTNAME JOBTITLE RATE1 W2AMT-20172 SCHOOL
21741 ROLLA Wlezien,Richard Walter Dean Engineering & Computing DEAN 315000.0 111898.05 MST

Highest Paid Person at Missouri S&T in 2017

rollaData['W2AMT-20172'].idxmax()
21071
dataFrame.iloc[[21071]]
BUSINESS_UNIT NAME DEPTNAME JOBTITLE RATE1 W2AMT-20172 SCHOOL
21071 ROLLA Leu,Ming C Mechanical & Aerospace Engin PROFESSOR 222170.0 292824.57 MST

Let’s visualize the mean salary based on Departments

rollaByDept = rollaData.groupby('DEPTNAME',as_index=False)['RATE1','W2AMT-20172'].mean().sort_values(by='RATE1',ascending=False)
top50RollaDeptBySalary = rollaByDept[0:50]
plt.figure(figsize=(15, 15)) # width and height in inches
plt.title("2017-2018 Top 50 Paid Department at Missouri S&T by Mean Salary")
g = sns.barplot(x="RATE1", y="DEPTNAME",data=top50RollaDeptBySalary,palette="Set1")
g.set(xlabel="2017-2018 Salary($USD)",ylabel="Department Name")
[Text(0,0.5,'Department Name'), Text(0.5,0,'2017-2018 Salary($USD)')]

png

Since there are 2 metrics used here that show how much a person makes, we’ll graph by W2 forms as well

rollaByDeptW2 = rollaByDept.sort_values(by="W2AMT-20172",ascending=False)
rollaByDeptW2top50 = rollaByDeptW2[0:50]
plt.figure(figsize=(15, 15)) # width and height in inches
plt.title("2017-2018 Top 50 Paid Department at Missouri S&T by W2 Wages Report")
g = sns.barplot(x="W2AMT-20172", y="DEPTNAME",data=rollaByDeptW2top50,palette="Set1")
g.set(xlabel="2017 W2 Wages Report($USD)",ylabel="Department Name")
[Text(0,0.5,'Department Name'), Text(0.5,0,'2017 W2 Wages Report($USD)')]

png

We’re going to limit to just Rolla Professors here.

PROFESSOR: A tenure Professor

PROF, AST: An Assistant Professor

PROF, AST ADJUNCT: An Assistant Adjunt Professor

PROF, ASCO: An Associate Professor

PROF, AST TEACH: An Assistant Teaching Professor

rollaProfData = rollaData.loc[rollaData['JOBTITLE'].isin(['PROFESSOR','PROF, AST','PROF, AST ADJUNCT','PROF, ASOC','PROF, AST TEACH'])]
rollaDeptProf = rollaProfData.groupby('DEPTNAME',as_index=False)['RATE1','W2AMT-20172'].mean().sort_values(by='RATE1',ascending=False)
plt.figure(figsize=(15, 15)) # width and height in inches
plt.title("2017-2018 Mean Yearly Salary of Missouri S&T of Professors by Department")
g = sns.barplot(x="RATE1", y="DEPTNAME",data=rollaDeptProf,palette="Set1")
g.set(xlabel="2017-2018 Salary($USD)",ylabel="Department Name")
[Text(0,0.5,'Department Name'), Text(0.5,0,'2017-2018 Salary($USD)')]

png

Now by W2 Forms

rollaDeptProf = rollaDeptProf.sort_values('W2AMT-20172',ascending=False)
plt.figure(figsize=(15, 15)) # width and height in inches
plt.title("2017 Mean W2 Wages of Missouri S&T of Professors by Department")
g = sns.barplot(x="W2AMT-20172", y="DEPTNAME",data=rollaDeptProf,palette="Set1")
g.set(xlabel="2017-2018 Salary($USD)",ylabel="Department Name")
[Text(0,0.5,'Department Name'), Text(0.5,0,'2017-2018 Salary($USD)')]

png

Let’s only look at Tenure Professors

rollaTenure = rollaProfData.loc[rollaProfData['JOBTITLE'].isin(['PROFESSOR'])]
rollaProfessorDept = rollaTenure.groupby('DEPTNAME',as_index=False)['RATE1'].mean().sort_values(by='RATE1',ascending=False)
plt.figure(figsize=(15, 15)) # width and height in inches
plt.title("2018 Mean Yearly Salary of Missouri S&T of Tenure Professors by Department")
g = sns.barplot(x="RATE1", y="DEPTNAME",data=rollaProfessorDept,palette="Set1")
g.set(xlabel="2018 Yearly Salary($USD)",ylabel="Department Name")
[Text(0,0.5,'Department Name'), Text(0.5,0,'2018 Yearly Salary($USD)')]

png

By W2 Forms Now

rollaProfessorDeptbyW2 = rollaTenure.groupby('DEPTNAME',as_index=False)['W2AMT-20172'].mean()
rollaProfessorDeptbyW2 = rollaProfessorDeptbyW2.sort_values(by='W2AMT-20172',ascending=False)
plt.figure(figsize=(15, 15)) # width and height in inches
plt.title("2017 Mean W2 Wages Report of Missouri S&T of Tenure Professors by Department")
g = sns.barplot(x="W2AMT-20172", y="DEPTNAME",data=rollaProfessorDeptbyW2,palette="Set1")
g.set(xlabel="W2AMT-2017($USD)",ylabel="Department Name")

[Text(0,0.5,'Department Name'), Text(0.5,0,'W2AMT-2017($USD)')]

png