Missouri S&T SIG-Data
Writer: Henry Wong
Contributors: Henry Wong, Erik Shively, Patrick Dillon, David Wood
On April 6th, 2018, the entire SIG-Data team at the time decided to drive 2 hours to Mizzou to compete in a competition called Datafest. There were 4 of us, Computer Scientists roaring to go, which is exactly what we did when they called out majors. We were 4 of the 5 Computer Science majors in the competition and the only team to use Python. This was the first real Data anything the group had done. Nonetheless, we still presented. Although we didn’t win anything, it was a great experience.
The data, provided by Indeed, the job listing search engine, consisted of almost 3 GB of job listings from the past few years.
The data was given to us in these columns:
Right off the bat we made a heatmap of the United States by Average Estimated Salary of each state:
Scatter Matrix and Educational Level vs. Estimated Salary
To get more ideas, we decided to go for a Scatter Matrix, or as we like to call it, the Shotgun Approach. Since we didn’t have 128 GB of memory right in front of us, we had to reduce the scatter matrix to only Missouri.
Scatter Matrix of Missouri Job Postings on Indeed
Unfortunately, there aren’t many correlations. The only one that notably stands out is that the number of characters in a job description (descriptionCharacterLength) correlates with the number of words in that description (descriptionWordCount).
But we did find something interesting. When looking at educationRequirement vs. estimatedSalary, we noticed that jobs that require No Education pay more than jobs that require a High School Education on Indeed.
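To put a number on a relationship like this one, a Pearson correlation coefficient can be computed for any pair of columns. The snippet below is a minimal sketch in plain Python and not necessarily how the team computed anything at the competition; the `pearson` helper and the sample postings are made up for illustration, and only the two column names come from the data set.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / sqrt(var_x * var_y)

# Made-up postings for illustration only (not the Indeed data).
postings = [
    {"descriptionCharacterLength": 600, "descriptionWordCount": 100},
    {"descriptionCharacterLength": 1250, "descriptionWordCount": 210},
    {"descriptionCharacterLength": 2400, "descriptionWordCount": 390},
    {"descriptionCharacterLength": 3100, "descriptionWordCount": 520},
]
chars = [p["descriptionCharacterLength"] for p in postings]
words = [p["descriptionWordCount"] for p in postings]
r = pearson(chars, words)  # close to 1 means a strong positive correlation
```

A value near 1 or -1 indicates a strong linear relationship, while a value near 0 matches what most panels of the scatter matrix show.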
Estimated Salaries by Education Level Required of Job Posting
Though there are more jobs that require no education, we still believe that we can conclude that jobs that require No Education on Indeed have a higher average salary.
|Education Requirement||Average Estimated Salary||Number of Job Postings|
|No Education (0)||80,415||4,485,567|
|High School Education (1)||50,076||2,351,840|
|Higher Education (2)||47,159||2,715,421|
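A summary like the one in this table comes from grouping postings by education level and taking the mean salary and the number of postings in each group. Here is a rough sketch in plain Python, assuming educationRequirement is encoded as 0, 1, or 2 as in the table; the `salary_by_education` helper and the sample rows are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def salary_by_education(postings):
    """Return {education level: (average salary, number of postings)}."""
    groups = defaultdict(list)
    for p in postings:
        groups[p["educationRequirement"]].append(p["estimatedSalary"])
    # Sort by level so the output order follows 0, 1, 2.
    return {level: (mean(s), len(s)) for level, s in sorted(groups.items())}

# Made-up rows for illustration only (not the Indeed data).
postings = [
    {"educationRequirement": 0, "estimatedSalary": 30000},
    {"educationRequirement": 0, "estimatedSalary": 90000},
    {"educationRequirement": 1, "estimatedSalary": 40000},
    {"educationRequirement": 2, "estimatedSalary": 55000},
    {"educationRequirement": 2, "estimatedSalary": 65000},
]
summary = salary_by_education(postings)
```

Reporting the count next to the mean makes it easier to see when groups differ in size, as they do here.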
To dig a bit deeper, we decided to look at the supervisingJob column.
Salary and whether a listing is a Supervising Job by Education Requirements
To determine whether a job was local or not, we utilized the clicks and localClicks columns. By dividing localClicks by clicks, we created a new column called localClicksToAllClicks. We then considered every job with a localClicksToAllClicks score of 0.5 or more to be a local job, and every job with a score below 0.5 to be not a local job. In the data, we represent a local job as “1” and a non-local job as “0”.
We decided to put our new column against estimatedSalary to see whether salaries would be higher for local jobs or not.
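The rule described above can be sketched in a few lines of plain Python. This is an illustration, not the team's actual code: the `add_local_flag` helper and the `isLocal` key are hypothetical names, and treating a posting with zero clicks as non-local is an assumption, since the write-up does not say how such rows were handled.

```python
def add_local_flag(posting, threshold=0.5):
    """Return a copy of the posting with localClicksToAllClicks and a 0/1 flag."""
    clicks = posting["clicks"]
    # Assumption: a posting with no clicks gets a ratio of 0.0 (non-local).
    ratio = posting["localClicks"] / clicks if clicks else 0.0
    return {
        **posting,
        "localClicksToAllClicks": ratio,
        "isLocal": 1 if ratio >= threshold else 0,  # hypothetical column name
    }

# Made-up postings for illustration only.
flagged = [
    add_local_flag({"clicks": 10, "localClicks": 5}),
    add_local_flag({"clicks": 10, "localClicks": 4}),
    add_local_flag({"clicks": 0, "localClicks": 0}),
]
```

Note that a ratio of exactly 0.5 counts as local, matching the "more than or equal to" wording of the rule.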
Average Estimated Salary Whether a Job is Local or Not
As seen, the Average Salary is higher for non-local jobs.
With this new column we decided to find out which job field has the most local jobs.
Number of Local Jobs by Job Field
5 Most and 5 Least Local Jobs by Job Field
These two graphs show the same data; showing only the 5 most and 5 least local job fields makes it easier on the eyes. As you can see, retail tops the chart for local jobs, with 26% of retail jobs being local.
Most of these results aren’t that surprising. Retail jobs are more likely to be local, while aviation listings aren’t clicked on by many people in the same area as the job.
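A ranking like this can be built by computing, for each job field, the share of its postings flagged as local and then sorting. The sketch below is plain Python under stated assumptions: `jobField` and `isLocal` are hypothetical column names (the write-up does not name them), and the sample rows are made up.

```python
from collections import defaultdict

def local_share_by_field(postings):
    """Return (field, share of local postings) pairs, most local first."""
    totals = defaultdict(int)
    local = defaultdict(int)
    for p in postings:
        totals[p["jobField"]] += 1
        local[p["jobField"]] += p["isLocal"]  # isLocal is 1 or 0
    shares = {field: local[field] / totals[field] for field in totals}
    return sorted(shares.items(), key=lambda kv: kv[1], reverse=True)

# Made-up rows for illustration only (not the Indeed data).
postings = [
    {"jobField": "retail", "isLocal": 1},
    {"jobField": "retail", "isLocal": 1},
    {"jobField": "retail", "isLocal": 0},
    {"jobField": "engineering", "isLocal": 1},
    {"jobField": "engineering", "isLocal": 0},
    {"jobField": "aviation", "isLocal": 0},
    {"jobField": "aviation", "isLocal": 0},
]
ranked = local_share_by_field(postings)
most_local = ranked[:2]    # with the real data this would be ranked[:5]
least_local = ranked[-2:]  # and ranked[-5:]
```

Ranking by share rather than by raw count keeps large job fields from dominating the chart just because they have more postings.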
- California, Washington, New York, Massachusetts, Maryland, and Virginia have the highest average salaries on Indeed
- Both within Missouri and across the US, jobs that require No Education have a higher average salary than jobs that require a High School Diploma
- Jobs on Indeed that require Higher Education are more likely to be Supervising Jobs than jobs that require No Education or a High School Diploma
- Customer Service jobs attract more interest locally than Aviation and Engineering jobs