MS Applied Statistics (Concentration: Machine Learning) Aug 2016 - Dec 2018
BA Economics (Minor: Mathematics) Aug 2010 - May 2015
Skills
R: Tidyverse, ggplot, CARET, nnet
Python: Scikit-Learn, Pandas, NumPy/SciPy, Jupyter Notebooks
SQL: MySQL, Oracle SQL
Microsoft Excel: Functions, pivots, visualizations
BI Tools: Business Objects, Power BI
Predictive Modeling: Supervised and unsupervised machine learning, statistical modeling
Work Experience
Business Analyst Aug 2016 - Aug 2018
• is one of the nation's top mortgage lenders. Wrote SQL queries, wrangled data, and designed BI reports to address business needs
• Applied fuzzy matching techniques in R to help executives merge spreadsheets with imperfect keys, previously an inconsistent and manual process
• Recognized for using innovative SQL queries (nested SELECTs, GROUP BY, MAX/MIN, RANK) to tackle difficult data requests
• Designed an automated report that ranked zip codes with a high proportion of late payers, frequently used in outreach programs to reduce late payments
Data Analyst May 2015 - Aug 2016
• is a tech start-up that functions similarly to Angie’s List. Primary role involved data wrangling and data analysis with Excel to pinpoint business inefficiencies
• Parsed and visualized customer complaints using Python and Excel to determine the most common issues each month, which were then addressed and reduced in the following month
• Built a state map in Power BI with contractor geolocations, used to find the closest contractors to new orders; improved on-time percentages by 20%
Projects
Predicting Drug Rehab Success (Python)
Classified drug rehabilitation patient outcomes (best model: neural network, 80% accuracy). Improved model performance with dimensionality reduction, feature engineering, and missing value handling
Predicting Song Genre (R)
Scraped lyrics and song info from Genius.com and classified songs into genres using term-document matrices and engineered features (text mining, sentiment analysis). Best model: random forest, 65% accuracy
Tour Guide Customer Classification (Python, NDA)
Classified potential returning customers on an imbalanced and confidential travel tour dataset (best model: neural network, 70% sensitivity)
Predicting Crime in Chicago (R, SAS)
Predicted the probability of different crimes occurring in various Chicago neighborhoods using multinomial logistic regression
Certifications
MIT 6.00.1x: Introduction to Computer Science using Python (edX)
Certified Predictive Modeler Using SAS Enterprise Miner 14 (SAS)