Description
Part-Time (1 Year)
Supervised by: Prof. Dacheng Xiu (Professor of Econometrics and Statistics)
Zillow Offers: Financial Intermediation in the Real Estate Market via Machine Learning
- Preprocessed real estate pricing error data from the Zillow Kaggle competition dataset in Python.
- Performed exploratory and statistical data analysis for reports.
- Developed supervised machine learning regression models (linear, regularized, and non-parametric) to predict pricing error; evaluated and compared the models using standard machine learning practices.
- Identified systematic factors that contribute to pricing error in real estate markets, such as location, tax variables, and house fundamentals.
- Created assignments and lecture content for MBA students from material studied in the project, with expositions and explanations of machine learning theory topics (e.g., k-fold cross-validation).
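As an illustration of the k-fold cross-validation topic mentioned above, here is a minimal pure-Python sketch. It is not the project's code: the data are synthetic, and the helper names (`k_fold_indices`, `fit_simple_ols`, `cross_validated_mse`) are hypothetical and chosen only for this example. A one-feature least-squares fit stands in for the regression models.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Split range(n_samples) into k shuffled folds whose sizes differ by at most 1."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, starting at position i.
    return [indices[i::k] for i in range(k)]


def fit_simple_ols(xs, ys):
    """Least-squares fit of y = intercept + slope * x for a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def cross_validated_mse(xs, ys, k=5, seed=0):
    """Average held-out mean squared error over k folds."""
    fold_mses = []
    for held_out in k_fold_indices(len(xs), k, seed):
        held_out_set = set(held_out)
        train = [i for i in range(len(xs)) if i not in held_out_set]
        # Fit on the k-1 training folds only, then score on the held-out fold.
        intercept, slope = fit_simple_ols(
            [xs[i] for i in train], [ys[i] for i in train]
        )
        errors = [(ys[i] - (intercept + slope * xs[i])) ** 2 for i in held_out]
        fold_mses.append(sum(errors) / len(errors))
    return sum(fold_mses) / k


# Synthetic data (an assumption for illustration): y = 2 + 3x + unit-variance noise.
rng = random.Random(42)
xs = [i / 10 for i in range(100)]
ys = [2.0 + 3.0 * x + rng.gauss(0, 1) for x in xs]

cv_mse = cross_validated_mse(xs, ys, k=5)
print(f"5-fold CV mean squared error: {cv_mse:.3f}")
```

Each observation is held out exactly once, so the averaged error estimates out-of-sample performance without setting aside a separate test set. In practice, Scikit-Learn provides the same mechanics through its model selection utilities.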
LendingClub: P2P Credit Default using Machine Learning
- Preprocessed peer-to-peer loan and default data from the public LendingClub dataset in Python.
- Performed exploratory and statistical data analysis for reports.
- Developed supervised machine learning classification models (linear, regularized, and non-parametric) to predict default; evaluated and compared the models using standard machine learning practices.
- Identified systematic factors that contribute to default rate and default probability.
- Created assignments and lecture content for MBA students from material studied in the project, with expositions and explanations of machine learning theory topics (e.g., classification metrics).
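The classification metrics mentioned above can be sketched in a few lines of pure Python. This is an illustrative example, not the project's code: the labels are made up (1 = default, 0 = repaid), and the function names `confusion_counts` and `classification_metrics` are hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Count true positives, false positives, false negatives, and true negatives."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for binary labels (1 = positive class)."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up labels for illustration only: 1 = default, 0 = repaid.
y_true = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
y_pred = [1, 0, 0, 0, 0, 1, 0, 1, 0, 0]

metrics = classification_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

When defaults are rare, accuracy alone can look strong for a model that never predicts default, which is why precision and recall on the default class are usually reported alongside it.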
Knowledge
- Research
- Statistical and Exploratory Data Analysis
- Machine Learning
- Linear Regression
- Logistic Regression
- LASSO and Ridge Regression
- Decision Trees
- Bagging Theory and Random Forests
- Boosting Theory and XGBoost
- Model Evaluation
Skills
- Python
- NumPy
- Pandas
- Scikit-Learn