Description

Part Time (1 Year)

Supervised by: Prof. Dacheng Xiu (Professor of Econometrics and Statistics)

Zillow Offers: Financial Intermediation in the Real Estate Market via Machine Learning

  • Preprocessed real estate pricing error data from the Zillow Kaggle competition dataset in Python.
  • Performed exploratory and statistical data analysis for reports.
  • Developed supervised machine learning regression models (linear, regularized, and non-parametric) to predict pricing error; evaluated and compared the models using standard machine learning practices.
  • Identified systematic factors contributing to pricing error in real estate markets, such as location, tax variables, and house fundamentals.
  • Created assignments and lecture content from material studied in the project, with expositions and explanations of various machine learning theory topics for MBA students (e.g., k-fold cross-validation).
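
As an illustration of the k-fold cross-validation topic mentioned above, here is a minimal sketch in plain Python. It is not the project's code: the data are synthetic rather than the Zillow dataset, the model is a one-feature least-squares line, and the helper names (`k_fold_indices`, `fit_line`, `cross_validated_mse`) are hypothetical.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle indices 0..n_samples-1 and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fit_line(xs, ys):
    """Ordinary least squares for a single feature: returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def cross_validated_mse(xs, ys, k=5, seed=0):
    """Average held-out mean squared error over k folds."""
    fold_errors = []
    for held_out in k_fold_indices(len(xs), k, seed):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        # Fit on the training folds only, then score on the held-out fold.
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
        squared_errors = [(ys[i] - (slope * xs[i] + intercept)) ** 2 for i in held_out]
        fold_errors.append(mean(squared_errors))
    return mean(fold_errors)


# Synthetic data (an assumption for illustration): y = 2x + 1 plus Gaussian noise.
rng = random.Random(42)
xs = [i / 10 for i in range(100)]
ys = [2 * x + 1 + rng.gauss(0, 0.5) for x in xs]

cv_mse = cross_validated_mse(xs, ys, k=5)
print(f"5-fold CV mean squared error: {cv_mse:.3f}")
```

Each observation is held out exactly once, so the averaged error estimates out-of-sample performance without setting aside a separate test set. In practice a library routine such as scikit-learn's `KFold` would likely replace the hand-written splitter.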

LendingClub: P2P Credit Default using Machine Learning

  • Preprocessed P2P loan and credit default data from the public LendingClub dataset in Python.
  • Performed exploratory and statistical data analysis for reports.
  • Developed supervised machine learning classification models (linear, regularized, and non-parametric) to predict default rate; evaluated and compared the models using standard machine learning practices.
  • Identified systematic factors in the data that contribute to default rate and default probability.
  • Created assignments and lecture content from material studied in the project, with expositions and explanations of various machine learning theory topics for MBA students (e.g., classification metrics).
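
As an illustration of the classification metrics mentioned above, here is a minimal sketch in plain Python. It is not the project's code: the labels are a made-up toy example rather than LendingClub data, 1 is assumed to mean "default", and the function names (`confusion_counts`, `classification_metrics`) are hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = default)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 computed from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy labels (an assumption for illustration): 3 defaults among 10 loans.
y_true = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
y_pred = [1, 0, 0, 0, 0, 1, 0, 1, 0, 0]

metrics = classification_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

When defaults are rare, accuracy alone can look strong even if most defaults are missed, which is why precision, recall, and F1 are usually reported alongside it.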

Knowledge

  • Research
  • Statistical and Exploratory Data Analysis
  • Machine Learning
  • Linear Regression
  • Logistic Regression
  • LASSO and Ridge Regression
  • Decision Trees
  • Bagging Theory and Random Forests
  • Boosting Theory and XGBoost
  • Model Evaluation

Skills

  • Python
  • NumPy
  • Pandas
  • Scikit-Learn