
Project: Predictive Analytics Capstone

  • Writer: Jackie Wang
  • Aug 19, 2020
  • 2 min read

Updated: Aug 21, 2020



In this project, I used several analytical techniques to recommend where and how a grocery store chain should expand. It is the capstone project of Udacity's Predictive Analytics for Business program.


Task One: Determine Store Formats for Existing Stores


Here, I prepared an analytical dataset from the available datasets (storeinformation.xls, storesalesdata, storedemographicdata). After changing data types, filtering the data, and creating new variables (Total Sales Per Store and Percentage Sales Per Category Per Store), I built the new analytical dataset through the following workflow.
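As a rough illustration of the two derived variables, here is a minimal Python sketch. It is not the workflow used in this post. The row layout (store ID, category, sales), the store IDs, and the sales figures are all made up for the example.

```python
from collections import defaultdict


def percentage_sales_per_category(rows):
    """Return {(store, category): percent of that store's total sales}.

    `rows` is an iterable of (store_id, category, sales) tuples.
    This layout is an assumption for illustration only.
    """
    total_per_store = defaultdict(float)   # Total Sales Per Store
    sales_per_pair = defaultdict(float)    # sales per (store, category)
    for store, category, sales in rows:
        total_per_store[store] += sales
        sales_per_pair[(store, category)] += sales

    # Skip stores with zero total sales to avoid dividing by zero.
    return {
        (store, category): 100.0 * sales / total_per_store[store]
        for (store, category), sales in sales_per_pair.items()
        if total_per_store[store] > 0
    }


# Hypothetical sample data, not taken from the project datasets.
rows = [
    ("S0001", "Produce", 300.0),
    ("S0001", "Dairy", 100.0),
    ("S0002", "Produce", 50.0),
    ("S0002", "Dairy", 150.0),
]
shares = percentage_sales_per_category(rows)
```

Using percentages instead of raw sales means a large store and a small store with the same product mix look alike, which is what matters when grouping stores by format.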


I used the variable Percentage Sales Per Category Per Store as the clustering input to a K-Means model. Three clusters is the best choice: it has the highest median and the most compact spread on both the Rand and Calinski-Harabasz (CH) indices.
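To make the CH index less of a black box, here is a small Python sketch of the standard Calinski-Harabasz formula: between-cluster dispersion divided by (k - 1), over within-cluster dispersion divided by (n - k). It is only an illustration, not the diagnostic tool used in this project. The function name and the toy points are assumptions.

```python
def calinski_harabasz(points, labels):
    """Calinski-Harabasz index for points that already have cluster labels.

    Higher values mean tighter, better separated clusters.
    """
    n = len(points)
    dims = len(points[0])

    # Group the points by cluster label.
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    k = len(clusters)
    if k < 2 or n <= k:
        raise ValueError("need at least 2 clusters and more points than clusters")

    def mean(vectors):
        return [sum(v[d] for v in vectors) / len(vectors) for d in range(dims)]

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    overall = mean(points)
    between = 0.0  # dispersion of cluster centroids around the overall mean
    within = 0.0   # dispersion of points around their own centroid
    for members in clusters.values():
        centroid = mean(members)
        between += len(members) * sq_dist(centroid, overall)
        within += sum(sq_dist(p, centroid) for p in members)

    if within == 0:
        return float("inf")  # every point sits exactly on its centroid
    return (between / (k - 1)) / (within / (n - k))


# Toy data: two obvious groups, labeled well and labeled badly.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
good = calinski_harabasz(points, [0, 0, 1, 1])
bad = calinski_harabasz(points, [0, 1, 0, 1])
```

The sensible labeling scores far higher than the scrambled one. The same logic applies when comparing different numbers of clusters over repeated runs.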


In Tableau, I dragged Zip Code to Detail on the Marks card and placed Longitude (generated) on Columns and Latitude (generated) on Rows. The visualization below shows the store locations, with color for cluster and size for total sales. All stores are located in California and fall into three clusters.

Task Two: Formats for New Stores


I joined the datasets, created estimation (80%) and validation (20%) samples, and used all variables to train three models (Decision Tree Model, Forest Model, and Boosted Model).

Based on the Model Comparison Report, I chose the Boosted Model to predict the best store format for each new store.
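The sampling and comparison step can be sketched in plain Python. This is a simplified illustration, not the tools used in this post: the helper names, the seed, the 100 placeholder records, and the example predictions are all assumptions, and overall accuracy is only one of the measures a comparison report may show.

```python
import random


def split_estimation_validation(records, estimation_share=0.8, seed=42):
    """Shuffle the records and split them into estimation and validation samples."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    cut = int(round(len(shuffled) * estimation_share))
    return shuffled[:cut], shuffled[cut:]


def accuracy(actual, predicted):
    """Share of validation records whose predicted format matches the actual one."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("actual and predicted must be non-empty and equally long")
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical record IDs standing in for the joined store dataset.
records = list(range(100))
estimation, validation = split_estimation_validation(records)

# Hypothetical actual vs. predicted store formats on a validation sample.
score = accuracy([1, 2, 3, 3], [1, 2, 3, 1])
```

Each candidate model is fit on the estimation sample only, and the one that does best on the validation sample is kept.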



After filtering the 10 new stores' data out of the storedemographicdata dataset, I scored it with the Boosted Model to determine which format each of the 10 new stores falls into.

Task Three: Predicting Produce Sales


I used the first 40 months of data to train the ETS and ARIMA models and held out 6 months to validate them. In the comparison of the time series models, the ETS model outperformed the ARIMA model on all measures, so ETS (M,N,M), with multiplicative error, no trend, and multiplicative seasonality, was selected. After running the following workflow, I obtained the predicted produce sales of all new stores for the next 12 periods.
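The holdout comparison comes down to a few error measures computed on the 6 held-out months. Here is a minimal Python sketch of that idea. It is not the comparison tool used in this post. The function names, the placeholder series, and the numbers are assumptions, and the measures shown (ME, RMSE, MAE, MAPE) are a common subset rather than the exact list from the report.

```python
import math


def split_holdout(series, holdout=6):
    """Split a series into a training part and the last `holdout` points."""
    if not 0 < holdout < len(series):
        raise ValueError("holdout must be between 1 and len(series) - 1")
    return series[:-holdout], series[-holdout:]


def forecast_errors(actual, forecast):
    """Common forecast accuracy measures on a holdout sample."""
    if not actual or len(actual) != len(forecast):
        raise ValueError("actual and forecast must be non-empty and equally long")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    errors = [a - f for a, f in zip(actual, forecast)]
    n = len(errors)
    return {
        "ME": sum(errors) / n,                               # mean error (bias)
        "RMSE": math.sqrt(sum(e * e for e in errors) / n),   # root mean squared error
        "MAE": sum(abs(e) for e in errors) / n,              # mean absolute error
        "MAPE": 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n,
    }


# 46 placeholder monthly values: 40 for training, 6 held out.
series = [float(month) for month in range(1, 47)]
train, holdout = split_holdout(series, holdout=6)

# Hypothetical actuals and forecasts, just to exercise the measures.
measures = forecast_errors([100.0, 200.0], [110.0, 190.0])
```

Running the same measures for each candidate model on the same holdout months gives a like-for-like comparison, and lower values are better for RMSE, MAE, and MAPE.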



I also used the ETS (M,N,M) model and the TS Forecast Tool to predict the next 12 months of produce sales for the existing stores. Finally, please take a look at the data visualization in Tableau.



Thank you!

 
 
 
