Instacart is an online grocery store that operates through an app. Instacart already has very good sales, but more information about their sales pattern remained uncovered. I would like to perform an initial data and exploratory analysis of some of their data in order to
Data exploration was initially conducted in Python, including data wrangling & subsetting, consistency check, new variables deriving, grouping & aggregating as well as visualization.
The Population Flow chart illustrates the evolution of data sets, represented by the total count of rows.
This is an example of creating a ‘frequency_flag’ to target groups by ordering frequency.
Results were visualized in Python and included in a final report - check my Github.
1) Age at 40 was a 'watershed' with regard to the income (left-top).
2) The department popularity: 'produce', 'dairy eggs' and 'snacks' were top 3 departments in each region (right).
3) High-and middle-income customers with dependants had high sales independent of age groups (left-bottom).