Harnessing your business' data
Table of Contents
1. Intro
If you have an E-commerce business, chances are that you have precious dataset that big marketing platforms don't.
This data can be used to launch a successful, well-targeted marketing campaign.
2. Plan
If you have 10000 customers; you can either, craft an advertisement for each customer, or, you can group customers that are alike and the craft an advertisement for each group; so instead of creating 10000 ads, you'll craft four or eight.
How would we group customers?
One of the models that are used for this purpose is the RFM model, it stands for:
Recency: Grouping by when was the customer last order.
Examples:
Note: To simplify the examples, H means high value and L means low value.
Customer ID Recency Recency grouping 68426 12 H 14628 45 H 56144 167 L 94247 378 L Frequency: Grouping by how many orders a customer has purchased.
Customer ID Frequency Frequency grouping 68426 1 L 14628 3 H 56144 2 H 94247 1 L Monetary value: Grouping by how much money a customer did spend.
Customer ID Monetary value Monetary value grouping 68426 20 L 14628 90 H 56144 40 L 94247 120 H
We will have a case study below further discussing what we can do with the grouping/clustering.
How we determine if value is high or low?
As long as our methods rely on statistic and common sense, we can chose multiple rankings and not just high and low.
But, for a start we can use the mean \(\mu\) as a reference point.
Example: a customer with a value that is above mean if the variable can be considered high(H).
We will mimic the e-commerce dataset with this public one Online Retail II by Daqing Chen.
We will be cleaning and manipulating the data to prepare it for the RFM model using logarithmic \(\log\) and normalization \(\frac{x-\mu}{\sigma} transformations\)
3. Result
The final data that we will be used to make marketing decision can be found Link.
Note: As it is stated above, all the data got scaled using logarithmic and normalize transformations.
Note: There are other methods that can be used like the K-means, but, we will be using the mean \(\mu\) as cluster point for easy explanation.
- Variables in the data:
- Customer_ID: Customer ID
- Time_since_last_invoice: Same as recency (When was the last time a customer ordered?)
- Invoice_Count: Same as frequency (How many times a customer did ordered?)
- Total_Spent: Same as monetary value (How much money a customer spent in total?)
- Cluster1: Customers cluster around one variable (Total_spent)
- Cluster2: Customers clutter around two variables (Total_spent and Time_since_last_invoice)
- Cluster3: Customers clutter around three variable (Total_spent, Time_since_last_invoice, and Invoice_Count)
- Customer_ID: Customer ID
3.1. Grouping based on one variable
Let's group customer onto two groups base on how much money they spent.
Down below is a histogram of the variable Total_Spent, after logarithmic and normalize transformation.
Using the mean \(\mu\) as a reference point, we have two groups, we can describe anything above it as high (H) and anything below it as low (L)
How can we interpret the result?
Let's describe each group base on their position relative to the variable and come up with a marketing action.
PS: This is an example, the marketing actions are just for brief explanation.
Group | T_S | Customer description | Action |
---|---|---|---|
H | \(\gt 0\) | High spenders | loyalty gift |
l | \(\leq 0\) | Low spenders | coupon |
What if we were to add another variable to the grouping phase?
3.2. Grouping based on two variables
Here we have a scatter plot of variables Total_Spent and Time_since_last_invoice yielding four customer groups, and beLow is the interpretation.
Group | T_S | T_s_l_i | Customer description | Action |
---|---|---|---|---|
H_H | \(\gt 0\) | \(\gt 0\) | High spenders, recent buyers | loyalty gift |
H_L | \(\gt 0\) | \(\leq 0\) | High spenders, old buyers | catalog of newest product |
L_H | \(\leq 0\) | \(\gt 0\) | Low spenders, recent buyers | coupons for future sales |
L_L | \(\leq 0\) | \(\leq 0\) | Low spenders, old buyers | coupons for newest product |
3.3. Grouping based on three variables
Group | T_S | T_s_l_i | I_C | Customer description | Action |
---|---|---|---|---|---|
H_H_H | \(\gt 0\) | \(\gt 0\) | \(\gt 0\) | High spenders, recent and recurring buyers | loyalty gift |
H_H_L | \(\gt 0\) | \(\gt 0\) | \(\leq 0\) | High spenders, recent and one-off buyers | special bundle + feedback |
H_L_H | \(\gt 0\) | \(\leq 0\) | \(\gt 0\) | High spenders, old and recurring buyers | updates of new products |
L_H_H | \(\leq 0\) | \(\gt 0\) | \(\gt 0\) | Low spenders, recent and recurring buyers | coupons + loyalty gift |
H_L_L | \(\gt 0\) | \(\leq 0\) | \(\leq 0\) | High spenders, old and one-off buyers | updates + special bundle |
L_H_L | \(\leq 0\) | \(\gt 0\) | \(\leq 0\) | Low spenders, recent and one-off buyers | coupons |
L_L_H | \(\leq 0\) | \(\leq 0\) | \(\gt 0\) | Low spenders, old and recurring buyers | updates +coupons |
L_L_L | \(\leq 0\) | \(\leq 0\) | \(\leq 0\) | Low spenders, old and one-off buyers | ask for feedback |
PS: Depending on the business' products price range, customers belonging to the group H_H_L and H_L_L can be more valuable than those belonging to H_H_H.
Example:
Assuming your business sells' two products, one is sold at $10 and the other at $1000; and giving that:
- H_H_H means Customers who spent a lot, just bought recently, and bought multiple times.
- H_H_L means Customers who spent a lot, just bought recently, and bought fewer times.
- H_H_L means Customers who spent a lot, bought recently long ago, and bought fewer times.
H_H_L and H_L_L groups bought fewer times, yet they were able to rank among the top spenders, it means that they bought the expensive product.
While group H_H_H, who bought multiple times, were able to rank among the top spenders because they bought the cheap product multiple times.
4. Final thoughts
Can we use more than variables in the grouping phase?
Yes
Can we have multiple ranks?
Yes, instead of H and L, we can have 1, 2, and 3 up to 10 if you desire.
Using the K-means algorithm, there is a method called the elbow method, it determines what is the optimal number of rank to use in a giving data set.