Harnessing your business' data

1. Intro
2. Plan
3. Result
4. Final thoughts

1. Intro

If you have an E-commerce business, chances are that you have precious dataset that big marketing platforms don't.
This data can be used to launch a successful, well-targeted marketing campaign.

2. Plan

If you have 10000 customers; you can either, craft an advertisement for each customer, or, you can group customers that are alike and the craft an advertisement for each group; so instead of creating 10000 ads, you'll craft four or eight.

How would we group customers?

One of the models that are used for this purpose is the RFM model, it stands for:

Recency: Grouping by when was the customer last order.
Examples:
Note: To simplify the examples, H means high value and L means low value.

Customer ID Recency Recency grouping

68426 12 H

14628 45 H

56144 167 L

94247 378 L
Frequency: Grouping by how many orders a customer has purchased.

Customer ID Frequency Frequency grouping

68426 1 L

14628 3 H

56144 2 H

94247 1 L
Monetary value: Grouping by how much money a customer did spend.

Customer ID Monetary value Monetary value grouping

68426 20 L

14628 90 H

56144 40 L

94247 120 H

Customer ID	Recency	Recency grouping
68426	12	H
14628	45	H
56144	167	L
94247	378	L

Customer ID	Frequency	Frequency grouping
68426	1	L
14628	3	H
56144	2	H
94247	1	L

Customer ID	Monetary value	Monetary value grouping
68426	20	L
14628	90	H
56144	40	L
94247	120	H

We will have a case study below further discussing what we can do with the grouping/clustering.

How we determine if value is high or low?

As long as our methods rely on statistic and common sense, we can chose multiple rankings and not just high and low.
But, for a start we can use the mean $\mu$ as a reference point.
Example: a customer with a value that is above mean if the variable can be considered high(H).

We will mimic the e-commerce dataset with this public one Online Retail II by Daqing Chen.
We will be cleaning and manipulating the data to prepare it for the RFM model using logarithmic $\log$ and normalization $\frac{x-\mu}{\sigma} transformations$

3. Result

The final data that we will be used to make marketing decision can be found Link.
Note: As it is stated above, all the data got scaled using logarithmic and normalize transformations.
Note: There are other methods that can be used like the K-means, but, we will be using the mean $\mu$ as cluster point for easy explanation.

Variables in the data:
- Customer_ID: Customer ID
- Time_since_last_invoice: Same as recency (When was the last time a customer ordered?)
- Invoice_Count: Same as frequency (How many times a customer did ordered?)
- Total_Spent: Same as monetary value (How much money a customer spent in total?)
- Cluster1: Customers cluster around one variable (Total_spent)
- Cluster2: Customers clutter around two variables (Total_spent and Time_since_last_invoice)
- Cluster3: Customers clutter around three variable (Total_spent, Time_since_last_invoice, and Invoice_Count)

3.1. Grouping based on one variable

Let's group customer onto two groups base on how much money they spent.
Down below is a histogram of the variable Total_Spent, after logarithmic and normalize transformation.
Using the mean $\mu$ as a reference point, we have two groups, we can describe anything above it as high (H) and anything below it as low (L)

How can we interpret the result?

Let's describe each group base on their position relative to the variable and come up with a marketing action.
PS: This is an example, the marketing actions are just for brief explanation.

Group	T_S	Customer description	Action
H	$\gt 0$	High spenders	loyalty gift
l	$\leq 0$	Low spenders	coupon

What if we were to add another variable to the grouping phase?

3.2. Grouping based on two variables

Here we have a scatter plot of variables Total_Spent and Time_since_last_invoice yielding four customer groups, and beLow is the interpretation.

Group	T_S	T_s_l_i	Customer description	Action
H_H	$\gt 0$	$\gt 0$	High spenders, recent buyers	loyalty gift
H_L	$\gt 0$	$\leq 0$	High spenders, old buyers	catalog of newest product
L_H	$\leq 0$	$\gt 0$	Low spenders, recent buyers	coupons for future sales
L_L	$\leq 0$	$\leq 0$	Low spenders, old buyers	coupons for newest product

3.3. Grouping based on three variables

Group	T_S	T_s_l_i	I_C	Customer description	Action
H_H_H	$\gt 0$	$\gt 0$	$\gt 0$	High spenders, recent and recurring buyers	loyalty gift
H_H_L	$\gt 0$	$\gt 0$	$\leq 0$	High spenders, recent and one-off buyers	special bundle + feedback
H_L_H	$\gt 0$	$\leq 0$	$\gt 0$	High spenders, old and recurring buyers	updates of new products
L_H_H	$\leq 0$	$\gt 0$	$\gt 0$	Low spenders, recent and recurring buyers	coupons + loyalty gift
H_L_L	$\gt 0$	$\leq 0$	$\leq 0$	High spenders, old and one-off buyers	updates + special bundle
L_H_L	$\leq 0$	$\gt 0$	$\leq 0$	Low spenders, recent and one-off buyers	coupons
L_L_H	$\leq 0$	$\leq 0$	$\gt 0$	Low spenders, old and recurring buyers	updates +coupons
L_L_L	$\leq 0$	$\leq 0$	$\leq 0$	Low spenders, old and one-off buyers	ask for feedback

PS: Depending on the business' products price range, customers belonging to the group H_H_L and H_L_L can be more valuable than those belonging to H_H_H.

Example:
Assuming your business sells' two products, one is sold at $10 and the other at $1000; and giving that:

H_H_H means Customers who spent a lot, just bought recently, and bought multiple times.
H_H_L means Customers who spent a lot, just bought recently, and bought fewer times.
H_H_L means Customers who spent a lot, bought recently long ago, and bought fewer times.

H_H_L and H_L_L groups bought fewer times, yet they were able to rank among the top spenders, it means that they bought the expensive product.
While group H_H_H, who bought multiple times, were able to rank among the top spenders because they bought the cheap product multiple times.

4. Final thoughts

Can we use more than variables in the grouping phase?

Yes

Can we have multiple ranks?

Yes, instead of H and L, we can have 1, 2, and 3 up to 10 if you desire.
Using the K-means algorithm, there is a method called the elbow method, it determines what is the optimal number of rank to use in a giving data set.

Group	T_S	Customer description	Action
H	\(\gt 0\)	High spenders	loyalty gift
l	\(\leq 0\)	Low spenders	coupon