TF7_Assumption_Simulation

Load data and packages

options(scipen=10)
pacman::p_load(latex2exp,Matrix,dplyr,tidyr,ggplot2,caTools,plotly)
rm(list=ls(all=TRUE))
load("data/tf4.rdata")

A. Examine the Predictions

B$Buy : Buying Probability
B$Rev : Expected Revenue Contribution

par(mfrow=c(1,2), cex=0.8)
hist(B$Buy)
hist(log(B$Rev,10))

group_by(B,age) %>% 
  summarise(n=n(), Buy=mean(Buy), Rev=mean(Rev)) %>% 
  ggplot(aes(Buy,Rev,size=n,label=age)) + 
  geom_point(alpha=0.5,color='gold') + 
  geom_text(size=4) + 
  labs(title="Age Group Statistics (size: no. customers)") +
  scale_size(range=c(4,20)) + theme_bw()  -> p
ggplotly(p)

B. Flexiable Asssumption with Parameters

§ The S-Curve

🌻 S-Curve: a common cost effect function for most of the instruments

🌻 The built-in plogis() function can be used to emulate S curves

\[\Delta P(x|m,b,a) = m \cdot Logis(\frac{10(x - b)}{a})\]

# We define a `DP()` function so that its parameters specify
#   `m` : the height of the S curve - the maximum effect 
#   `b` : the mid point of the rising slope - the medium cost
#   `a` : the width of the rising slope - the sensitive range of effect
# respectively.
#
DP = function(x,m0,b0,a0) {m0*plogis((10/a0)*(x-b0))}
par(mar=c(4,4,2,1),cex=0.7)
curve(DP(x,m=0.20,b=30,a=40), 0, 60, lwd=2, ylim=c(0, 0.25),
      main="F( x | m=0.2, b=30, a=40 )", ylab="delta P")
abline(h=seq(0,0.2,0.05),v=seq(0,60,5),col='lightgrey',lty=2)

§ Parameters in the S Curve

🌻 parameters for flexibility

🌻 With these three parameters:

m : the height of the S curve - the maximum effect
b : the mid point of the rising slope - the medium cost
a : the width of the rising slope - the effective range of cost

🌻 With parameters, all possible S curves can be generated with the same code

Thereof, we can write one program to emulate all possible Cost-Effect Functions that specify the relationship between the cost ($x$, the face values of cash coupons) and The effect ($\Delta p$, the increment of buying probability) of the marketing instrument.

§ Estimate the Expected Payoff

With the Cost-Effect Function for an Instrument, we can evaluate the net expected payoff when apply it to anyone of our customers. Because probability is cap at 1, we need a conditional equation as below.

\[\hat{R}(x) = \left\{\begin{matrix} \Delta P \cdot M \cdot margin - x & , & P + \Delta P \leq 1\\ (1-P) \cdot M \cdot margin - x & , & else \end{matrix}\right.\]

🌻 By combining …

Predictions ($P, M$): Customers’ Buying Probability & Expected Buying Amount
Assumption $\Delta P(x|m,b,a)$: The increments of Buying Probability

we can evaluate the Net Expected Payoff when apply it to anyone of our customers - $\hat{R_i}(x)$

🌻 Note that both $\Delta P$ and $\hat{R}$ are functions of $x$ given $m,b,a$

$P, M$ are Predictions acquired from the models
$m, b, a$ - the attribute of marketing instruments, are Assumptions
$x$ - Strength of Marketing (cost), is an Strategic Variable to be optimized

Assuming the Raw Margin $m$

# load(data/tf0.rdata)
# group_by(Z0, age) %>% summarise(sum(price)/sum(cost) - 1)
margin = 0.17  # assume margin = 0.17

Estimate Net Expected Payoff $\hat{R_i}(x|m,b,a)$

m=0.2; b=25; a=40; x=30
dp = pmin(1-B$Buy, DP(x,m,b,a))
eR = dp*B$Rev*margin - x
hist(eR,main="Dist. of Net Expected Payoff",
     xlab="Net Expected Payoff",ylab="Number of Customers")

§ Group Execise

According to the previous estimation …

🚴 How many customers’ have positive expected payoffs？ (eR > 0)?

sum(eR>0)

## [1] 6679

🚴 If we apply this instrument to every customer, what is the Total Expected Payoff？

sum(eR)

## [1] -202435.4

🚴 What if we only apply to those who have positive expected payoff …

sum(eR[ eR>0 ])

## [1] 80358.8

🚴 What if we only apply to those who have expected payoff larger than 10 …

sum(eR[ eR>10 ])

## [1] 63812.07

🚴 What if we only apply to those who live in z1115 and have expected payoff larger than 10 …

sum(eR[ eR>10 & B$area=="z115"   ])

## [1] 12532.01

C. Market Simulation

§ One Markerting Instrument

Given parameters ($m,b,a$), we can estimate and visualize how its effect varies with $x$:

eReturn: Total Expected Payoff when apply to All Customers
N: The number of customers whose expected payoff is positive
eReturn2: Total Expected Payoff when apply to those who has positive payoff

Note that all of the above numbers vary with the strength of marketing (cost) - $x \in [10,45]$.

m=0.2; b=25; a=40; X = seq(10,45,1)

df = sapply(X, function(x) {
  dp = pmin(DP(x,m,b,a),1-B$Buy)
  eR = dp*B$Rev*margin - x
  c(x=x, eReturn=sum(eR), N=sum(eR > 0), eReturn2=sum(eR[eR > 0]))
  }) %>% t %>% data.frame  

df %>% gather('key','value',-x) %>% 
  ggplot(aes(x=x, y=value, col=key)) + 
  geom_hline(yintercept=0,linetype='dashed') +
  geom_line(size=1.5,alpha=0.5) + 
  facet_wrap(~key,ncol=1,scales='free_y') + theme_bw()

## Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
## ℹ Please use `linewidth` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.

§ Multiple Instruments

With some modification of the code, we can define and simulate multiple (4) instruments at once. First, let’s define the instruments by specifying their parameter combinations.

mm=c(0.20, 0.25, 0.15, 0.25)
bb=c(  25,   30,   15,   30)
aa=c(  40,   40,   30,   60) 
X = seq(0,60,2) 
do.call(rbind, lapply(1:length(mm), function(i) data.frame(
  Inst=paste0('Inst',i), Cost=X, 
  Gain=DP(X,mm[i],bb[i],aa[i])
  ))) %>% data.frame %>% 
  ggplot(aes(x=Cost, y=Gain, col=Inst)) +
  geom_line(size=1.5,alpha=0.5) + theme_bw() +
  ggtitle("Prob. Function: f(x|m,b,a)")

Now run simulation on these instrument to compare their cost effectiveness.

X = seq(10, 60, 1) 
df = do.call(rbind, lapply(1:length(mm), function(i) {
  sapply(X, function(x) {
    dp = pmin(1-B$Buy, DP(x,mm[i],bb[i],aa[i]))
    eR = dp*B$Rev*margin - x
    c(i=i, x=x, eR.ALL=sum(eR), N=sum(eR>0), eR.SEL=sum(eR[eR > 0]) )
    }) %>% t %>% data.frame
  })) 

df %>% 
  mutate_at(vars(eR.ALL, eR.SEL), function(y) round(y/1000)) %>% 
  gather('key','value',-i,-x) %>% 
  mutate(Instrument = paste0('I',i)) %>%
  ggplot(aes(x=x, y=value, col=Instrument)) + 
  geom_hline(yintercept=0, linetype='dashed', col='blue') +
  geom_line(size=1.5,alpha=0.5) + 
  xlab('Cost') + ylab('Expected Payoff($K)') + 
  ggtitle('Comparing Marketing Instruments',
          'assuming the effect is a function of cost') +
    facet_wrap(~key,ncol=1,scales='free_y') + theme_bw() -> p

plotly::ggplotly(p)

With in df we have the columns

i : instrument id
x : unit cost
eR.ALL : total expected payoff when apply to all customers
N : the number of customers with positive expected net payoff
eR.SEL : total expected payoff when only apply to the N customers

The Optimal Setting for each instruments can be extracted simply by …

group_by(df, i) %>% top_n(1,eR.SEL)

## # A tibble: 4 × 5
## # Groups:   i [4]
##       i     x   eR.ALL     N  eR.SEL
##   <dbl> <dbl>    <dbl> <dbl>   <dbl>
## 1     1    34 -215958.  7027  98033.
## 2     2    40 -194886.  8344 146568.
## 3     3    22  -49192. 10262 108802.
## 4     4    43 -305864.  6495 112184.

D. Discussion

par(cex=0.7, mar=c(2,2,1,2))
table(B$age) %>% barplot  # the number of customers in every age groups

🚴 討論：
If the 4 parameter combinations above each represents the effect of an instrument to certain age group:
■ I1 : a24, a29
■ I2 : a34, a39
■ I3 : a44, a49
■ I4 : a54, a59, a64, a69
Please find the optimal strategy for each age group in terms of:
■ the number of customers to be marketed (N)？
■ the optimal cost of instrument (x)？
■ the total expected payoff (eR.SEL)？

ci = sapply(
  list(c("a24","a29"),c("a34","a39"),
       c("a44","a49"),c("a54","a59","a64","a69")), 
  function(v) B$age %in% v)  

X = seq(10, 60, 1) 
df = do.call(rbind, lapply(1:length(mm), function(i) {
  sapply(X, function(x) {
    dp = pmin(1- B$Buy[ ci[,i] ]  , DP(x,mm[i],bb[i],aa[i]))
    eR = dp* B$Rev[ ci[,i] ]  *margin - x
    c(i=i, x=x, eR.ALL=sum(eR), N=sum(eR>0), eR.SEL=sum(eR[eR > 0]) )
    }) %>% t %>% data.frame
  })) 

group_by(df, i) %>% top_n(1,eR.SEL)

## # A tibble: 4 × 5
## # Groups:   i [4]
##       i     x  eR.ALL     N eR.SEL
##   <dbl> <dbl>   <dbl> <dbl>  <dbl>
## 1     1    34 -51966.   560  6472.
## 2     2    40 -29651.  4083 74282.
## 3     3    22  -4068.  3131 34746.
## 4     4    43 -84668.   643  9403.