options(scipen=10, digits=3)
rm(list=ls(all=TRUE))
pacman::p_load(dplyr, ggplot2, arules, arulesViz)
load("data/tf0.rdata")
Below we demonstrate how to use market basket analysis to find association rules that drive sales of high-profit items.
tr
Before running a market basket analysis, the items in each order must be converted into a transactions object (tr).
## transactions in sparse format with
## 119422 transactions (rows) and
## 23789 items (columns)
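The source loads tr ready-made from data/tf0.rdata and does not show how it was built. As a minimal sketch, assuming the raw data has one row per order line, a transactions object could be constructed as follows. The data frame `orders`, its columns `order_id` and `item`, and the name `toy_tr` are hypothetical and used only for illustration.

```r
library(arules)

# Hypothetical order lines: one row per (order, item) pair.
orders <- data.frame(
  order_id = c(1, 1, 2, 2, 2, 3, 4, 4),
  item     = c("milk", "bread", "milk", "bread", "eggs", "eggs", "milk", "eggs"),
  stringsAsFactors = FALSE
)

# split() groups the items by order; as(..., "transactions") coerces the
# resulting list of character vectors into a sparse transactions object.
toy_tr <- as(split(orders$item, orders$order_id), "transactions")

toy_tr                  # prints the number of transactions (rows) and items (columns)
summary(size(toy_tr))   # distribution of basket sizes
```

Each basket must list an item at most once for this coercion, so duplicated order lines would need to be removed first.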
Next, use arules::apriori() to mine the association rules among items. We usually start with loose thresholds to obtain a broad set of candidate rules.
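The call itself is not preserved in this document, only its log. Judging from the log that follows (support 0.0001, confidence 0.25), it was probably of the form sketched here. This version runs on a small toy dataset, so its thresholds are much larger. The names `baskets`, `toy_tr`, and `toy_rules` are hypothetical, and `minlen = 2` is an added assumption that excludes rules with an empty lhs (the log reports `minlen` 1).

```r
library(arules)

# Toy baskets standing in for `tr`; the real object comes from data/tf0.rdata.
baskets <- list(
  c("milk", "bread"),
  c("milk", "bread", "eggs"),
  c("eggs"),
  c("milk", "eggs"),
  c("milk", "bread")
)
toy_tr <- as(baskets, "transactions")

# Loose thresholds first, to collect a broad set of candidate rules.
# With only five baskets, the support threshold has to be far higher
# than the 0.0001 reported for the real data.
toy_rules <- apriori(
  toy_tr,
  parameter = list(support = 0.2, confidence = 0.25, minlen = 2),
  control   = list(verbose = FALSE)   # suppress the progress log
)

summary(toy_rules)   # rule length distribution and quality measures
```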
## Apriori
##
## Parameter specification:
## confidence minval smax arem aval originalSupport maxtime support minlen
## 0.25 0.1 1 none FALSE TRUE 5 0.0001 1
## maxlen target ext
## 10 rules FALSE
##
## Algorithmic control:
## filter tree heap memopt load sort verbose
## 0.1 TRUE TRUE FALSE TRUE 2 TRUE
##
## Absolute minimum support count: 11
##
## set item appearances ...[0 item(s)] done [0.00s].
## set transactions ...[23789 item(s), 119422 transaction(s)] done [0.57s].
## sorting and recoding items ... [10166 item(s)] done [0.02s].
## creating transaction tree ... done [0.07s].
## checking subsets of size 1 2 3 4 5 6 7 8 9 done [0.70s].
## writing ... [9795 rule(s)] done [0.20s].
## creating S4 object ... done [0.05s].
## set of 9795 rules
##
## rule length distribution (lhs + rhs):sizes
## 2 3 4 5 6 7 8 9
## 1143 3362 2429 1385 874 448 136 18
##
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 2.00 3.00 4.00 3.95 5.00 9.00
##
## summary of quality measures:
## support confidence lift count
## Min. :0.00010 Min. :0.250 Min. : 4 Min. : 12
## 1st Qu.:0.00012 1st Qu.:0.433 1st Qu.: 75 1st Qu.: 14
## Median :0.00017 Median :0.632 Median : 264 Median : 20
## Mean :0.00023 Mean :0.638 Mean : 560 Mean : 28
## 3rd Qu.:0.00023 3rd Qu.:0.857 3rd Qu.:1106 3rd Qu.: 27
## Max. :0.00586 Max. :1.000 Max. :5331 Max. :700
##
## mining info:
## data ntransactions support confidence
## tr 119422 0.0001 0.25
We can then set conditions to find the association rules (lhs => rhs) that lead to purchases of high-revenue items (rhs):
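support: the share of all transactions that contain every item in the rule (lhs and rhs together)
confidence: the probability that rhs is purchased given that the lhs items are purchased
lift: the factor by which the probability of buying rhs increases when the lhs items are purchased (confidence divided by the base purchase probability of rhs)
count: the number of transactions that contain the rule (with too few transactions, the analysis has no practical meaning)
## lhs rhs support confidence lift count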
## [1] {4716114000312} => {4716114000329} 0.001273 0.553 231.6 152
## [2] {4716114000329} => {4716114000312} 0.001273 0.533 231.6 152
## [3] {4710154015138} => {4710154015206} 0.000996 0.374 52.1 119
## [4] {4713754987614} => {4713754987607} 0.001139 0.304 78.5 136
## [5] {4713754987607} => {4713754987614} 0.001139 0.294 78.5 136
## [6] {4710011402026} => {4710011402019} 0.002822 0.674 90.2 337
## [7] {4710088414328} => {4710088414311} 0.001792 0.466 86.2 214
## [8] {4710011401142} => {4710011406123} 0.001532 0.413 50.3 183
## [9] {4710085172702} => {4710085172696} 0.002428 0.540 62.0 290
## [10] {4710254049323} => {4710254049521} 0.002010 0.431 55.6 240
## [11] {4710011409056} => {4710011406123} 0.002629 0.414 50.4 314
## [12] {4710011409056} => {4710011401128} 0.004446 0.700 51.0 531
## [13] {4710085120093} => {4710085172696} 0.003743 0.498 57.2 447
## [14] {4710011401135} => {4710011401128} 0.005862 0.753 54.9 700
## [15] {4710011401142,
## 4710011409056} => {4710011401128} 0.001197 0.745 54.3 143
## [16] {4710011401135,
## 4710011401142} => {4710011406123} 0.000888 0.484 59.0 106
## [17] {4710011401135,
## 4710011401142} => {4710011401128} 0.001390 0.758 55.3 166
## [18] {4710011401142,
## 4710011405133} => {4710011401128} 0.001248 0.687 50.1 149
## [19] {4710011401128,
## 4710011401142} => {4710011406123} 0.001013 0.457 55.6 121
## [20] {4710085120093,
## 4710085172702} => {4710085172696} 0.001348 0.654 75.2 161
## [21] {4710085120093,
## 4710085172702} => {4710085120628} 0.001281 0.622 54.7 153
## [22] {4710085172696,
## 4710085172702} => {4710085120628} 0.001491 0.614 54.0 178
## [23] {4710085120628,
## 4710085172702} => {4710085172696} 0.001491 0.605 69.5 178
## [24] {4710011401135,
## 4710011409056} => {4710011406123} 0.001599 0.472 57.5 191
## [25] {4710011401135,
## 4710011409056} => {4710011401128} 0.002721 0.802 58.5 325
## [26] {4710011405133,
## 4710011409056} => {4710011406123} 0.001474 0.493 60.1 176
## [27] {4710011405133,
## 4710011409056} => {4710011401128} 0.002278 0.762 55.6 272
## [28] {4710011406123,
## 4710011409056} => {4710011401128} 0.001993 0.758 55.3 238
## [29] {4710011401128,
## 4710011409056} => {4710011406123} 0.001993 0.448 54.6 238
## [30] {4710085120093,
## 4710085172696} => {4710085120628} 0.002135 0.570 50.2 255
## [31] {4710085120093,
## 4710085120628} => {4710085172696} 0.002135 0.539 61.9 255
## [32] {4710011401135,
## 4710011405133} => {4710011406123} 0.001633 0.444 54.1 195
## [33] {4710011401135,
## 4710011405133} => {4710011401128} 0.002839 0.772 56.3 339
## [34] {4710011401135,
## 4710011406123} => {4710011401128} 0.002453 0.803 58.6 293
## [35] {4710011401128,
## 4710011401135} => {4710011406123} 0.002453 0.419 51.0 293
## [36] {4710011405133,
## 4710011406123} => {4710011401128} 0.002227 0.702 51.2 266
## [37] {4710011401128,
## 4710011405133} => {4710011406123} 0.002227 0.429 52.3 266
## [38] {4710011401135,
## 4710011401142,
## 4710011409056} => {4710011401128} 0.000946 0.856 62.5 113
## [39] {4710085120093,
## 4710085172696,
## 4710085172702} => {4710085120628} 0.000879 0.652 57.4 105
## [40] {4710085120093,
## 4710085120628,
## 4710085172702} => {4710085172696} 0.000879 0.686 78.8 105
## [41] {4710011401135,
## 4710011405133,
## 4710011409056} => {4710011406123} 0.000996 0.527 64.2 119
## [42] {4710011401135,
## 4710011405133,
## 4710011409056} => {4710011401128} 0.001574 0.832 60.7 188
## [43] {4710011401135,
## 4710011406123,
## 4710011409056} => {4710011401128} 0.001340 0.838 61.1 160
## [44] {4710011401128,
## 4710011401135,
## 4710011409056} => {4710011406123} 0.001340 0.492 60.0 160
## [45] {4710011405133,
## 4710011406123,
## 4710011409056} => {4710011401128} 0.001164 0.790 57.6 139
## [46] {4710011401128,
## 4710011405133,
## 4710011409056} => {4710011406123} 0.001164 0.511 62.3 139
## [47] {4710011401135,
## 4710011405133,
## 4710011406123} => {4710011401128} 0.001357 0.831 60.6 162
## [48] {4710011401128,
## 4710011401135,
## 4710011405133} => {4710011406123} 0.001357 0.478 58.2 162
## [49] {4710011401135,
## 4710011405133,
## 4710011406123,
## 4710011409056} => {4710011401128} 0.000862 0.866 63.1 103
## [50] {4710011401128,
## 4710011401135,
## 4710011405133,
## 4710011409056} => {4710011406123} 0.000862 0.548 66.8 103
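The filtering step that produced this listing is not shown in the source. Every rule listed has lift above 50 and count above 100, so the original filter presumably combined a set of target rhs items with thresholds of that kind. The sketch below first recomputes the three measures by hand for one toy rule, then applies a comparable filter with arules::subset(). The toy data, the `target_items` vector, and the thresholds are illustrative assumptions, not the author's values.

```r
library(arules)

baskets <- list(
  c("milk", "bread"),
  c("milk", "bread", "eggs"),
  c("eggs"),
  c("milk", "eggs"),
  c("milk", "bread")
)
toy_tr <- as(baskets, "transactions")
toy_rules <- apriori(
  toy_tr,
  parameter = list(support = 0.2, confidence = 0.25, minlen = 2),
  control   = list(verbose = FALSE)
)

# Hand-computed measures for the rule {bread} => {milk}.
n       <- length(toy_tr)
n_both  <- sum(sapply(baskets, function(b) all(c("bread", "milk") %in% b)))
n_bread <- sum(sapply(baskets, function(b) "bread" %in% b))
n_milk  <- sum(sapply(baskets, function(b) "milk" %in% b))

support    <- n_both / n                  # share of baskets with both items
confidence <- n_both / n_bread            # P(milk | bread)
lift       <- confidence / (n_milk / n)   # confidence relative to P(milk)

# Keep rules whose rhs is a target item and that clear the thresholds.
target_items <- c("milk")                 # hypothetical "high-revenue" items
picked <- subset(toy_rules, rhs %in% target_items & lift > 1 & count >= 2)
inspect(sort(picked, by = "lift"))
```

The same arithmetic can be checked against the listing above: rule [1] has count 152 out of 119422 transactions, and 152 / 119422 ≈ 0.001273, which matches its support column.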