1. R Programming Elements

💡 Intro. R Programming
R: An Interactive Data Object Computation Environment
  ■ Object: Data of every type and structure is an object
  ■ Expression: A combination of objects, operators and functions
  ■ Computation: Every expression is evaluated into a resultant object
  ■ Object Name: Every resultant object can be assigned to a name
  ■ Program: A sequence of defining or updating objects


1.1 Object

A simple numeric object

100
[1] 100

A numeric vector

c(10,20,30)
[1] 10 20 30

A character vector

c("Amy","Bob","Candy")
[1] "Amy"   "Bob"   "Candy"

Every object has its own data type/class and structure (str), which will be elaborated later.
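
As a quick preview, base R's class() and str() report the type and structure of an object. The sketch below applies them to the two vectors shown above; the names num and chr are introduced here only for illustration.

```r
# Illustrative names (not from the original text)
num = c(10, 20, 30)
chr = c("Amy", "Bob", "Candy")

class(num)    # "numeric"
class(chr)    # "character"
length(chr)   # 3

# str() prints a compact description: type, length and the leading values
str(num)
str(chr)
```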

1.2 Expression

An object itself

100
[1] 100

Objects & Operators/Parentheses

(100+c(10,20,30))/10
[1] 11 12 13

Objects, Operators/Parentheses & Functions

sqrt( (100+c(10,20,30))/10 )  
[1] 3.317 3.464 3.606
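
To see how R arrives at this result, the same expression can be evaluated one step at a time, since every sub-expression yields an object of its own. This is only a sketch; the intermediate names a, b and r are hypothetical.

```r
a = 100 + c(10, 20, 30)   # 100 is added to every element: 110 120 130
b = a / 10                # element-wise division: 11 12 13
r = sqrt(b)               # element-wise square root
round(r, 3)               # 3.317 3.464 3.606
```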


1.3 Object Name

Assign the resultant object to the object name x

x = (100+c(10,20,30))/10

x is a numeric vector

x
[1] 11 12 13

🌻 Object Names:

  • Strings starting with an English letter (optionally followed by more letters or digits) are acceptable names
  • Case Sensitive: Upper and lower-case letters are treated as different names
  • Object Names can be reused
  • When you assign a new value to a name, its original value is lost
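
The naming rules above can be tried out in a few lines. The names amt, AMT and score1 are made up for this sketch, so that the object x used in the surrounding text is left untouched.

```r
amt = 10
AMT = 20            # a different object: names are case sensitive
amt == AMT          # FALSE

score1 = 5          # letters followed by a digit form a valid name

amt = "ten"         # reusing the name: the old value 10 is lost
class(amt)          # "character"
```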

1.4 Functions

vector function

sqrt(x)    # log(), exp()
[1] 3.317 3.464 3.606

summary function

sum(x)     # mean(), median()
[1] 36

cascaded functions

mean(log(sqrt(x), base=10))
[1] 0.5391

the pipe operator %>%

library(magrittr)   # the %>% operator comes from the magrittr package
sqrt(x) %>% log(10) %>% mean
[1] 0.5391

🌻 Function:

  • Most R functions take vectors as their input (usually the 1st argument)
  • Basically there are two types of functions:
    • vector functions: +,-,*,/,sqrt,log,exp,…
    • summary functions: sum,mean,median,max,min,…
  • Most functions have many arguments; most arguments have default values
  • Default values strike a balance between convenience and flexibility
  • Place the cursor on the function name and press F1 to see the online help
  • Functions usually need to be cascaded
  • The pipe operator (%>%) makes it easier to write and read cascaded functions
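
A minimal sketch of two of the points above, using only base R: the difference between vector and summary functions, and how an argument with a default value can be omitted, named, or given by position. The vector x is redefined here so the block runs on its own.

```r
x = c(11, 12, 13)

# Vector functions return one value per input element
length(sqrt(x))        # 3
# Summary functions collapse the whole vector into a single value
length(sum(x))         # 1

# log() uses base e unless the 'base' argument is supplied
log(100)               # natural logarithm, about 4.605
log(100, base = 10)    # 2, argument given by name
log(100, 10)           # 2, same argument given by position
```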


2. Let’s do some Business Analytics (BA)

2.1 Evaluating the expected outcome

The most common mission of BA is to estimate the expected outcome of an uncertain event. Given an event which may lead to \(n\) different outcomes …

  • \(p_i\) : the probability of the \(i\)-th outcome, \(i \in \{1, \dots, n\}\)
  • \(v_i\) : the payoff of the \(i\)-th outcome

The expected value of its outcome can be evaluated as

\[\pi = \sum_{i=1}^{n} p_i \times v_i \qquad(1)\]

Define \(v_i\) & \(p_i\) as numeric vectors v and p

v = c(100,  50, -50,   0)
p = c(0.1, 0.2, 0.3, 0.4)

Compose formula (1) as an expression and assign the result of the evaluation to the object pi (note that this masks R's built-in constant pi until the object is removed with rm(pi))

pi = sum(p * v)

🌻 Originally R was designed as a tool for vector computation. Each pair of corresponding elements of p and v is multiplied separately, and sum() then adds up the products.

Print out the value of pi

pi
[1] 5
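
The element-wise multiplication can be inspected by printing the intermediate products before summing them. The sketch below repeats v and p so that it runs on its own; the check on sum(p) is an added sanity test, not part of the original program.

```r
v = c(100,  50, -50,   0)
p = c(0.1, 0.2, 0.3, 0.4)

all.equal(sum(p), 1)   # TRUE: the probabilities should add up to 1
p * v                  # 10 10 -15 0, one product per outcome
sum(p * v)             # 5
```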


2.2 A Simple Program

📋 Now we can use the program to evaluate the expected outcome of any event, as long as we know its \(p_i\) and \(v_i\). For example …

v = c(-150, 20, 50)
p = c(0.3, 0.3, 0.4)
sum(p * v)
[1] -19
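
One possible way to make the program reusable is to wrap it in a function. The helper below, expected_value(), is a hypothetical name introduced for this sketch; the input checks are an assumption about what a careful version might verify.

```r
# Hypothetical helper, not part of the original text
expected_value = function(v, p) {
  # guard against mismatched inputs and probabilities that do not sum to 1
  stopifnot(length(v) == length(p), isTRUE(all.equal(sum(p), 1)))
  sum(p * v)
}

expected_value(v = c(100, 50, -50, 0), p = c(0.1, 0.2, 0.3, 0.4))   # 5
expected_value(v = c(-150, 20, 50),    p = c(0.3, 0.3, 0.4))        # -19
```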


2.3 A Longer Program

📋 If we have a list of events, we can specify the events in a list object

cases = list(
  case1 = data.frame(v = c(100, 50, -50, 0), p=c(0.1, 0.2, 0.3, 0.4)),
  case2 = data.frame(v = c(-100, 50, 50), p=c(0.5, 0.4, 0.1)),
  case3 = data.frame(v = c(50, -150), p=c(0.6, 0.4)),
  case4 = data.frame(v = c(-100, -50, 0, 200), p=c(0.2, 0.2, 0.3, 0.3)),
  case5 = data.frame(v = c(-70, 50, 80), p=c(0.1, 0.5, 0.4)),
  case6 = data.frame(v = c(50, -150), p=c(0.4, 0.6))
  )

Then, we can evaluate and compare their expected outcomes in a few lines of code.

par(cex=0.75)
sapply(cases, function(cs) sum(cs$p * cs$v) ) %>% sort %>% 
  barplot(main="Expected Outcomes")
abline(h=seq(-80,40,20),lty=3)
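
To look at the numbers behind the bar plot, the sapply() step can be run without the plotting calls. The sketch below uses only two of the cases (the name cases2 is made up here) so that it stays short and runs on its own with base R.

```r
# Two of the cases from the list above, repeated so the block is self-contained
cases2 = list(
  case1 = data.frame(v = c(100, 50, -50, 0), p = c(0.1, 0.2, 0.3, 0.4)),
  case3 = data.frame(v = c(50, -150),        p = c(0.6, 0.4))
)

# sapply() calls the function once per list element and returns a named vector
ev = sapply(cases2, function(cs) sum(cs$p * cs$v))
sort(ev)          # case3 = -30, case1 = 5 (ascending order)
names(sort(ev))   # "case3" "case1"
```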