Using some series of stock prices, in this notebook we will learn how to …

We will also learn how to

First of all, we use read.csv() to read the following data files

+ data/IBMStock.csv
+ data/GEStock.csv
+ data/ProcterGambleStock.csv
+ data/CocaColaStock.csv
+ data/BoeingStock.csv

into data frames IBM, GE, PnG, CocaCola and Boeing, and put these data frames in a list object L.

L = list(
  IBM = read.csv("data/IBMStock.csv"),
  GE = read.csv("data/GEStock.csv"),
  PnG = read.csv("data/ProcterGambleStock.csv"),
  CocaCola = read.csv("data/CocaColaStock.csv"),
  Boeing = read.csv("data/BoeingStock.csv"))

We say a list is a collective object because it accommodates more than one sub-element. Collectives are also regarded as iteratives, because we can apply a function to each of their elements in turn. For example …

🌻 lapply(x, fun) applies fun to each element of x and returns the results in a list

lapply(L, class)
$IBM
[1] "data.frame"

$GE
[1] "data.frame"

$PnG
[1] "data.frame"

$CocaCola
[1] "data.frame"

$Boeing
[1] "data.frame"

🌻 sapply(x, fun) does the same thing as lapply, and additionally simplifies the resulting object whenever possible …

sapply(L, class)
         IBM           GE          PnG     CocaCola       Boeing 
"data.frame" "data.frame" "data.frame" "data.frame" "data.frame" 

Here it returns a named character vector, which is simpler than a list.

sapply(L, names)
     IBM          GE           PnG          CocaCola     Boeing      
[1,] "Date"       "Date"       "Date"       "Date"       "Date"      
[2,] "StockPrice" "StockPrice" "StockPrice" "StockPrice" "StockPrice"

Here sapply simplifies the result to a character matrix with one column per data frame. In one line of code, we see that each of these data frames has two columns, Date and StockPrice.
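As a small illustration of how sapply simplifies its result, here is a sketch on a made-up list (toy and ragged are hypothetical objects, not the stock data). When every call returns one value we get a named vector, when every call returns a vector of the same length we get a matrix, and when the lengths differ sapply falls back to a list.

```r
# Toy lists with made-up values (not the stock data)
toy    <- list(a = 1:3, b = 4:6, c = 7:9)
ragged <- list(a = 1:3, b = 1:5)

# One value per element: simplified to a named vector
s1 <- sapply(toy, sum)

# Two values per element: simplified to a matrix, one column per element
s2 <- sapply(toy, range)

# Results of different lengths: no simplification, a list is returned
s3 <- sapply(ragged, function(v) v[v > 1])

s1
s2
s3
```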

Besides the built-in functions, we can define our own functions. For example …

L = lapply(L, function(df) {
  df$Date = as.Date(df$Date, format = "%m/%d/%y")
  df
})

We define and apply a function that takes a data frame df, converts its Date column from text to the Date class, and returns the modified data frame. By applying it to L, we carry out all five date conversions in one shot.
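The same pattern can be tried on a small made-up list before running it on the real data. In the sketch below, toy and convert_date are hypothetical names and the values are invented. Note that with %y R reads two-digit years 69 to 99 as 19xx and 00 to 68 as 20xx, so "70" becomes 1970.

```r
# Two tiny data frames with made-up values, mimicking the Date/StockPrice layout
toy <- list(
  X = data.frame(Date = c("1/1/70", "2/1/70"), StockPrice = c(10, 12)),
  Y = data.frame(Date = c("1/1/70", "2/1/70"), StockPrice = c(20, 18))
)

# A named version of the conversion function (hypothetical helper name)
convert_date <- function(df) {
  df$Date <- as.Date(df$Date, format = "%m/%d/%y")  # text -> Date
  df                                                # return the modified data frame
}

# Apply the conversion to every data frame in the list
toy <- lapply(toy, convert_date)

# Check the class of the Date column in each data frame
date_class <- sapply(toy, function(df) class(df$Date))
date_class
toy$X$Date
```

Naming the function instead of writing it inline makes no difference to lapply. It only makes the function reusable.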

You will find the following questions easier to answer with lapply and sapply.
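As a hint, the sketch below shows the general pattern on a made-up list (toy is a hypothetical object with invented values, so the numbers are not answers to the questions). An anonymous function picks a column out of each data frame and computes a statistic on it.

```r
# Made-up data in the same Date/StockPrice layout
d <- as.Date(c("1970-01-01", "1970-02-01", "1970-03-01"))
toy <- list(
  X = data.frame(Date = d, StockPrice = c(10, 12, 14)),
  Y = data.frame(Date = d, StockPrice = c(20, 18, 25))
)

# Number of rows in each data frame
n_obs <- sapply(toy, nrow)

# Mean of one column in each data frame
mean_price <- sapply(toy, function(df) mean(df$StockPrice))

# Six summary statistics per data frame, simplified to a matrix
price_summary <- sapply(toy, function(df) summary(df$StockPrice))

# Year of the earliest date in each data frame
first_year <- sapply(toy, function(df) format(min(df$Date), "%Y"))

n_obs
mean_price
price_summary
first_year
```

Swapping mean for min, max, median or sd gives the other statistics in the same way.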

Section-1 Summary Statistics

§ 1.1 Our five datasets all have the same number of observations. How many observations are there in each data set?

#
#

§ 1.2 What is the earliest year in our datasets?

#
#

§ 1.3 What is the latest year in our datasets?

#
#

§ 1.4 What is the mean stock price of IBM over this time period?

#
#

§ 1.5 What is the minimum stock price of General Electric (GE) over this time period?

#
#

§ 1.6 What is the maximum stock price of Coca-Cola over this time period?

#
#

§ 1.7 What is the median stock price of Boeing over this time period?

#
#

§ 1.8 What is the standard deviation of the stock price of Procter & Gamble over this time period?

#
#


Section-2 Visualizing Stock Dynamics
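
The questions in this section refer to line plots of stock price against date. A minimal sketch of such a plot with base graphics is given below, using made-up data (toy, the colours and the date of the vertical marker are all assumptions chosen for illustration). plot() draws the first series, lines() adds a second one, and abline(v = ...) marks a date of interest.

```r
# Made-up monthly series: X rises steadily, Y falls steadily
dates <- seq(as.Date("1980-01-01"), by = "month", length.out = 24)
toy <- list(
  X = data.frame(Date = dates, StockPrice = 50 + 1:24),
  Y = data.frame(Date = dates, StockPrice = 80 - 1:24)
)

# Common y-axis range so that both lines fit in the plot
y_range <- range(sapply(toy, function(df) range(df$StockPrice)))

# First series, then the second one added on top
plot(toy$X$Date, toy$X$StockPrice, type = "l", col = "red",
     ylim = y_range, xlab = "Date", ylab = "StockPrice")
lines(toy$Y$Date, toy$Y$StockPrice, col = "blue")

# Dashed vertical line at a date of interest
abline(v = as.Date("1981-01-01"), lty = 2)
legend("top", legend = names(toy), col = c("red", "blue"), lty = 1)

# Year in which each series reaches its maximum
peak_year <- sapply(toy, function(df) format(df$Date[which.max(df$StockPrice)], "%Y"))
peak_year
```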

§ 2.1 Around what year did Coca-Cola have its highest stock price in this time period? Around what year did Coca-Cola have its lowest stock price in this time period?

#
#

§ 2.2 In March of 2000, the technology bubble burst, and a stock market crash occurred. According to this plot, which company’s stock dropped more?

#
#

§ 2.3 (a) Around 1983, the stock for one of these companies (Coca-Cola or Procter and Gamble) was going up, while the other was going down. Which one was going up?

#
#
(b) In the time period shown in the plot, which stock generally has lower values?
#
#


Section-3 Visualizing Stock Dynamics 1995-2005
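
To look at one period only, each data frame can be restricted to a date window before plotting or summarising. The sketch below shows one possible way on made-up data (toy and in_window are hypothetical names). Extra arguments given to lapply after the function are passed on to that function.

```r
# Made-up yearly series covering 1990 to 2009
dates <- seq(as.Date("1990-01-01"), by = "year", length.out = 20)
toy <- list(
  X = data.frame(Date = dates, StockPrice = 1:20),
  Y = data.frame(Date = dates, StockPrice = 20:1)
)

# Keep only the rows whose Date lies between from and to (inclusive)
in_window <- function(df, from, to) {
  df[df$Date >= as.Date(from) & df$Date <= as.Date(to), ]
}

# from and to are handed to in_window for every element of the list
sub <- lapply(toy, in_window, from = "1995-01-01", to = "2005-12-31")

# Rows left in each data frame and the highest price inside the window
n_sub   <- sapply(sub, nrow)
max_sub <- sapply(sub, function(df) max(df$StockPrice))
n_sub
max_sub
```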

§ 3.1 Which stock fell the most right after the technology bubble burst in March 2000?

#
#

§ 3.2 Which stock reaches the highest value in the time period 1995-2005?

#
#

§ 3.3 In October of 1997, there was a global stock market crash that was caused by an economic crisis in Asia. Comparing September 1997 to November 1997, which companies saw a decreasing trend in their stock price? (Select all that apply.)

#
#

§ 3.4 In the last two years of this time period (2004 and 2005) which stock seems to be performing the best, in terms of increasing stock price?

#
#