x[5] takes the 5th element of x
x[c(2,5)] takes the 2nd and the 5th elements of x
x[2:5] takes the 2nd, 3rd, 4th and the 5th elements of x
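The three forms above can be tried on any vector. Here is a minimal sketch, assuming a made-up vector x that is not part of the lesson's data:

```r
# Hypothetical vector, used only for illustration
x = c(10, 20, 30, 40, 50, 60)

x[5]        # a single position: the 5th element
x[c(2, 5)]  # a vector of positions: the 2nd and the 5th elements
x[2:5]      # a range of positions; 2:5 is shorthand for c(2, 3, 4, 5)
```

R counts positions from 1, so x[1] is the first element.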
First, let’s define some vectors.
freq = c(3L, 5L, 1L, 1L, 3L) # integer vector
amount = c(100, 168, 180, 280, 199) # numeric
member = c(FALSE, TRUE, FALSE, TRUE, TRUE) # logical
name = c("Amy", "Bob", "Cindy", "Danny", "Edward") # character
gender = as.factor( c("F", "M", "F", "M", "M") ) # factor
class = as.factor(c('A','B','A','B','A')) # factor
🌻 Position indexes take the format of
[1] 3 3
[1] M F M
Levels: F M
[1] M
Levels: F M
An atomic object is treated as a vector of length 1
[1] 100 168 180 199
Index can be used to
[1] 100 168
[1] 100 168 168 180 180 180 280 280 280 280
or
[1] 199 280 180 168 100
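The code that produced the outputs above is not shown here, so the following is only a sketch of expressions that would give the same results with the amount vector defined earlier. The variable names on the left are assumptions made for this illustration.

```r
amount = c(100, 168, 180, 280, 199)

dropped   = amount[-4]             # a negative position drops that element
first_two = amount[1:2]            # select a subset
repeated  = amount[rep(1:4, 1:4)]  # repeat: position i is taken i times
reversed  = amount[5:1]            # reorder: last element first

dropped
first_two
repeated
reversed
```

Because an index is itself just a vector of positions, anything that builds such a vector (c(), the colon operator, rep()) can be used inside the brackets.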
amount is an unnamed numeric vector
[1] 100 168 180 280 199
NULL
We can assign a name to each element of a collective object
Amy Bob Cindy Danny Edward
100 168 180 280 199
Now amount becomes a named numeric vector, and we can access its elements by name.
🌻 A name index is an
Bob Cindy
168 180
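As a sketch of the naming steps described above (the original code chunks are not shown, so this is one way to get the same outputs, using the amount and name vectors defined earlier):

```r
amount = c(100, 168, 180, 280, 199)
name = c("Amy", "Bob", "Cindy", "Danny", "Edward")

names(amount)              # NULL: the vector has no names yet
names(amount) = name       # attach one name to each element
amount                     # prints with the names as a header row

amount[c("Bob", "Cindy")]  # name index: pick elements by their names
```

Position indexes still work on a named vector, so amount[2] and amount["Bob"] refer to the same element.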
🌻 Conditional indexes are
[1] 3 5 3
🌻 Logical indexes let us select elements
Cindy Danny
180 280
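The following sketch shows one way conditional and logical indexes could produce the outputs above. The exact conditions used in the lesson are not shown, so the ones below (freq >= 3 and freq == 1) are assumptions chosen to match those outputs.

```r
freq = c(3L, 5L, 1L, 1L, 3L)
amount = c(Amy = 100, Bob = 168, Cindy = 180, Danny = 280, Edward = 199)

keep = freq >= 3   # a condition yields a logical vector: TRUE TRUE FALSE FALSE TRUE
freq[keep]         # elements where the condition holds: 3 5 3

# A logical vector can be written out directly ...
amount[c(FALSE, FALSE, TRUE, TRUE, FALSE)]
# ... or computed from another vector of the same length
amount[freq == 1]
```

Both of the last two lines select Cindy and Danny, because positions 3 and 4 are the ones marked TRUE.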
🗿 QUIZ: With the vectors defined below …
noBuy = c(3L, 5L, 1L, 1L, 3L) # Integer
height = c(175, 168, 180, 181, 169) # numeric
isMale = c(FALSE, TRUE, FALSE, TRUE, TRUE) # logical
name = c("Amy", "Bob", "Cindy", "Danny", "Edward") # character
gender = factor( c("F", "M", "F", "M", "M") ) # factor
skin_color = factor( c("black", "black", "white", "yellow", "white") ) # factor
Use index and math functions to answer the following questions …
🗿: list the names of the males
[1] "Bob" "Danny" "Edward"
🗿: list the names of those who are taller than 180
🗿: list the names of those who are taller than 180 and whose skin color is “yellow”
🗿: calculate the average height of males
[1] 172.7
🗿: calculate the total number of buys (noBuy) by females
🗿: count the number of white females
The data frame is the most common and useful data structure. Usually
df = data.frame(
  noBuy = c(3L, 5L, 1L, 1L, 3L),
  height = c(175, 168, 180, 181, 169),
  isMale = c(FALSE, TRUE, FALSE, TRUE, TRUE),
  name = c("Amy", "Bob", "Cindy", "Danny", "Edward"),
  gender = factor( c("F", "M", "F", "M", "M") ),
  skin_color = factor( c("black", "black", "white", "yellow", "white") ),
  stringsAsFactors=FALSE
)
A data frame is easier to
noBuy height isMale name gender skin_color
1 3 175 FALSE Amy F black
2 5 168 TRUE Bob M black
3 1 180 FALSE Cindy F white
4 1 181 TRUE Danny M yellow
5 3 169 TRUE Edward M white
to
noBuy height isMale name gender skin_color
2 5 168 TRUE Bob M black
to
F M
2 3
to
[1] 174.6
to
F M
177.5 172.7
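The calls behind the outputs above are not shown, so here is a hedged sketch of base R expressions that would give the same results. It rebuilds a reduced version of df (only the columns needed) so that it runs on its own. The name by_gender is an assumption made for this illustration.

```r
df = data.frame(
  height = c(175, 168, 180, 181, 169),
  name = c("Amy", "Bob", "Cindy", "Danny", "Edward"),
  gender = factor(c("F", "M", "F", "M", "M")),
  stringsAsFactors = FALSE
)

df                 # print the whole table
df[2, ]            # one row: Bob
table(df$gender)   # count rows per gender: F 2, M 3
mean(df$height)    # overall average height: 174.6

# average height within each gender: F 177.5, M about 172.7
by_gender = tapply(df$height, df$gender, mean)
by_gender
```

tapply() splits the first argument by the groups in the second and applies the function to each group, which is why a single call gives one mean per gender.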
Data frames are two-dimensional objects, so they take two indexes:
df[ row_idx, col_idx ]
We can index a data frame with all three forms of index:
df[c(1,2), c(2,3)]
df[c(1,2), c("noBuy","height")]
df[df$gender=="M", c("noBuy","height")]
and some other indexing forms:
df[c(1,2), ] selects all columns; likewise, an empty row index selects all rows
$: df$name selects a specific column
subset() & filter():
subset(df, height<175 & isMale)
subset(df, height<175 & isMale, name)
subset(df, height<175 & isMale)$name
subset(df, height<175 & isMale, c(name, noBuy))
Below are some examples …
height isMale
1 175 FALSE
2 168 TRUE
noBuy height isMale name gender skin_color
1 3 175 FALSE Amy F black
2 5 168 TRUE Bob M black
noBuy height isMale name gender skin_color
2 5 168 TRUE Bob M black
5 3 169 TRUE Edward M white
[1] "Bob" "Edward"
[1] "Bob" "Edward"
noBuy height isMale name gender skin_color
2 5 168 TRUE Bob M black
5 3 169 TRUE Edward M white
name
2 Bob
5 Edward
[1] "Bob" "Edward"
name noBuy
2 Bob 5
5 Edward 3
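For reference, here is a sketch of how the indexing forms listed earlier map to outputs like the ones above. The lesson's own code chunks are not shown, so the pairing is an assumption. The sketch rebuilds a reduced df so that it runs on its own.

```r
df = data.frame(
  noBuy = c(3L, 5L, 1L, 1L, 3L),
  height = c(175, 168, 180, 181, 169),
  isMale = c(FALSE, TRUE, FALSE, TRUE, TRUE),
  name = c("Amy", "Bob", "Cindy", "Danny", "Edward"),
  stringsAsFactors = FALSE
)

df[c(1, 2), c(2, 3)]                  # position index: rows 1-2, columns 2-3
df[c(1, 2), ]                         # empty column index: all columns
df[df$height < 175 & df$isMale, ]     # conditional row index: Bob and Edward
df$name[df$height < 175 & df$isMale]  # $ picks a column, then index it as a vector

# subset() evaluates the condition inside df, so df$ is not needed
subset(df, height < 175 & isMale, c(name, noBuy))
```

The last two lines select the same rows. subset() is shorter to type, while the bracket form makes the row and column indexes explicit.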
🗿 QUIZ: Annotate each of the following code chunks with a remark describing what it does …
For example
[1] "Bob" "Danny" "Edward"
[1] "Danny"
[1] "Amy" "Cindy"
[1] 172.7
[1] 177.5
[1] 4
[1] 1
[1] 1