🌻 read.csv() - reads a CSV (comma-separated values) file

D = read.csv("data/comics1.csv")
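
As a minimal sketch of what read.csv() does, the snippet below writes a tiny CSV to a temporary file and reads it back. The file contents and the column names (name, align, sex) are made up for illustration and are not taken from data/comics1.csv.

```r
# Write a tiny CSV to a temporary file so the example runs anywhere.
# The columns (name, align, sex) are illustrative assumptions.
tmp <- tempfile(fileext = ".csv")
writeLines(c("name,align,sex",
             "Alpha,Good,Female",
             "Beta,Bad,Male",
             "Gamma,Neutral,Male"), tmp)

D_toy <- read.csv(tmp)   # the first line becomes the column names
str(D_toy)               # inspect the column types
nrow(D_toy)              # number of data rows
```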


5. Counts and Fractions

5.1 Obtain Counts and Fractions via Logical Test

🌻 sum() of a logical vector gives the number of TRUEs

[1] 3178


🌻 mean() of a logical vector gives the fraction of TRUEs

[1] 0.4383
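
The code behind these two results is not shown, so here is a small sketch of the idea on a toy vector. The vector `align` and its values are assumptions standing in for a column of the real data.

```r
# Toy stand-in for an alignment column; the name `align` is an assumption.
align <- c("Good", "Bad", "Good", "Neutral", "Bad", "Good", "Bad", "Good")

is_good <- align == "Good"   # logical vector: TRUE where the character is good
n_good  <- sum(is_good)      # TRUE counts as 1 and FALSE as 0, so this is a count
f_good  <- mean(is_good)     # the mean of 0/1 values is the fraction of TRUEs

n_good   # 4
f_good   # 0.5
```

The same logical test feeds both functions; only the summary changes.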


5.2 Tables and Proportionate Tables

🌻 table() lists and counts each distinct value of a categorical (factor or chr) variable


    Bad    Good Neutral 
   2915    3178    1157 


🌻 prop.table() converts counts into fractions


    Bad    Good Neutral 
 0.4021  0.4383  0.1596 
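
A short sketch of the two steps above, using a toy vector rather than the real data (the name `align` is an assumption):

```r
# Toy stand-in for an alignment column; the name `align` is an assumption.
align <- c("Good", "Bad", "Good", "Neutral", "Bad", "Good", "Bad", "Good")

counts    <- table(align)         # one count per distinct value
fractions <- prop.table(counts)   # each count divided by the total

counts      # Bad 3, Good 4, Neutral 1
fractions   # Bad 0.375, Good 0.5, Neutral 0.125
```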


What happens if we put two variables in table()?

         
           Female    Male
  Bad     0.09324 0.30883
  Good    0.17434 0.26400
  Neutral 0.05959 0.10000
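
With two variables, table() cross-tabulates them: the first becomes the rows and the second the columns. The sketch below shows this on a toy data frame; the column names `align` and `sex` are assumptions about how the real data is laid out.

```r
# Toy data frame; the column names `align` and `sex` are assumptions.
D <- data.frame(
  align = c("Good", "Bad", "Good", "Neutral", "Bad", "Good", "Bad", "Neutral"),
  sex   = c("Female", "Male", "Male", "Female", "Male", "Female", "Male", "Male"),
  stringsAsFactors = FALSE
)

tab <- table(D$align, D$sex)   # rows: alignment, columns: sex
tab                            # cell counts
prop.table(tab)                # every cell divided by the grand total
```

In the toy table, the Bad/Male cell holds 3 of the 8 characters, so its overall fraction is 0.375.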


❓ Check the online help of prop.table. What does the margin argument do?

         
           Female    Male
  Bad     0.09324 0.30883
  Good    0.17434 0.26400
  Neutral 0.05959 0.10000
         
          Female   Male
  Bad     0.2319 0.7681
  Good    0.3977 0.6023
  Neutral 0.3734 0.6266
         
          Female   Male
  Bad     0.2850 0.4590
  Good    0.5329 0.3924
  Neutral 0.1821 0.1486
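
The three tables above look like prop.table() called with no margin, with margin 1 and with margin 2. A sketch on toy data (column names `align` and `sex` are assumptions) makes the difference checkable:

```r
# Toy data frame; the column names `align` and `sex` are assumptions.
D <- data.frame(
  align = c("Good", "Bad", "Good", "Neutral", "Bad", "Good", "Bad", "Neutral"),
  sex   = c("Female", "Male", "Male", "Female", "Male", "Female", "Male", "Male"),
  stringsAsFactors = FALSE
)
tab <- table(D$align, D$sex)

overall <- prop.table(tab)      # no margin: the whole table sums to 1
by_row  <- prop.table(tab, 1)   # margin = 1: each row sums to 1
by_col  <- prop.table(tab, 2)   # margin = 2: each column sums to 1

rowSums(by_row)                 # all 1
colSums(by_col)                 # all 1
by_row["Bad", "Male"]           # fraction of bad characters that are male
by_col["Bad", "Male"]           # fraction of male characters that are bad
```

The last two lines read the same cell but answer different questions, which is the distinction behind questions 5 and 6 in the round that follows.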


🏆 Group Competition Round 1

1. How many bad characters do we have? 2915
2. How many bad male characters? 2239
3. What is the fraction of bad characters? 0.4021
4. What is the fraction of bad male characters? 0.3088
5. What fraction of male characters are bad? 0.4590
6. What fraction of bad characters are male? 0.7681
7. Which gender has more neutral characters? Male
8. Females are more likely than males to be neutral. TRUE, 0.1821 > 0.1486


6. Category and Group Operations

6.1 Statistics by Categories

Actually there is a better way to answer the last two questions above.

🌻 tapply(value, group, fun) applies fun to value within each distinct group

Female   Male 
   432    725 

Count the number of neutral characters by sex

Female   Male 
0.1821 0.1486 

Calculate the fraction of neutral characters by sex
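
A sketch of both tapply() calls on toy data, assuming columns named `align` and `sex` (the real column names are not shown in these notes):

```r
# Toy data frame; the column names `align` and `sex` are assumptions.
D <- data.frame(
  align = c("Good", "Bad", "Good", "Neutral", "Bad", "Good", "Bad", "Neutral"),
  sex   = c("Female", "Male", "Male", "Female", "Male", "Female", "Male", "Male"),
  stringsAsFactors = FALSE
)

is_neutral <- D$align == "Neutral"            # logical test, one value per character

n_neutral <- tapply(is_neutral, D$sex, sum)   # count of neutral characters per sex
f_neutral <- tapply(is_neutral, D$sex, mean)  # fraction of neutral characters per sex

n_neutral   # Female 1, Male 1
f_neutral   # Female 1/3, Male 0.2
```

The pattern is the same as sum() and mean() on a logical vector, except that tapply() repeats the calculation once per group.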


Let’s practice:

  • What is the fraction of females within each hair color ❓
   Bald   Black   Blond   Brown      No  others     Red   White 
0.05921 0.34754 0.46642 0.26921 0.09186 0.35211 0.54839 0.22350 
  • What is the number of females within each eye color ❓
 Black   Blue  Brown  Green others    Red  White Yellow 
   153    881    669    340    118     81     79     51 


🏆 Group Competition Round 2

1. What fraction of blue-eyed characters are male? 0.6443

 Black   Blue  Brown  Green others    Red  White Yellow 
0.7528 0.6443 0.6973 0.4880 0.6713 0.8085 0.7285 0.7536 

2. How many blue-eyed males are there? 1596

 Black   Blue  Brown  Green others    Red  White Yellow 
   466   1596   1541    324    241    342    212    156 

3. What fraction of male characters are alive? 0.7071

Female   Male 
0.7757 0.7071 

4. Which alignment is most likely to stay alive? Good

    Bad    Good Neutral 
 0.6816  0.7634  0.7571 

5. Which hair-color fraction for bad characters is wrong? Black 0.3042 (the table gives 0.30909)


   Bald   Black   Blond   Brown      No  others     Red   White 
0.05523 0.30909 0.12419 0.19485 0.10943 0.08508 0.05832 0.06381 

6. What are the most likely hair and eye colors for bad characters? Black hair and blue eyes


   Bald     Red   White  others      No   Blond   Brown   Black 
0.05523 0.05832 0.06381 0.08508 0.10943 0.12419 0.19485 0.30909 

 Yellow  others   White   Black     Red   Green   Brown    Blue 
0.04048 0.05146 0.05626 0.09640 0.09880 0.09914 0.27033 0.28714 
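
The sorted tables above combine three steps: index the rows of interest, tabulate, then sort. A hedged sketch on toy data follows; the column names `align` and `hair` are assumptions.

```r
# Toy data frame; the column names `align` and `hair` are assumptions.
D <- data.frame(
  align = c("Bad", "Bad", "Bad", "Bad", "Good", "Good", "Bad", "Neutral"),
  hair  = c("Black", "Black", "Brown", "Black", "Blond", "Black", "Blond", "Red"),
  stringsAsFactors = FALSE
)

bad_hair <- D$hair[D$align == "Bad"]        # hair colors of bad characters only
frac     <- prop.table(table(bad_hair))     # fraction of each color among them
sorted   <- sort(frac)                      # ascending, so the largest comes last

sorted
names(sorted)[length(sorted)]               # most common hair color: "Black"
```

Sorting in ascending order puts the most likely category at the right-hand end, as in the output above.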



🦋 WRAP UP

Take the question: what fraction of blue-eyed characters are male ❓ There are several ways to answer it.

[1] 0.6443
 FALSE   TRUE 
0.6876 0.6443 
 Black   Blue  Brown  Green others    Red  White Yellow 
0.7528 0.6443 0.6973 0.4880 0.6713 0.8085 0.7285 0.7536 
        
          Black   Blue  Brown  Green others    Red  White Yellow
  Female 0.2472 0.3557 0.3027 0.5120 0.3287 0.1915 0.2715 0.2464
  Male   0.7528 0.6443 0.6973 0.4880 0.6713 0.8085 0.7285 0.7536
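
The four outputs above are consistent with four different expressions that arrive at the same number. The sketch below reproduces that pattern on toy data; the column names `sex` and `eye` are assumptions, and the original calls may have differed.

```r
# Toy data frame; the column names `sex` and `eye` are assumptions.
D <- data.frame(
  sex = c("Female", "Male", "Male", "Female", "Male", "Female", "Male", "Male"),
  eye = c("Blue", "Brown", "Blue", "Green", "Blue", "Green", "Brown", "Blue"),
  stringsAsFactors = FALSE
)

# 1. Index the blue-eyed characters, then take the mean of a logical test.
a1 <- mean(D$sex[D$eye == "Blue"] == "Male")

# 2. Group by a logical test: the TRUE group is the blue-eyed characters.
a2 <- tapply(D$sex == "Male", D$eye == "Blue", mean)

# 3. Group by every eye color and read off the Blue entry.
a3 <- tapply(D$sex == "Male", D$eye, mean)

# 4. Column-wise proportions of the sex-by-eye table.
a4 <- prop.table(table(D$sex, D$eye), 2)

a1                    # 0.75
a2["TRUE"]            # 0.75
a3["Blue"]            # 0.75
a4["Male", "Blue"]    # 0.75
```

Each step down the list returns more context around the same answer, which is useful when comparing groups rather than looking up a single value.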

Which of the following statements are correct ❓

  1. Males are more likely to have red hair
  2. Redheads are more likely to be male
  3. We have more green-eyed females than females of any other eye color
  4. We have more green-eyed females than green-eyed males

🌷 It is not easy, is it?

  • There are many ways to answer the same question
  • table and tapply generate group summaries for comparison
  • Verbal descriptions can be ambiguous


🦋 KEY POINTS:

  • table and tapply are among the most powerful functions in R.
  • Combined with indexing, they can extract and compare virtually anything.
  • Practice and experience build fluency.