CMB2:

# Homework 4

20 April 2018. Deadline: Wednesday, 25 April, 9:00.

Epilepsy affects 1% of the world population. Thus, the probability that any one person has epilepsy is 0.01, which we can write in R as c(Yes=0.01, No=0.99). We can simulate n persons with the following code:

epilepsy <- function(n) {
  return(
    sample(c("Yes", "No"),
           size = n,
           replace = TRUE,
           prob = c(0.01, 0.99)
    )
  )
}
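As a quick sanity check, a much larger simulated group should contain close to 1% "Yes". The sketch below repeats the simulator so that it runs on its own; the seed and the group size of 100000 are arbitrary choices for illustration, not part of the assignment.

```r
# Same simulator as in the text, repeated so the block is self-contained.
epilepsy <- function(n) {
  sample(c("Yes", "No"), size = n, replace = TRUE, prob = c(0.01, 0.99))
}

set.seed(1)               # arbitrary seed, only for reproducibility
big <- epilepsy(100000)   # a much larger group than the course
mean(big == "Yes")        # fraction of "Yes"; should be close to 0.01
```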

There are 97 people in the course, including me. To simulate a group like ours, we use

epilepsy(97)
##  [1] "No"  "No"  "No"  "No"  "Yes" "No"  "No"  "No"  "No"  "No"  "No"
## [12] "No"  "No"  "No"  "No"  "No"  "No"  "No"  "No"  "No"  "No"  "No"
## [23] "No"  "No"  "No"  "No"  "No"  "No"  "No"  "No"  "No"  "No"  "No"
## [34] "No"  "No"  "No"  "No"  "No"  "No"  "No"  "No"  "No"  "No"  "No"
## [45] "No"  "No"  "Yes" "No"  "No"  "No"  "No"  "No"  "No"  "No"  "No"
## [56] "No"  "No"  "No"  "No"  "No"  "No"  "No"  "No"  "No"  "No"  "No"
## [67] "No"  "No"  "No"  "No"  "No"  "No"  "No"  "No"  "No"  "No"  "No"
## [78] "No"  "No"  "No"  "No"  "No"  "No"  "No"  "No"  "No"  "No"  "No"
## [89] "No"  "No"  "No"  "No"  "No"  "No"  "No"  "No"  "No"

We replicate the experiment to get 1000 samples of size 97:

courses <- replicate(1000, epilepsy(97))
dim(courses)
## [1]   97 1000
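The 97 by 1000 shape comes from how replicate() collects its results: when every repetition returns a vector of the same length, the vectors are placed side by side as columns. A minimal sketch with a made-up two-element vector (the name small is only for illustration):

```r
# Each repetition returns a character vector of length 2;
# replicate() binds the 3 results together as the columns of a matrix.
small <- replicate(3, c("a", "b"))

is.matrix(small)   # TRUE: the result is a matrix, not a list or data frame
dim(small)         # 2 rows (vector length), 3 columns (repetitions)
```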

Here courses is a matrix, not a data frame: each column is one simulated course of 97 people. To access its elements we have to use [row, col], not [[column]].
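The difference between the two kinds of indexing can be seen on a small matrix. The sketch below uses a hypothetical 3 by 2 matrix named toy as a stand-in for courses, so that the results are easy to check by eye.

```r
# Toy 3 x 2 character matrix standing in for `courses` (made-up data).
# matrix() fills column by column, so column 1 is "No", "Yes", "No".
toy <- matrix(c("No", "Yes", "No",
                "No", "No",  "No"),
              nrow = 3, ncol = 2)

toy[2, 1]   # one element: row 2, column 1
toy[, 1]    # every row of column 1, as a character vector
toy[1, ]    # every column of row 1

# On a data frame, [[1]] would give the whole first column.
# On a matrix, [[1]] gives only the first single element.
toy[[1]]
```

Leaving the row position empty, as in toy[, 1], is how a whole column is selected.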

We have a lot of data but very little information: we learn nothing from reading 97,000 words. We want numbers, not words. We need a function that

• takes one column of courses, and
• gives us an integer with the number of “Yes”
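One possible building block, shown on a single made-up vector rather than on courses: comparing a character vector with "Yes" gives a logical vector, and sum() counts its TRUE values. The helper name count_yes is hypothetical, and this is only a sketch of one approach, not the required solution.

```r
# Hypothetical helper: counts the "Yes" entries in one character vector.
count_yes <- function(x) {
  # x == "Yes" is a logical vector; sum() treats TRUE as 1 and FALSE as 0.
  sum(x == "Yes")
}

one_course <- c("No", "No", "Yes", "No", "Yes")  # made-up example data
count_yes(one_course)                            # an integer count
```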

Your mission is to write the code that produces a vector named cases of length 1000, containing the number of “Yes” in each column of courses. Because the simulation is random, your counts will differ, but if everything goes right you will get something similar to this:

table(cases)
## cases
##   0   1   2   3   4
## 356 376 190  69   9
barplot(table(cases))
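To judge whether a table like this one is plausible, the simulated counts can be compared with what theory predicts. The sketch below assumes that the number of "Yes" among 97 independent people follows a binomial distribution with size 97 and probability 0.01; that assumption matches how epilepsy() draws its sample, but it is not stated in the assignment itself.

```r
# Assumption: the count of "Yes" in one course is Binomial(97, 0.01).
k <- 0:4

# Expected number of courses, out of 1000, with exactly k cases.
expected <- 1000 * dbinom(k, size = 97, prob = 0.01)
names(expected) <- k

round(expected)   # compare these with the counts in table(cases)
```

The simulated counts will not match these values exactly, but they should be of similar size.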

