October 18th, 2016

Basic objects in R

In the previous chapter…

Objects in R:

  • There are several data types:
    • numeric, character, logical, factor
  • They are stored in one of many data structures
    • vectors
    • matrices
  • Each element can be accessed using indices
    • numeric vectors (positive or negative)
    • logical vectors
    • character vectors

Lists

Like vectors, but they can mix different kinds of elements

> people <- list(c(60, 72, 57, 90, 95, 72),
+                c(1.75, 1.80, 1.65, 1.90, 1.74, 1.91),
+                c("Ali", "Deniz", "Fatma", "Emre", "Volkan", "Onur"),
+                TRUE,
+                factor(c("M","F","F","M","M","M")))

Notice that the elements can have different lengths
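
The different lengths and types can be checked directly. A minimal sketch using the same list; the variable names elem_lengths and elem_classes are only illustrative:

```r
people <- list(c(60, 72, 57, 90, 95, 72),
               c(1.75, 1.80, 1.65, 1.90, 1.74, 1.91),
               c("Ali", "Deniz", "Fatma", "Emre", "Volkan", "Onur"),
               TRUE,
               factor(c("M","F","F","M","M","M")))

# Number of values stored in each element of the list
elem_lengths <- lengths(people)        # 6 6 6 1 6

# Class of each element: the list mixes several kinds of data
elem_classes <- sapply(people, class)  # "numeric" "numeric" "character" "logical" "factor"
```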

Result

> people
[[1]]
[1] 60 72 57 90 95 72

[[2]]
[1] 1.75 1.80 1.65 1.90 1.74 1.91

[[3]]
[1] "Ali"    "Deniz"  "Fatma"  "Emre"   "Volkan" "Onur"  

[[4]]
[1] TRUE

[[5]]
[1] M F F M M M
Levels: F M

Indexing Lists

  • Can be indexed the same way as vectors
  • Returns a sub-list
> people[1:2]
[[1]]
[1] 60 72 57 90 95 72

[[2]]
[1] 1.75 1.80 1.65 1.90 1.74 1.91

Elements of Lists

This is a sublist (with one element):

> people[1]
[[1]]
[1] 60 72 57 90 95 72

This is an element:

> people[[1]]
[1] 60 72 57 90 95 72
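
One way to see the difference is to ask R what each result is. A small sketch with a shortened version of the list; the names sub and elem are illustrative only:

```r
# Shortened list: only two of the elements from the example above
people <- list(c(60, 72, 57, 90, 95, 72), TRUE)

sub  <- people[1]     # single brackets: a list containing one element
elem <- people[[1]]   # double brackets: the numeric vector itself

is.list(sub)     # TRUE
is.list(elem)    # FALSE
is.numeric(elem) # TRUE
length(sub)      # 1 (one element in the sublist)
length(elem)     # 6 (six values in the vector)
```

Functions that expect a vector, such as mean(), should be given the element and not the sublist.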

Lists with Names

> people <- list(weight=c(60, 72, 57, 90, 95, 72),
+                height=c(1.75, 1.80, 1.65, 1.90, 1.74, 1.91),
+                names=c("Ali", "Deniz", "Fatma", "Emre", "Volkan", "Onur"),
+                valid=TRUE,
+                gender=factor(c("M","F","F","M","M","M")))

How else can we assign names?
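
One possible answer, sketched here with the unnamed list from before: the base function names() can also be used on the left side of an assignment.

```r
# Start from the list without names
people <- list(c(60, 72, 57, 90, 95, 72),
               c(1.75, 1.80, 1.65, 1.90, 1.74, 1.91),
               c("Ali", "Deniz", "Fatma", "Emre", "Volkan", "Onur"),
               TRUE,
               factor(c("M","F","F","M","M","M")))

# Assign all the names at once, in the same order as the elements
names(people) <- c("weight", "height", "names", "valid", "gender")

names(people)   # "weight" "height" "names" "valid" "gender"
people$weight   # elements can now be accessed by name
```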

Lists with Names

> people
$weight
[1] 60 72 57 90 95 72

$height
[1] 1.75 1.80 1.65 1.90 1.74 1.91

$names
[1] "Ali"    "Deniz"  "Fatma"  "Emre"   "Volkan" "Onur"  

$valid
[1] TRUE

$gender
[1] M F F M M M
Levels: F M

Indexing Lists with Names

  • Can be indexed the same way as vectors
  • Returns a sub-list
> people[1:2]
$weight
[1] 60 72 57 90 95 72

$height
[1] 1.75 1.80 1.65 1.90 1.74 1.91

Elements of Lists with Names

This is a sublist:

> people[1]
$weight
[1] 60 72 57 90 95 72

This is an element:

> people[[1]]
[1] 60 72 57 90 95 72

Accessing single elements

> people[[1]]
[1] 60 72 57 90 95 72
> people[["weight"]]
[1] 60 72 57 90 95 72
> people$weight
[1] 60 72 57 90 95 72

Changing parts of a List

Indices can also be used to change specific parts of a list.

Try each of the following and explain the result:

> people$names <- toupper(people$names)
> people$BMI <- people$weight/people$height^2
> people$valid <- NULL
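
The same three kinds of change can be seen on a smaller list. This is only a sketch: toy is a hypothetical list invented for illustration, not part of the example above.

```r
toy <- list(a = 1:3, b = "x")

toy$a[2] <- 10L      # change one value inside an existing element
toy[["c"]] <- TRUE   # assigning to a new name adds an element
toy$b <- NULL        # assigning NULL removes an element

toy$a        # 1 10 3
names(toy)   # "a" "c"
```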

Indexing Lists

  • List elements are accessed with [[ ]]
  • Sublists are extracted with [ ]

Try these

> people[[2]]
> people[2]
> people[[2]][3]
> people[2][3]
> people[[1:3]]
> people[1:3]
> people[["weight"]]
> people$weight
> people["weight"]

Result

> people[[2]]
[1] 1.75 1.80 1.65 1.90 1.74 1.91
> people[2]
$height
[1] 1.75 1.80 1.65 1.90 1.74 1.91

Result

> people[[2]][3]
[1] 1.65
> people[2][3]
$<NA>
NULL

Result

> people[[1:3]]
Error in people[[1:3]]: recursive indexing failed at level 2
> people[1:3]
$weight
[1] 60 72 57 90 95 72

$height
[1] 1.75 1.80 1.65 1.90 1.74 1.91

$names
[1] "ALI"    "DENIZ"  "FATMA"  "EMRE"   "VOLKAN" "ONUR"  

Result

> people[["weight"]]
[1] 60 72 57 90 95 72
> people$weight
[1] 60 72 57 90 95 72
> people["weight"]
$weight
[1] 60 72 57 90 95 72
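
The error from people[[1:3]] makes more sense once we know that [[ ]] with a vector of several indices works recursively: each index is applied to the result of the previous one. A short sketch with a hypothetical nested list (nested is not part of the example above):

```r
# A list whose elements are themselves lists
nested <- list(list("a", "b"), list("c"))

# A vector index inside [[ ]] goes one level deeper at each position
deep <- nested[[c(1, 2)]]    # first element, then its second element
step <- nested[[1]][[2]]     # the same thing, written step by step

deep                   # "b"
identical(deep, step)  # TRUE
```

In people, the first element is a plain numeric vector and not a list, so there is no deeper level to index into.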

Quiz

If key <- "names",

What is the difference between the following?

  • people[[key]]
  • people[[names]]
  • people$key
  • people$names

Explain

Exercise

Write a list with one element for each person, representing the name, weight, height and gender.

Data Frames

  • Two-dimensional structure, like matrices
  • Columns can have different types, but all must have the same length
  • All columns need a name
> ppl <- data.frame(weight=c(60, 72, 57, 90, 95, 72),
+                   height=c(1.75, 1.80, 1.65, 1.90, 1.74, 1.91),
+                   names=c("Ali", "Deniz", "Fatma", "Emre", "Volkan", "Onur"),
+                   gender=factor(c("M","F","F","M","M","M")))

Data Frame

> ppl
  weight height  names gender
1     60   1.75    Ali      M
2     72   1.80  Deniz      F
3     57   1.65  Fatma      F
4     90   1.90   Emre      M
5     95   1.74 Volkan      M
6     72   1.91   Onur      M

Each column is a vector

If df is a data.frame, then df[[1]] is a vector

  • All elements of a column have the same data type
  • Different columns may have different types
    • In a matrix, all columns have the same type
  • All columns have the same length
    • In a list the elements can have any size
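
These properties can be verified on ppl. A minimal sketch; the name bad and the small two-column data frame are only illustrative:

```r
ppl <- data.frame(weight=c(60, 72, 57, 90, 95, 72),
                  height=c(1.75, 1.80, 1.65, 1.90, 1.74, 1.91),
                  names=c("Ali", "Deniz", "Fatma", "Emre", "Volkan", "Onur"),
                  gender=factor(c("M","F","F","M","M","M")))

is.list(ppl)                      # TRUE: a data frame is a list of columns
is.numeric(ppl[[1]])              # TRUE: each column is a vector
identical(ppl[[1]], ppl$weight)   # TRUE: columns can be accessed like list elements
nrow(ppl)                         # 6
ncol(ppl)                         # 4

# Columns of different lengths are rejected (3 values against 2)
bad <- tryCatch(data.frame(a = 1:3, b = 1:2),
                error = function(e) "error")
bad                               # "error"
```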