Lists: Mixing different types of data

October 8, 2018

Digital Signature

a parenthesis

Do I have your file?

At the end of the exam you sent me a file

How can we verify that we have the same file?

How can we be sure that nobody changed it?

How can we be sure without showing the content of the file?

Digital signature

An answer to these questions is given by digital signatures

They are not digital pictures of a handwritten signature

Instead they are a unique number that identifies the exact document

This number is called a digest. It is produced by a cryptographic hash function

Cryptographic hash function

MD5 hash function

The input is a file (all the characters)
The output is the digest
The same input always produces the same digest
In practice, different inputs produce different digests
If the input changes, the digest changes
If the input changes a little, the digest changes a lot
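
These properties can be checked directly in R. The sketch below is only an illustration: it assumes the md5sum() function from the tools package that ships with R, and the file contents are made up for the example.

```r
# md5sum() comes with the 'tools' package, part of a standard R installation
library(tools)

# Two temporary files whose contents differ by a single character (made-up text)
file_a <- tempfile(fileext = ".txt")
file_b <- tempfile(fileext = ".txt")
writeLines("My answer to question 1 is 42", file_a)
writeLines("My answer to question 1 is 43", file_b)

# The digest is returned as a text string of 32 hexadecimal digits
digest_a <- unname(md5sum(file_a))
digest_b <- unname(md5sum(file_b))
digest_a
digest_b

# Same input, same digest
digest_a == unname(md5sum(file_a))

# A one-character change gives a different digest
digest_a == digest_b
```

Printing both digests side by side shows that they have very little in common, even though the files differ in one character.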

How do you validate the file?

Go to http://onlinemd5.com/ or any other service you find on Google

The computation is done on your computer. The file is not sent over the internet

You can take the file you attached, get the digest and compare it with the one I created

If they are the same we are sure that I have your file

And we do not need to show the content
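
The same comparison can be sketched in R instead of a web page. This is a minimal sketch under assumptions: has_digest() is a hypothetical helper written for this example, the file is a made-up stand-in for the exam file, and md5sum() comes from the tools package that ships with R.

```r
library(tools)  # provides md5sum()

# Hypothetical helper: TRUE when the file at 'path' has the expected MD5 digest
has_digest <- function(path, expected) {
  actual <- unname(md5sum(path))
  # Hexadecimal digits may be written in upper or lower case
  identical(tolower(actual), tolower(expected))
}

# Stand-in for the exam file (made-up content)
exam_file <- tempfile(fileext = ".R")
writeLines("x <- c(1, 2, 3)", exam_file)

# Only the digest is published, never the content of the file
published_digest <- unname(md5sum(exam_file))

# The file has not been touched, so the digests agree
unchanged <- has_digest(exam_file, published_digest)
unchanged

# Any later modification is detected
cat("# one more line\n", file = exam_file, append = TRUE)
modified <- has_digest(exam_file, published_digest)
modified
```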

Application: Intellectual Property

Imagine you are working on a project

You have an idea, a draft or some data that is confidential
You do not want to make it public (yet)
But you want to show that you have this document today

You can get the MD5 digest and publish it

in a newspaper
on Facebook
or anywhere you can look back and show the date and the digest

Basic objects in R

In the previous chapter…

Objects in R:

There are several data types:
- numeric, character, logical, factor
They are stored in one of many data structures
- For example: vectors
Each element can be accessed using indices
- numeric vectors (positive or negative)
- logical vectors
- character vectors
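
As a quick reminder, the data types can be inspected with class(). The values below are made up for the example; the last lines show what happens when different types are put in the same vector.

```r
# One small vector for each data type
weight <- c(60, 72, 57)                 # numeric
name   <- c("Ali", "Deniz", "Fatma")    # character
tall   <- c(TRUE, TRUE, FALSE)          # logical
gender <- factor(c("M", "F", "F"))      # factor

class(weight)
class(name)
class(tall)
class(gender)

# A vector holds a single type: mixing types converts everything to one of them
mixed <- c(60, "Ali", TRUE)
class(mixed)
mixed
```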

Indices

Indices allow us to see and modify parts of a vector
Indices can be
- positive integer vectors
- negative integer vectors
- logical vectors
- character vectors
Index vectors can be of length 1 or longer
- except logical indices, which should have the same length as the original vector
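
The four kinds of index can be tried on a small named vector. The values are made up for the example.

```r
weight <- c(Ali = 60, Deniz = 72, Fatma = 57, Emre = 90)

weight[c(1, 3)]                        # positive integers: keep positions 1 and 3
weight[-2]                             # negative integers: drop position 2
weight[c(TRUE, FALSE, FALSE, TRUE)]    # logical: keep positions marked TRUE
weight[weight > 70]                    # logical index built from a comparison
weight[c("Deniz", "Emre")]             # character: select by name

# Indices also work on the left side of an assignment
weight["Fatma"] <- 58
weight
```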

Lists

Like vectors, but they can mix different kinds of elements

people <- list(c(60, 72, 57, 90, 95, 72),
               c(1.75, 1.80, 1.65, 1.90, 1.74, 1.91),
               c("Ali", "Deniz", "Fatma", "Emre",
                 "Volkan", "Onur"),
               TRUE, c(2017, 10, 10),
               factor(c("M","F","F","M","M","M")))

Notice that elements can have different lengths
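
The lengths and types of the elements can be checked with a few standard functions. A minimal sketch, using a shortened version of the list so that the block runs on its own:

```r
people <- list(c(60, 72, 57, 90, 95, 72),
               c("Ali", "Deniz", "Fatma", "Emre", "Volkan", "Onur"),
               TRUE, c(2017, 10, 10))

length(people)           # number of elements in the list
lengths(people)          # length of each element
sapply(people, class)    # type of each element
```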

Result

people

[[1]]
[1] 60 72 57 90 95 72

[[2]]
[1] 1.75 1.80 1.65 1.90 1.74 1.91

[[3]]
[1] "Ali"    "Deniz"  "Fatma"  "Emre"   "Volkan" "Onur"  

[[4]]
[1] TRUE

[[5]]
[1] 2017   10   10

[[6]]
[1] M F F M M M
Levels: F M

Indexing Lists

Lists can be indexed in the same way as vectors
The result is a sublist

people[1:2]

[[1]]
[1] 60 72 57 90 95 72

[[2]]
[1] 1.75 1.80 1.65 1.90 1.74 1.91

Elements versus sublists

This is a sublist (with one element):

people[1]

[[1]]
[1] 60 72 57 90 95 72

This is an element:

people[[1]]

[1] 60 72 57 90 95 72
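
The difference matters as soon as the result is passed to a function. A minimal sketch with a two-element version of the list:

```r
people <- list(c(60, 72, 57, 90, 95, 72),
               c(1.75, 1.80, 1.65, 1.90, 1.74, 1.91))

class(people[1])      # "list": still a list, with one element
class(people[[1]])    # "numeric": the vector stored inside

length(people[1])     # 1, the sublist has one element
length(people[[1]])   # 6, the vector has six values

# Functions for vectors need the element, not the sublist
mean(people[[1]])
```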

List elements can have names

people <- list(weight=c(60, 72, 57, 90, 95, 72),
               height=c(1.75, 1.80, 1.65, 1.90, 1.74, 1.91),
               names=c("Ali", "Deniz", "Fatma", "Emre",
                       "Volkan", "Onur"),
               valid=TRUE, YMD=c(2017, 10, 10),
               gender=factor(c("M","F","F","M","M","M")))

How else can we assign names?
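
One possible answer is the names() function, which works on lists as it does on vectors. A minimal sketch with a shortened list; the name first_name is made up for the example.

```r
# A list created without names
people <- list(c(60, 72, 57),
               c(1.75, 1.80, 1.65),
               c("Ali", "Deniz", "Fatma"))
names(people)    # NULL: no names yet

# Assign all the names at once
names(people) <- c("weight", "height", "names")
names(people)

# Or change a single name
names(people)[3] <- "first_name"
people$first_name
```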

Lists with Names

people

$weight
[1] 60 72 57 90 95 72

$height
[1] 1.75 1.80 1.65 1.90 1.74 1.91

$names
[1] "Ali"    "Deniz"  "Fatma"  "Emre"   "Volkan" "Onur"  

$valid
[1] TRUE

$YMD
[1] 2017   10   10

$gender
[1] M F F M M M
Levels: F M

Indexing Lists with Names

Lists with names can also be indexed in the same way as vectors
The result is a sublist that keeps the names

people[1:2]

$weight
[1] 60 72 57 90 95 72

$height
[1] 1.75 1.80 1.65 1.90 1.74 1.91

Elements of Lists with Names

This is a sublist:

people[1]

$weight
[1] 60 72 57 90 95 72

This is an element:

people[[1]]

[1] 60 72 57 90 95 72

Accessing single elements

people[[1]]

[1] 60 72 57 90 95 72

people[["weight"]]

[1] 60 72 57 90 95 72

Shortcut to index a single element

people$weight

[1] 60 72 57 90 95 72

Changing parts of a List

Indices can also be used to change specific parts of a list.

For example we can update the names

people$names <- toupper(people$names)
people$names

[1] "ALI"    "DENIZ"  "FATMA"  "EMRE"   "VOLKAN" "ONUR"
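
Indices can be combined to change a single value inside an element. A minimal sketch with a shortened list; the new weights are made up for the example.

```r
people <- list(weight = c(60, 72, 57),
               names  = c("Ali", "Deniz", "Fatma"))

# Replace a whole element
people$names <- toupper(people$names)

# Replace one value inside an element: first pick the element, then the position
people$weight[2] <- 73
people[["weight"]][people$names == "FATMA"] <- 58

people$weight
```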

Deleting list elements

people$valid <- NULL
people$YMD <- NULL
people

$weight
[1] 60 72 57 90 95 72

$height
[1] 1.75 1.80 1.65 1.90 1.74 1.91

$names
[1] "ALI"    "DENIZ"  "FATMA"  "EMRE"   "VOLKAN" "ONUR"  

$gender
[1] M F F M M M
Levels: F M
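
Several elements can also be removed in one step by assigning NULL to a sublist. A minimal sketch with a shortened list:

```r
people <- list(weight = c(60, 72, 57),
               valid  = TRUE,
               YMD    = c(2017, 10, 10))
names(people)

# Remove two elements at once, using a character index
people[c("valid", "YMD")] <- NULL
names(people)
length(people)
```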

Adding new list elements

people$BMI <- people$weight/people$height^2
people

$weight
[1] 60 72 57 90 95 72

$height
[1] 1.75 1.80 1.65 1.90 1.74 1.91

$names
[1] "ALI"    "DENIZ"  "FATMA"  "EMRE"   "VOLKAN" "ONUR"  

$gender
[1] M F F M M M
Levels: F M

$BMI
[1] 19.59184 22.22222 20.93664 24.93075 31.37799 19.73630
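
Once the new element exists, it can be used to build indices for the other elements. A minimal sketch; the threshold of 25 is chosen only for the example.

```r
people <- list(weight = c(60, 72, 57, 90, 95, 72),
               height = c(1.75, 1.80, 1.65, 1.90, 1.74, 1.91),
               names  = c("ALI", "DENIZ", "FATMA", "EMRE", "VOLKAN", "ONUR"))
people$BMI <- people$weight / people$height^2

# Logical index: TRUE where the BMI is above 25 (threshold chosen for this example)
high <- people$BMI > 25
people$names[high]

# Rounded values are easier to read
round(people$BMI, 1)
```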

Indexing Lists

List elements are indexed by [[]]
Sublists are indexed by []

Try these

people[[2]]
people[2]
people[[2]][3]
people[2][3]
people[[1:3]]
people[1:3]
people[["weight"]]
people$weight
people["weight"]

Result

people[[2]]

[1] 1.75 1.80 1.65 1.90 1.74 1.91

people[2]

$height
[1] 1.75 1.80 1.65 1.90 1.74 1.91

Result

people[[2]][3]

[1] 1.65

people[2][3]

$<NA>
NULL

Result

people[[1:3]]

Error in people[[1:3]]: recursive indexing failed at level 2

people[1:3]

$weight
[1] 60 72 57 90 95 72

$height
[1] 1.75 1.80 1.65 1.90 1.74 1.91

$names
[1] "ALI"    "DENIZ"  "FATMA"  "EMRE"   "VOLKAN" "ONUR"
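
The error for people[[1:3]] comes from the way [[ ]] treats a vector with more than one value: it goes one level deeper for each value. A minimal sketch with a shortened list; tryCatch() is used here only so that the failing line does not stop the script.

```r
people <- list(weight = c(60, 72, 57, 90, 95, 72),
               height = c(1.75, 1.80, 1.65, 1.90, 1.74, 1.91))

# A vector inside [[ ]] is applied one level at a time:
# element 2 of the list, then position 3 inside that element
people[[c(2, 3)]]
people[[2]][[3]]    # the same value, written step by step

# With 1:3, R would have to go three levels deep, but the weights are
# plain numbers with nothing inside them, so it stops with an error
result <- tryCatch(people[[1:3]], error = function(e) "error")
result
```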

Result

people[["weight"]]

[1] 60 72 57 90 95 72

people$weight

[1] 60 72 57 90 95 72

people["weight"]

$weight
[1] 60 72 57 90 95 72

Quiz

If key <- "names",

What is the difference between the following?

people[[key]]
people[[names]]
people$key
people$names

Explain
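
One way to explore the quiz is to run each expression and look at what comes back. A minimal sketch with a shortened list; tryCatch() is used here only so that an expression that fails does not stop the script.

```r
people <- list(weight = c(60, 72, 57),
               names  = c("Ali", "Deniz", "Fatma"))
key <- "names"

people[[key]]     # uses the value stored in key, that is, "names"
people$names      # uses the literal name written after $
people$key        # looks for an element literally called "key"

# Without quotes, names is not text: it refers to the R function names()
outcome <- tryCatch(people[[names]],
                    error = function(e) "this expression fails")
outcome
```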