Class 10: Logic and Text vectors

Computing for Molecular Biology 1

Andrés Aravena, PhD

9 November 2020

Computers handle numbers

Inside the computer, everything is stored as a number

But we can represent other things using numbers

There are other useful types of data

Numbers are not the only thing we will handle
R has four basic data types:

  • Numeric
  • Logic
  • Character
  • Factors

We have already spoken about numeric data
Here we will talk about other data types

Logic values

Try these expressions and see the result

You can try these two commands on your computer

2<3
[1] TRUE
2>3
[1] FALSE

The results are two keywords: TRUE and FALSE

All uppercase, no quotation marks

These are logical values

A logical value is an answer to a “yes or no” question

Storing logic values in variables

We can assign logical values, as we did with numbers.

a <- TRUE
a
[1] TRUE
b <- FALSE
b
[1] FALSE

Old values are lost forever

Each assignment overwrites the old value

The old values of a and b are forgotten

Comparisons

Most of the time we do not assign TRUE or FALSE directly

Instead, we get logical values as the result of a comparison

1 < 2
[1] TRUE
1 > 2
[1] FALSE

Non-strict comparisons

In this case we ask if one value is less than or equal to the other

1 <= 2
[1] TRUE
1 >= 2
[1] FALSE

The symbol <= means “less than or equal to” (≤)

The symbol >= means “greater than or equal to” (≥)
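
The difference between strict and non-strict comparisons only shows when both sides are equal. A minimal sketch, using the value 2 on both sides as an arbitrary example (the variable names are made up for illustration):

```r
# Strict comparison: 2 is not less than itself
strict <- 2 < 2
strict
# [1] FALSE

# Non-strict comparisons accept equality
less_or_equal <- 2 <= 2
less_or_equal
# [1] TRUE
greater_or_equal <- 2 >= 2
greater_or_equal
# [1] TRUE
```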

Are the same or different?

The symbols == and != correspond to = and ≠

1 == 2
[1] FALSE
1 != 2
[1] TRUE

Important: we use == to compare two numbers
We are asking a question

It should not be confused with a single =, which is used for assignment and works like <-

Use == to ask if things are equal

The following code is correct

1 == 2

It asks “are these two numbers equal?”
The answer is FALSE

On the other hand, this code is wrong

1 = 2

It says “let 1 be equal to 2”. That does not make sense

Is “this” one of “those”?

The special operator %in% asks if a value is one of the elements of a vector

1 %in% c(3, 2, 1, 0)
[1] TRUE

This corresponds to the question \[\text{Does } 1 \in \{3, 2, 1, 0\}\text{ ?}\]
It is a question about sets
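
The left side of %in% can also be a vector. In that case we get one answer for each element. A small sketch, where the name `candidates` and its values are invented for illustration:

```r
candidates <- c(1, 5, 0)
# One question per element: is each candidate in the set?
found <- candidates %in% c(3, 2, 1, 0)
found
# [1]  TRUE FALSE  TRUE
```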

Exercise

If we have these two variables

a <- 4
b <- 5

What is the result of these commands?

a == b
a = b

Priority

What do we do first here?
Shall we add or compare?

1 + 4 == 2 + 3

Comparisons come after addition and subtraction

Thus, the code is equivalent to

5 == 5

The PEMDAS operation order is

  • Parentheses
  • Exponents
  • Multiplications and Divisions
  • Additions and Subtractions
  • Comparisons
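
One way to check this order is to write the parentheses explicitly and compare with the version without them. This is only a sketch with arbitrary numbers, and the variable names are made up:

```r
# Without parentheses: arithmetic first, comparison last
implicit <- 1 + 4 == 2 + 3
# With explicit parentheses: same meaning
explicit <- (1 + 4) == (2 + 3)
implicit
# [1] TRUE
explicit
# [1] TRUE

# Multiplication also goes before the comparison
product_first <- 2 * 3 > 5
product_first
# [1] TRUE
```

When in doubt, writing the parentheses makes the intention clear to the reader.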

Logic vectors

Creating logic vectors

We can make logic vectors like we created numeric vectors

c(TRUE, TRUE, FALSE, TRUE)
[1]  TRUE  TRUE FALSE  TRUE
rep(FALSE, 7)
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE

Abbreviation of TRUE and FALSE

  • T is short for TRUE
  • F is short for FALSE

So these two commands give the same result

c(TRUE, TRUE, FALSE, TRUE)
[1]  TRUE  TRUE FALSE  TRUE
c(T, T, F, T)
[1]  TRUE  TRUE FALSE  TRUE

T and F are for lazy people

Using T instead of TRUE saves some time

But it is not so clear when you read it

Moreover, someone may assign a value to T
(it may be temperature or time)

T <- 0

Then things do not work

It is better to use TRUE and FALSE
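
The following sketch shows what can go wrong. It assumes someone stored a temperature in T. The names `broken` and `safe` are made up for illustration:

```r
# Someone stores a temperature in T
T <- 0
# Now T is a number, so the vector is no longer a logic vector
broken <- c(T, T, F, T)
broken
# [1] 0 0 0 0

# TRUE and FALSE are reserved words, so they keep working
safe <- c(TRUE, TRUE, FALSE, TRUE)
safe
# [1]  TRUE  TRUE FALSE  TRUE

# Removing our variable makes T mean TRUE again
rm(T)
```

R does not allow assigning a value to TRUE or FALSE, so they cannot be overwritten this way.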

Comparing vectors and values

The most important way to make a logic vector is to compare a numeric vector against a fixed value

a <- c(3, 10, 2, 8, 6, 9, 1, 7, 5, 4)
a
 [1]  3 10  2  8  6  9  1  7  5  4
a <= 5
 [1]  TRUE FALSE  TRUE FALSE FALSE FALSE  TRUE FALSE  TRUE  TRUE

Assigning the result into a variable

Compare a to a fixed value, and store the result

x <- a <= 5
x
 [1]  TRUE FALSE  TRUE FALSE FALSE FALSE  TRUE FALSE  TRUE  TRUE

The comparison is done first

The assignment is done later

Vector x is a logic vector

Comparing two vectors

If two vectors have the same size, we can compare them

a
 [1]  3 10  2  8  6  9  1  7  5  4
b
 [1] 10  5  3  8  1  4  6  9  7  2
a <= b
 [1]  TRUE FALSE  TRUE  TRUE FALSE FALSE  TRUE  TRUE  TRUE FALSE

The comparison is done element by element
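
The same rule applies to == and the other comparison operators. A sketch reusing the values of a and b from this slide, with `same` as a made-up name:

```r
a <- c(3, 10, 2, 8, 6, 9, 1, 7, 5, 4)
b <- c(10, 5, 3, 8, 1, 4, 6, 9, 7, 2)
# Compare first with first, second with second, and so on
same <- a == b
same
# [1] FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE
```

Only the fourth position holds the same value (8) in both vectors.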

AND, OR, NOT

Operations with logic vectors

Take two vectors

These are all combinations of TRUE and FALSE

x <- c(TRUE, TRUE, FALSE, FALSE)
x
[1]  TRUE  TRUE FALSE FALSE
y <- c(TRUE, FALSE, TRUE, FALSE)
y
[1]  TRUE FALSE  TRUE FALSE

Combine with AND: x & y

The result is TRUE when both values are TRUE

x
[1]  TRUE  TRUE FALSE FALSE
y
[1]  TRUE FALSE  TRUE FALSE
x & y
[1]  TRUE FALSE FALSE FALSE

Combine with OR: x | y

The result is TRUE when any value is TRUE

x
[1]  TRUE  TRUE FALSE FALSE
y
[1]  TRUE FALSE  TRUE FALSE
x | y
[1]  TRUE  TRUE  TRUE FALSE

Combine with NOT: !x

The result is TRUE when the element is FALSE

x
[1]  TRUE  TRUE FALSE FALSE
!x
[1] FALSE FALSE  TRUE  TRUE

Priority of logic operations

Use parentheses to give priority to one operation

! ( x & y)
[1] FALSE  TRUE  TRUE  TRUE
!x & y
[1] FALSE FALSE  TRUE FALSE
!x & !y
[1] FALSE FALSE FALSE  TRUE

They are all different
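
In practice, the vectors combined with & and | usually come from comparisons. A sketch, assuming we want to know which elements of a are between 3 and 7 (the limits and the names `inside` and `outside` are arbitrary):

```r
a <- c(3, 10, 2, 8, 6, 9, 1, 7, 5, 4)
# TRUE where both conditions hold
inside <- a >= 3 & a <= 7
inside
# [1]  TRUE FALSE FALSE FALSE  TRUE FALSE FALSE  TRUE  TRUE  TRUE

# Negating the whole condition requires parentheses
outside <- !(a >= 3 & a <= 7)
# It gives the same answer as asking "too small OR too large"
all(outside == (a < 3 | a > 7))
# [1] TRUE
```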

Functions over Logic Vectors

Functions that return logic values

x
[1]  TRUE  TRUE FALSE FALSE

any(x) is TRUE if any element of x is TRUE

any(x)
[1] TRUE

all(x) is TRUE if all elements of x are TRUE

all(x)
[1] FALSE

Functions that return numbers

x
[1]  TRUE  TRUE FALSE FALSE

length(x) is the number of elements in x (size)

length(x)
[1] 4

sum(x) counts the elements of x that are TRUE

sum(x)
[1] 2
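
Since sum() counts TRUE values, combining it with a comparison answers the question "how many elements satisfy the condition?". A sketch reusing the numeric vector a from the earlier slides. The other names are made up:

```r
a <- c(3, 10, 2, 8, 6, 9, 1, 7, 5, 4)

# How many elements are less than or equal to 5?
n_small <- sum(a <= 5)
n_small
# [1] 5

# Is any element greater than 9?
has_big <- any(a > 9)
has_big
# [1] TRUE

# Are all elements positive?
all_positive <- all(a > 0)
all_positive
# [1] TRUE
```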

Character values

that is, text

Handling text

Text is a string of characters (letters, symbols)

  • names, places
  • identifiers (ID)
    • ORCID, Student Number
  • nucleotides, amino-acids
  • file names, URL (web address)
  • organism’s name, locus
  • experimental condition

Text is wrapped in quotation marks

Each text must be between single or double quotes

"alpha"
[1] "alpha"
'beta'
[1] "beta"

You can use either ' or ", but you have to be consistent: the closing quote must be the same as the opening one

Writing quotes inside quotes

Use single quotes inside double quotes, and vice-versa

"I don't know"
[1] "I don't know"
'he said "yes"' 
[1] "he said \"yes\""

Notice that " inside " is written as \"

Some special characters are coded with two symbols:
\", \\, \n, \t
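
The two-symbol codes are only a way of writing the character. Inside the text, each one counts as a single character. A sketch using cat(), a base R function not shown in these slides, which prints the text itself instead of its quoted form:

```r
quote_text <- 'he said "yes"'
# print() shows the escaped form
print(quote_text)
# [1] "he said \"yes\""
# cat() shows the real characters
cat(quote_text, "\n")

# \n (new line) and \t (tab) are one character each
n_newline <- nchar("\n")
n_newline
# [1] 1
n_tab <- nchar("a\tb")
n_tab
# [1] 3
```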

Similar but not the same

They may look similar, but they are different

  • "TRUE" is not TRUE

  • "123" is not 123

Quotation marks always indicate text

No quotation marks, no text

You can store digits in a character variable, but R will treat them as text
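
We can ask R what kind of value we have. This sketch uses the base R functions is.character(), is.numeric(), identical() and as.numeric(), which are not shown elsewhere in these slides:

```r
# With quotes it is text, without quotes it is a number
text_check <- is.character("123")
text_check
# [1] TRUE
number_check <- is.numeric(123)
number_check
# [1] TRUE

# Text and logical values are different things
same_value <- identical("TRUE", TRUE)
same_value
# [1] FALSE

# as.numeric() converts text made of digits into a number
converted <- as.numeric("123") + 1
converted
# [1] 124
```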

When should you use numbers?

  • Does it make sense to add the values?
  • Is there an average?
  • Should we do arithmetic on the values?

When a number is not a number

  • Ages can be averaged, and it makes sense
  • Weight can be averaged, and it makes sense
  • Student numbers are never added
    • Moreover, student numbers start with 0
    • They are labels, whose symbols are digits

Store age and weight as numbers

Store student_number as text
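
A sketch of why text is the safer choice for labels. The student numbers below are invented for illustration:

```r
# Hypothetical student numbers, kept as text
student_number <- c("0123456", "0456789")
digits <- nchar(student_number)
digits
# [1] 7 7

# Converting to a number loses the leading zero
as_number <- as.numeric("0123456")
as_number
# [1] 123456
```

As text, every digit of the label is kept exactly as written.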

Character vectors

Same idea as numbers and logic

Concatenation (using c()) creates a vector

people <- c("Ali", "Deniz", "Fatma", "Emre", "Volkan", "Onur")
people
[1] "Ali"    "Deniz"  "Fatma"  "Emre"   "Volkan" "Onur"  

Functions over Character Vectors

Functions that return numbers

people
[1] "Ali"    "Deniz"  "Fatma"  "Emre"   "Volkan" "Onur"  

length(x) is the number of elements in x (size)

length(people)
[1] 6

nchar(x) is the number of characters in each element

nchar(people)
[1] 3 5 5 4 6 4

Functions that return text vectors

substr() gives part of each element

substr(people, start=1, stop=3)
[1] "Ali" "Den" "Fat" "Emr" "Vol" "Onu"

We must give the first and the last position of each substring

initial <- substr(people, start=1, stop=1)
initial
[1] "A" "D" "F" "E" "V" "O"
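
The same function works on sequences. A sketch with made-up DNA fragments (the vector `dna` and its values are invented for illustration), taking the first three letters of each one:

```r
dna <- c("ATGGCCAAG", "ATGTTTGGA", "GTGAAACCC")
# Positions 1 to 3 are the first codon
first_codon <- substr(dna, start = 1, stop = 3)
first_codon
# [1] "ATG" "ATG" "GTG"

# Comparing text gives a logic vector
starts_with_atg <- first_codon == "ATG"
starts_with_atg
# [1]  TRUE  TRUE FALSE
```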

Pasting vectors

We can paste two or more vectors and get a new one

paste(people, initial)
[1] "Ali A"    "Deniz D"  "Fatma F"  "Emre E"   "Volkan V" "Onur O"  

Useful to combine given name and family name.

We can also collapse all elements into a single text

paste(people, collapse=", ")
[1] "Ali, Deniz, Fatma, Emre, Volkan, Onur"
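
By default paste() puts a space between the parts. The sep argument changes that separator. A sketch with invented sample labels:

```r
# Default separator is a space
with_space <- paste("sample", c(1, 2, 3))
with_space
# [1] "sample 1" "sample 2" "sample 3"

# The sep argument changes the separator
with_underscore <- paste("sample", c(1, 2, 3), sep = "_")
with_underscore
# [1] "sample_1" "sample_2" "sample_3"
```

Notice that the single text "sample" is repeated for each number, and that the numbers are converted to text.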

Summary

  • Numbers represent things that we count or measure
  • Logic values represent yes or no answers
    • Is the plant alive?
    • Is the sample a positive control?
    • Did the student deliver the homework?
  • Character values represent names or identifiers
    • Student name
    • Student number