either numeric, logic, or text
The variables are decided before doing the experiment
The observations are found during the experiment
It is hard to add new columns in a text file
But it is very easy to add rows
Therefore we write observations as rows,
and variables as columns
One observation on each row
One variable on each column
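A minimal sketch of this layout in R, with made-up values (the names toy, new_obs and the numbers are only illustrative, not taken from the survey). Each row is one observation, each column is one variable, and a new observation is appended as a new row:

```r
# Toy table with made-up values: one row per observation, one column per variable
toy <- data.frame(
  id        = c("a01", "a02"),
  height_cm = c(170, 162),
  weight_kg = c(65, 58),
  stringsAsFactors = FALSE
)

# A new observation arrives during the experiment: we append one row
new_obs <- data.frame(id = "a03", height_cm = 181, weight_kg = 77,
                      stringsAsFactors = FALSE)
toy <- rbind(toy, new_obs)

print(toy)
```

The columns were fixed when the table was designed; only the number of rows grew.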
Data enters the computer from instruments
Most modern instruments have digital output
In some cases it has to be entered manually
This is dangerous: humans make many mistakes
For us, data always comes from another program
There are several file formats used to store data tables
The most common are
For now, we work with tab- and comma-separated values
Today we will use data from
http://www.dry-lab.org/static/2020/cmb1/students2018-2020.tsv
Take a look at it
What can you say about it?
The classical way to read this data is using
Environment → Import Dataset → From text (base)
which corresponds to the command
(you can load data with the menu or the keyboard)
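As a hedged sketch: the base import menu typically generates a call to read.delim(), which reads tab-separated text into a data frame. The exact arguments RStudio writes may differ. Here a small temporary file with made-up rows stands in for the downloaded file, so the example runs on its own:

```r
# Write a tiny tab-separated file with made-up rows (stand-in for the real file)
path <- tempfile(fileext = ".tsv")
writeLines(c(
  "id\tsex\theight_cm",
  "x1\tFemale\t165",
  "x2\tMale\t180"
), path)

# read.delim() expects a header line and tab-separated columns by default
survey <- read.delim(path, stringsAsFactors = FALSE)

str(survey)
```

With the real file, path would be the location where you saved students2018-2020.tsv.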
    answer_date     id                              english_level    sex
1    2018-09-17 3e501d                       I can speak fluently   Male
2    2018-09-17 479d88  I can understand movies without subtitles Female
3    2018-09-17 39df0d I can read and understand technical papers Female
4    2018-09-17 d2b091 I can read and understand technical papers   Male
5    2018-09-17 f22b12 I can read and understand technical papers Female
6    2018-09-17 849c75                       İngilizce bilmiyorum Female
7    2018-09-17 83812b                       I can speak fluently Female
8    2018-09-17 b0dde9 I can read and understand technical papers   Male
9    2018-09-17 297223 I can read and understand technical papers Female
10   2018-09-17 72c073 I can read and understand technical papers Female
11   2018-09-17 d29251 I can read and understand technical papers   Male
12   2018-09-17 6f0831 I can read and understand technical papers Female
13   2018-09-17 75b355 I can read and understand technical papers Female
14   2018-09-17 0b0da7 I can read and understand technical papers Female
15   2018-09-17 352b9f I can read and understand technical papers Female
16   2018-09-17 6f28ac I can read and understand technical papers Female
17   2018-09-17 ee5ef4 I can read and understand technical papers Female
18   2018-09-17 ba52ec I can read and understand technical papers   Male
19   2018-09-17 9d98b6 I can read and understand technical papers Female
20   2018-09-17 f92274                       I can speak fluently Female
21   2018-09-17 1c7531 I can read and understand technical papers Female
22   2018-09-17 8c9730  I can understand movies without subtitles   Male
23   2018-09-18 371f15 I can read and understand technical papers Female
24   2018-09-18 52766e I can read and understand technical papers Female
25   2018-09-18 644c22 I can read and understand technical papers Female
26   2018-09-18 df8cf1 I can read and understand technical papers Female
27   2018-09-18 c0bd32  I can understand movies without subtitles Female
28   2018-09-19 ddbc78                       İngilizce bilmiyorum Female
29   2018-09-19 6c394f  I can understand movies without subtitles   Male
30   2018-09-19 9fb139                       İngilizce bilmiyorum Female
31   2018-09-20 70bd4d I can write poetry better than Shakespeare   Male
32   2018-09-20 567104 I can read and understand technical papers Female
33   2018-09-20 b2571a I can read and understand technical papers Female
34   2018-09-20 dcc268 I can read and understand technical papers   Male
35   2018-09-20 ac1b6f  I can understand movies without subtitles   Male
36   2018-09-20 89cd86                       I can speak fluently   Male
37   2018-09-20 ba5f4b I can read and understand technical papers Female
38   2018-09-20 ba5f4b I can read and understand technical papers Female
39   2018-09-21 b45951                       İngilizce bilmiyorum   Male
40   2018-09-21 c6208d I can read and understand technical papers   Male
41   2018-09-23 412ea2  I can understand movies without subtitles Female
42   2018-09-24 b741bc I can read and understand technical papers Female
43   2018-09-24 715173 I can read and understand technical papers Female
44   2018-09-24 bc23db I can read and understand technical papers   Male
45   2018-09-24 e9d1f5 I can read and understand technical papers   Male
46   2018-09-24 08d7a1              English is my native language Female
47   2018-09-24 08d7a1              English is my native language Female
48   2018-09-24 219959  I can understand movies without subtitles Female
49   2018-09-24 383ce5                       İngilizce bilmiyorum Female
50   2018-09-24 7b5198                       I can speak fluently Female
51   2018-09-24 68efdf I can read and understand technical papers Female
52   2018-09-24 7afb3f                       İngilizce bilmiyorum   Male
53   2018-09-24 cbda9b I can read and understand technical papers   Male
54   2018-09-24 3a597c                       I can speak fluently   Male
55   2018-09-24 cd7205 I can read and understand technical papers   Male
56   2018-09-24 dcaf3d  I can understand movies without subtitles   Male
57   2018-09-24 dcaf3d  I can understand movies without subtitles   Male
58   2018-09-29 70de11 I can read and understand technical papers Female
59   2018-10-04 b43e2b I can read and understand technical papers   Male
60   2018-10-06 3b85c4  I can understand movies without subtitles Female
61   2018-10-08 6961a2  I can understand movies without subtitles   Male
62   2018-10-09 0dd83b I can read and understand technical papers   <NA>
63   2018-10-11 213231                       I can speak fluently Female
64   2018-10-11 998d64                       İngilizce bilmiyorum   Male
65   2018-10-15 008c4d  I can understand movies without subtitles   Male
66   2018-11-07 7955ff                       I can speak fluently   Male
67   2018-11-09 a896b2 I can read and understand technical papers Female
68   2019-09-25 b2571a I can read and understand technical papers Female
69   2019-09-27 68a1cf                       İngilizce bilmiyorum Female
70   2019-09-27 dbf5bc I can read and understand technical papers Female
71   2019-09-29 a7ff02                       İngilizce bilmiyorum Female
72   2019-10-01 cbda9b I can read and understand technical papers   Male
73   2019-10-07 3a597c                       I can speak fluently   Male
74   2019-10-09 213231                       I can speak fluently Female
75   2019-10-09 1e2e83  I can understand movies without subtitles   Male
76   2019-10-11 a45fe6                       İngilizce bilmiyorum Female
77   2019-10-14 6961a2  I can understand movies without subtitles   Male
78   2019-10-14 7b5198                       I can speak fluently Female
79   2019-10-14 68efdf I can read and understand technical papers Female
80   2019-10-15 08d7a1              English is my native language Female
81   2020-10-19 70f3de                       I can speak fluently Female
82   2020-10-19 b81bd1  I can understand movies without subtitles Female
83   2020-10-19 692637  I can understand movies without subtitles Female
84   2020-10-19 42c891              English is my native language   Male
85   2020-10-19 242bf7  I can understand movies without subtitles Female
86   2020-10-19 cd7205                       I can speak fluently   Male
87   2020-10-19 f8d60d                       I can speak fluently Female
88   2020-10-19 47e2e0 I can read and understand technical papers Female
89   2020-10-19 50988d I can read and understand technical papers Female
90   2020-10-19 60a92f I can read and understand technical papers Female
91   2020-10-19 432cf7                       I can speak fluently   Male
92   2020-10-19 9bba74 I can read and understand technical papers Female
93   2020-10-19 a7ff02 I can read and understand technical papers Female
94   2020-10-19 5012ed I can read and understand technical papers   Male
95   2020-10-19 91e5e8  I can understand movies without subtitles Female
96   2020-10-19 fe26f8  I can understand movies without subtitles Female
97   2020-10-19 4f5875                       I can speak fluently Female
98   2020-10-19 52b150  I can understand movies without subtitles Female
99   2020-10-21 d29251 I can read and understand technical papers   Male
100  2020-10-21 849c75                       İngilizce bilmiyorum Female
101  2020-10-21 c9a95d I can read and understand technical papers Female
102  2020-10-21 2f4b15 I can read and understand technical papers Female
103  2020-10-22 3fe6b5 I can read and understand technical papers Female
104  2020-10-22 412ea2  I can understand movies without subtitles Female
105  2020-10-23 a45fe6 I can read and understand technical papers Female
106  2020-10-23 287c3a  I can understand movies without subtitles Female
107  2020-10-24 6961a2  I can understand movies without subtitles   Male
108  2020-10-24 6961a2  I can understand movies without subtitles   Male
109  2020-10-26 6e5137                       I can speak fluently Female
110  2020-10-26 3a597c                       I can speak fluently   Male
111  2020-10-26 f5dafd I can read and understand technical papers Female
112  2020-11-05 242bf7  I can understand movies without subtitles Female
113  2020-11-05 91e5e8 I can read and understand technical papers Female
114  2020-11-05 60a92f I can read and understand technical papers Female
115  2020-11-05 b041ba  I can understand movies without subtitles   Male
116  2020-11-06 c9b8b1                       İngilizce bilmiyorum Female
117  2020-11-06 68a1cf I can read and understand technical papers Female
     birthdate             birthplace height_cm weight_kg handness hand_span
1   1993-02-01                 turkey    179.00      67.0    Right      15.0
2   1998-05-21          Kahramanmaraş      1.68      55.0    Right      14.0
3   1998-01-18        Batman, Türkiye        NA        NA    Right      18.0
4   1998-08-29         Antalya,Turkey    170.00      74.0    Right      25.0
5   1998-05-03                  izmir    162.00      68.0    Right      13.0
6   1995-10-09       Türkiye / Yalova    167.00      58.0    Right      18.0
7   1997-09-19        Adıyaman,Turkey    174.00      72.0    Right      16.0
8   1997-11-27                  Bursa    180.00      68.0    Right      19.0
9   1999-01-02       İstanbul/Türkiye    162.00      58.0    Right      19.0
10  1998-10-02        İstanbul,Turkey    172.00      55.0    Right      20.0
11  1997-05-18             VAN/TURKEY    181.00      81.0    Right      20.0
12  1997-12-08                   <NA>        NA        NA    Right      20.0
13  1997-10-13           Sümeyye Onat    155.00      42.5    Right      20.0
14  1998-02-03               Istanbul        NA        NA    Right      30.0
15  1998-06-10               İstanbul      1.59      69.0    Right      18.0
16  1998-05-17        Samsun, Türkiye    165.00      58.0    Right      19.0
17  1997-07-07          Mardin,Turkey    166.00      47.0    Right      20.0
18  1998-10-13       gaziantep turkey    182.00      78.0    Right      21.0
19  1998-06-09        İstanbul,Turkey    158.00      57.0    Right      19.0
20  2018-09-03        Yıldırım, BURSA      1.64      55.0    Right      20.0
21  1998-09-17        Istanbul/Turkey    173.00      55.0    Right       8.0
22  1998-07-28         Bursa / TURKEY    185.00      65.0     Left      22.0
23  1998-08-17                 Yalova    163.00      60.0    Right      15.0
24  1998-03-24            Ordu Turkey    167.00      50.0    Right      30.0
25  2018-04-24       Istanbul, Turkey        NA        NA    Right      19.0
26  1997-10-13               İstanbul    171.00      52.0    Right      25.0
27  1997-05-18        Edirne, Türkiye    165.00      54.0    Right      18.0
28  1997-01-14       Malatya, Türkiye    162.00      75.0     Left      18.0
29  1997-06-25                   <NA>    188.00     105.0    Right      20.0
30  1995-01-28  Türkiye/Hatay/Antakya      1.70      56.0     Left      18.0
31  2018-12-08               istanbul        NA        NA    Right      20.0
32  1997-07-03                  Çorum    160.00      50.0    Right      15.0
33  1996-01-04               İstanbul        NA        NA     Left      15.0
34  1997-01-05           Muğla/Turkey    178.00      67.0    Right      24.0
35  1997-12-26                   City    176.00      59.0    Right      24.0
36  1998-10-31       Istanbul, TURKEY    184.00      75.0    Right      22.0
37  1991-01-01                 Suriye    160.00      60.0    Right      19.0
38  1991-01-01                 Suriye    160.00      60.0    Right      19.0
39  1998-01-10        Yıldırım, Bursa    175.00     106.0    Right      15.0
40  1992-08-11          Malatya/Turky      1.80      94.0    Right      25.0
41  1999-05-02              Balıkesir    165.00      63.0     Left      17.0
42  1997-07-29       Istanbul/Türkiye      1.60      54.0    Right      20.0
43  1998-02-05  Nakhchivan/Azerbaijan      1.57      53.0    Right      20.0
44  1998-11-19             Azerbaijan    175.00      75.0    Right      20.0
45  1997-02-09           Sivas,Turkey    183.00      70.0    Right      20.0
46  1997-06-30                 Ankara    158.00      65.0    Right       8.0
47  1997-06-30                 Ankara    158.00      65.0    Right       8.0
48  1998-09-03                 Samsun    174.00      55.0    Right      22.0
49  1998-11-16          Adana,türkiye    163.00      68.0    Right      13.0
50  1999-05-23     Almaty, Kazakhstan    178.00      55.0    Right      12.0
51  1998-04-07               istanbul    165.00        NA    Right       9.0
52  1997-05-01        Antalya/Türkiye    173.00      80.0    Right      16.0
53  1996-09-26           Hatay/Turkey    175.00      77.0    Right      18.0
54  1993-03-14      Tekirdag / Turkey    195.00      85.0    Right      30.0
55  1997-12-06                 turkey    166.00      65.0    Right      15.0
56  1998-11-06           İzmir-Turkey    163.00      64.0    Right      15.0
57  1998-11-06           İzmir-Turkey    163.00      64.0    Right      15.0
58  1998-09-01             Van,Turkey    174.00      60.0    Right      24.0
59  2018-01-15          Bursa,türkiye    175.00      76.0    Right      20.0
60  1996-04-05         Tunceli,Turkey    173.00      56.0    Right      21.0
61  1994-01-01                 Aleppo        NA      78.0    Right      25.0
62        <NA>                   <NA>        NA        NA    Right      22.0
63        <NA>                   <NA>        NA        NA    Right      17.0
64  1996-03-09               İstanbul    177.00      77.0    Right      23.0
65  1996-10-25     Safranbolu/KARABUK    181.00      72.0     Left      26.0
66  1994-01-05                   <NA>        NA        NA    Right      25.0
67  1998-04-18               İstanbul    165.00      58.0    Right      20.5
68  1996-01-04               İstanbul        NA        NA     Left      20.0
69  1995-03-26                 YALOVA    168.00      66.0    Right      18.0
70  1994-08-18    Edremit (Balıkesir)      1.64      52.0    Right      19.0
71  1997-03-23           Turkmenistan    179.00        NA    Right      18.0
72  1996-09-26          Hatay/Antakya    175.00      73.0    Right      20.0
73  1993-03-14        Tekirdağ/Turkey    195.00      82.0    Right      25.0
74  2019-06-06               İstanbul    160.00      55.0    Right      17.0
75  1996-10-25               İstanbul    180.00      86.0    Right      23.0
76  1997-02-03                  Sivas    161.00      63.0    Right      18.0
77  1994-01-01           Aleppo/Syria    183.00      85.0    Right      22.0
78  1999-05-23     Almaty, Kazakhstan    178.00      58.0    Right      21.0
79  1998-04-07               istanbul    165.00      65.0    Right      20.0
80  1997-06-30                 Ankara    158.00      65.0    Right      14.0
81  2000-11-07         Konya, Türkiye    165.00      70.0    Right      18.0
82  2001-12-25         Afyon, Türkiye    169.00        NA    Right      21.0
83  1999-05-23       Antalya, Türkiye    167.00      47.0    Right      20.0
84  1994-01-05               tekirdag      1.80      82.0    Right      21.0
85  2001-11-01      İstanbul, Türkiye    162.00      70.0     Left      16.0
86  1997-06-12             Kırklareli    169.00      75.0    Right      20.0
87  1998-02-20          Aydın, Turkey    165.00      47.0    Right      21.0
88  1997-07-24        İstanbul,Turkey    168.00      72.0    Right      21.0
89  2000-12-28      Hannover, Germany    171.00        NA    Right      18.0
90  1998-12-28       Istanbul/ Turkey    171.00      61.0    Right      21.0
91  2001-07-04         Mersin, Turkey    184.00      79.0    Right      25.0
92  2000-01-22          TÜrkiye/Bursa    165.00      55.0    Right      14.0
93  1997-03-23           Turkmenistan    179.00        NA    Right      21.0
94  1999-10-29           Bodrum/Muğla    180.00      74.0     Left      23.0
95  2000-07-26 Afyonkarahisar, Turkey    164.00      47.0    Right      19.0
96  2000-04-15       Istanbul/ Turkey    156.00      54.0    Right      15.0
97  1998-01-21               Istanbul        NA        NA    Right      19.0
98  2000-12-06            Ordu/Turkey      1.63      60.0     Left      19.0
99  1997-05-18           VAN / TURKEY    183.00      74.0    Right      19.5
100 1995-10-09     OSMANGAZİ, TÜRKİYE    167.00      56.0    Right      17.0
101 1996-08-14         Manisa/ Turkey        NA        NA    Right      18.0
102 1998-08-02       Turkey /İstanbul      1.75      65.0    Right      20.0
103 1999-03-21       Istanbul, Turkey    162.00      49.0    Right      17.0
104 1999-05-02                 Turkey    168.00      63.0     Left      18.0
105 1997-02-03           Sivas,Turkey    161.00      65.0    Right      18.0
106 1999-06-22      İstanbul, Türkiye    165.00      47.0    Right      18.0
107 1994-01-01               istanbul    184.00      90.0    Right      23.0
108 1994-01-01               istanbul    184.00      90.0    Right      23.0
109 2001-08-01       Istanbul/ Turkey    162.00      76.0    Right      24.0
110 1993-03-14       Tekirdağ, Turkey    195.00      88.0    Right      24.0
111 1977-03-08               İstanbul    167.00      80.0    Right      22.0
112 2001-11-01       İstanbul/Türkiye    162.00      72.0     Left      16.0
113 2000-07-26 Afyonkarahisar, Turkey    164.00      47.0    Right      19.0
114 1998-12-28               İstanbul    171.00      61.0    Right      21.0
115 1991-11-15               Istanbul    192.00      95.0    Right      26.0
116 1996-01-18        istanbul,turkey    168.00      67.0    Right      21.0
117 1995-03-26       YALOVA / TÜRKİYE    168.00      80.0    Right      15.0
Two-dimensional structures
Each column can be of a different type
All columns have the same length
All columns need a name
Usually too big to print
How can we see survey?
In RStudio we can use the command View(survey)
But this does not work in R Markdown,
so we cannot use it in a paper or report
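One common alternative that also works in R Markdown is head(), which returns only the first rows (six by default). A sketch with a made-up data frame called toy, since the real survey is not loaded here:

```r
# Made-up data frame with 20 rows, standing in for survey
toy <- data.frame(id = 1:20, value = (1:20)^2)

first_six   <- head(toy)        # first 6 rows by default
first_three <- head(toy, n = 3) # first 3 rows

print(first_six)
```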
  answer_date     id                              english_level    sex
1  2018-09-17 3e501d                       I can speak fluently   Male
2  2018-09-17 479d88  I can understand movies without subtitles Female
3  2018-09-17 39df0d I can read and understand technical papers Female
4  2018-09-17 d2b091 I can read and understand technical papers   Male
5  2018-09-17 f22b12 I can read and understand technical papers Female
6  2018-09-17 849c75                       İngilizce bilmiyorum Female
   birthdate       birthplace height_cm weight_kg handness hand_span
1 1993-02-01           turkey    179.00        67    Right        15
2 1998-05-21    Kahramanmaraş      1.68        55    Right        14
3 1998-01-18  Batman, Türkiye        NA        NA    Right        18
4 1998-08-29   Antalya,Turkey    170.00        74    Right        25
5 1998-05-03            izmir    162.00        68    Right        13
6 1995-10-09 Türkiye / Yalova    167.00        58    Right        18
Notice that there are too many columns
One basic question we need to answer is how many observations are in our data frame
In other words, we want to know the number of rows
Use the command nrow(survey)
[1] 117
We also want to know the number of columns, using ncol(survey)
[1] 10
Together, the numbers of rows and columns are called the dimensions, given by dim(survey)
[1] 117  10
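These three questions map to the base R functions nrow(), ncol() and dim(). A small sketch on a made-up data frame (toy is an illustrative name):

```r
# Made-up data frame: 4 observations, 3 variables of different types
toy <- data.frame(
  x = 1:4,
  y = c("a", "b", "c", "d"),
  z = c(TRUE, FALSE, TRUE, NA)
)

n_rows <- nrow(toy)  # number of rows (observations)
n_cols <- ncol(toy)  # number of columns (variables)
size   <- dim(toy)   # both at once: rows first, then columns

print(size)
```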
Each column represents a variable
The column name is the name of the variable
 [1] "answer_date"   "id"            "english_level" "sex"          
 [5] "birthdate"     "birthplace"    "height_cm"     "weight_kg"    
 [9] "handness"      "hand_span"    
You can use $ to get the vector in each column, for example survey$weight_kg
  [1]  67.0  55.0    NA  74.0  68.0  58.0  72.0  68.0  58.0  55.0  81.0    NA
 [13]  42.5    NA  69.0  58.0  47.0  78.0  57.0  55.0  55.0  65.0  60.0  50.0
 [25]    NA  52.0  54.0  75.0 105.0  56.0    NA  50.0    NA  67.0  59.0  75.0
 [37]  60.0  60.0 106.0  94.0  63.0  54.0  53.0  75.0  70.0  65.0  65.0  55.0
 [49]  68.0  55.0    NA  80.0  77.0  85.0  65.0  64.0  64.0  60.0  76.0  56.0
 [61]  78.0    NA    NA  77.0  72.0    NA  58.0    NA  66.0  52.0    NA  73.0
 [73]  82.0  55.0  86.0  63.0  85.0  58.0  65.0  65.0  70.0    NA  47.0  82.0
 [85]  70.0  75.0  47.0  72.0    NA  61.0  79.0  55.0    NA  74.0  47.0  54.0
 [97]    NA  60.0  74.0  56.0    NA  65.0  49.0  63.0  65.0  47.0  90.0  90.0
[109]  76.0  88.0  80.0  72.0  47.0  61.0  95.0  67.0  80.0
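A sketch of both ideas on a made-up data frame: names() gives the column names, and $ extracts one column as a vector. The values are invented. The NA stands for a missing answer, as in the output above:

```r
# Made-up data frame with one missing weight
toy <- data.frame(
  id        = c("a01", "a02", "a03"),
  weight_kg = c(65, NA, 77),
  stringsAsFactors = FALSE
)

col_names <- names(toy)      # column names, as a character vector
weights   <- toy$weight_kg   # the whole column, as a numeric vector

# Most summaries return NA if any value is missing, unless we skip them
avg_weight <- mean(weights, na.rm = TRUE)
print(avg_weight)
```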
This data is real, and belongs to you
To use it here, we deleted some of your personal data
It does not show your name, email or student number
Instead, there is an id column, unique to each person
[1] "3e501d" "479d88" "39df0d" "d2b091" "f22b12" "849c75"
The id column was created using a digital signature
(we discuss them in class 14)
The same id is always the same person, but privacy is preserved
This is one step towards a blind analysis
It is essential to keep patients' data anonymous
And to avoid researcher bias
As with vectors, we want to choose which parts to see
We can use logical values to filter the rows
For example, we may want to know about left-handed people attending our course this year
We can do this
    answer_date     id                              english_level    sex
85   2020-10-19 242bf7  I can understand movies without subtitles Female
94   2020-10-19 5012ed I can read and understand technical papers   Male
98   2020-10-19 52b150  I can understand movies without subtitles Female
104  2020-10-22 412ea2  I can understand movies without subtitles Female
112  2020-11-05 242bf7  I can understand movies without subtitles Female
     birthdate        birthplace height_cm weight_kg handness hand_span
85  2001-11-01 İstanbul, Türkiye    162.00        70     Left        16
94  1999-10-29      Bodrum/Muğla    180.00        74     Left        23
98  2000-12-06       Ordu/Turkey      1.63        60     Left        19
104 1999-05-02            Turkey    168.00        63     Left        18
112 2001-11-01  İstanbul/Türkiye    162.00        72     Left        16
Here we do not need to write survey$
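A hedged sketch of two base R ways to do this kind of filtering, on a made-up data frame. This is an assumption about the approach: the original command is not shown here. Logical indexing needs the toy$ prefix, while subset() looks up the column names inside the data frame:

```r
# Made-up data frame standing in for survey
toy <- data.frame(
  answer_date = as.Date(c("2019-10-01", "2020-10-19", "2020-10-21")),
  handness    = c("Left", "Left", "Right"),
  height_cm   = c(175, 162, 183),
  stringsAsFactors = FALSE
)

# Logical vector: TRUE for rows that satisfy both conditions
keep <- toy$handness == "Left" & toy$answer_date > as.Date("2020-01-01")

# Use it as a row index (note the comma: rows first, then columns)
by_index <- toy[keep, ]

# subset() evaluates the condition inside the data frame, so no toy$ is needed
by_subset <- subset(toy, handness == "Left" & answer_date > as.Date("2020-01-01"))

print(by_subset)
```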
but…
In recent years, people have improved data frames to make them easier to use
The new version is called tibble

The easiest way to load data is to use the menu
Environment → Import Dataset → From Text (readr)…
── Column specification ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
cols(
  answer_date = col_date(format = ""),
  id = col_character(),
  english_level = col_character(),
  sex = col_character(),
  birthdate = col_date(format = ""),
  birthplace = col_character(),
  height_cm = col_double(),
  weight_kg = col_double(),
  handness = col_character(),
  hand_span = col_double()
)
(we will explain library(readr) later)
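As a sketch, the readr import menu typically produces a call to read_tsv(). The arguments RStudio generates may differ. Here a temporary file with made-up rows stands in for the real one, and the column types are given explicitly instead of being guessed:

```r
library(readr)

# Tiny tab-separated file with made-up rows (stand-in for the real file)
path <- tempfile(fileext = ".tsv")
writeLines(c(
  "answer_date\tid\theight_cm",
  "2020-10-19\tx1\t165",
  "2020-10-21\tx2\t180"
), path)

# col_types = "Dcd": one letter per column (Date, character, double)
students <- read_tsv(path, col_types = "Dcd")

students
```

The result is a tibble, and the date column is already stored as a date.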
# A tibble: 117 x 10
   answer_date id    english_level sex   birthdate  birthplace height_cm
   <date>      <chr> <chr>         <chr> <date>     <chr>          <dbl>
 1 2018-09-17  3e50… I can speak … Male  1993-02-01 turkey        179   
 2 2018-09-17  479d… I can unders… Fema… 1998-05-21 Kahramanm…      1.68
 3 2018-09-17  39df… I can read a… Fema… 1998-01-18 Batman, T…     NA   
 4 2018-09-17  d2b0… I can read a… Male  1998-08-29 Antalya,T…    170   
 5 2018-09-17  f22b… I can read a… Fema… 1998-05-03 izmir         162   
 6 2018-09-17  849c… İngilizce bi… Fema… 1995-10-09 Türkiye /…    167   
 7 2018-09-17  8381… I can speak … Fema… 1997-09-19 Adıyaman,…    174   
 8 2018-09-17  b0dd… I can read a… Male  1997-11-27 Bursa         180   
 9 2018-09-17  2972… I can read a… Fema… 1999-01-02 İstanbul/…    162   
10 2018-09-17  72c0… I can read a… Fema… 1998-10-02 İstanbul,…    172   
# … with 107 more rows, and 3 more variables: weight_kg <dbl>, handness <chr>,
#   hand_span <dbl>
This is much easier to read
The commands dim(), nrow() and ncol() work on tibbles as on data frames
[1] 117  10
[1] 117
[1] 10
As before, we can ask for column names
 [1] "answer_date"   "id"            "english_level" "sex"          
 [5] "birthdate"     "birthplace"    "height_cm"     "weight_kg"    
 [9] "handness"      "hand_span"    
Each column can be accessed by its name
 Left Right 
   12   105 
What is the height of left-handed people?
To answer this question, we need new tools
Let’s get new tools for our R
library(readr)?
Remember how we read data from the file
Now we will explain library(readr):
We use it to enable the read_tsv() command
Out of the box, your R system has many commands
But there are more commands that you can also use
The new commands are in packages or libraries
To enable a package, we use the command library()

library() with installed packages
If you click on the package name, you can see its commands
To use them, write library(package name)
You need to do this once in every session
What if you need more packages?
If the package is not in your computer,
you need to use install.packages()
This command downloads new packages from the web
We install only one time
We load every time we need them
You can use the menu Packages → Install

To work with tibbles we need to install several packages
This set of packages is called tidyverse
In the command line, you write install.packages("tidyverse")
This command will download all the packages
and store them in your computer
You only need to do this one time.
We will use several packages from tidyverse
There is a lot of free material online
Read it. Watch it
Today we use only the dplyr package
dplyr package
Attaching package: 'dplyr'
The following objects are masked from 'package:stats':
    filter, lag
The following objects are masked from 'package:base':
    intersect, setdiff, setequal, union
Do not pay attention to the warning messages
We will deal with them later
We can easily choose the relevant rows
# A tibble: 5 x 10
  answer_date id    english_level sex   birthdate  birthplace height_cm
  <date>      <chr> <chr>         <chr> <date>     <chr>          <dbl>
1 2020-10-19  242b… I can unders… Fema… 2001-11-01 İstanbul,…    162   
2 2020-10-19  5012… I can read a… Male  1999-10-29 Bodrum/Mu…    180   
3 2020-10-19  52b1… I can unders… Fema… 2000-12-06 Ordu/Turk…      1.63
4 2020-10-22  412e… I can unders… Fema… 1999-05-02 Turkey        168   
5 2020-11-05  242b… I can unders… Fema… 2001-11-01 İstanbul/…    162   
# … with 3 more variables: weight_kg <dbl>, handness <chr>, hand_span <dbl>
(notice that we use == for comparisons)
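A minimal sketch of filter() from dplyr on a made-up tibble (toy and recent_left are illustrative names). The comparison is written with ==, and the date is converted explicitly with as.Date() to make the comparison unambiguous:

```r
library(dplyr)

# Made-up tibble standing in for students
toy <- tibble(
  answer_date = as.Date(c("2019-10-01", "2020-10-19", "2020-10-21")),
  handness    = c("Left", "Left", "Right"),
  height_cm   = c(175, 162, 183)
)

# Keep only rows where both conditions are TRUE
recent_left <- filter(toy, handness == "Left" & answer_date > as.Date("2020-01-01"))

recent_left
```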
# A tibble: 117 x 2
   weight_kg height_cm
       <dbl>     <dbl>
 1        67    179   
 2        55      1.68
 3        NA     NA   
 4        74    170   
 5        68    162   
 6        58    167   
 7        72    174   
 8        68    180   
 9        58    162   
10        55    172   
# … with 107 more rows
We can use the result of this comparison as a row index
left_handed <- filter(students, handness=="Left" & answer_date > "2020-01-01")
select(left_handed, answer_date, weight_kg, height_cm)
# A tibble: 5 x 3
  answer_date weight_kg height_cm
  <date>          <dbl>     <dbl>
1 2020-10-19         70    162   
2 2020-10-19         74    180   
3 2020-10-19         60      1.63
4 2020-10-22         63    168   
5 2020-11-05         72    162   
Normally we use <- for assignment
There is another way that is sometimes nicer
The -> arrow goes from the value to the variable
filter(students, handness=="Left" & answer_date > "2020-01-01") -> left_handed 
select(left_handed, answer_date, weight_kg, height_cm)
# A tibble: 5 x 3
  answer_date weight_kg height_cm
  <date>          <dbl>     <dbl>
1 2020-10-19         70    162   
2 2020-10-19         74    180   
3 2020-10-19         60      1.63
4 2020-10-22         63    168   
5 2020-11-05         72    162   
left_handed is an intermediate variable
We use it only for one step. We don’t need it at the end
filter(students, handness=="Left" & answer_date > "2020-01-01") %>% select(answer_date, weight_kg, height_cm)
# A tibble: 5 x 3
  answer_date weight_kg height_cm
  <date>          <dbl>     <dbl>
1 2020-10-19         70    162   
2 2020-10-19         74    180   
3 2020-10-19         60      1.63
4 2020-10-22         63    168   
5 2020-11-05         72    162   
The key element is %>%, called the pipe
filter(students, handness=="Left" & answer_date > "2020-01-01") %>% 
    select(answer_date, weight_kg, height_cm)
# A tibble: 5 x 3
  answer_date weight_kg height_cm
  <date>          <dbl>     <dbl>
1 2020-10-19         70    162   
2 2020-10-19         74    180   
3 2020-10-19         60      1.63
4 2020-10-22         63    168   
5 2020-11-05         72    162   
If you write %>% at the end of the line, you can continue in the next line
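A sketch comparing the two styles on a made-up tibble (all names and values are illustrative). The version with an intermediate variable and the piped version give the same result:

```r
library(dplyr)

# Made-up tibble standing in for students
toy <- tibble(
  answer_date = as.Date(c("2019-10-01", "2020-10-19", "2020-10-21")),
  handness    = c("Left", "Left", "Right"),
  weight_kg   = c(73, 70, 74),
  height_cm   = c(175, 162, 183)
)

# Step by step, with an intermediate variable
left_toy     <- filter(toy, handness == "Left")
step_by_step <- select(left_toy, answer_date, height_cm)

# Same steps with the pipe: the left side becomes the first argument
piped <- toy %>%
  filter(handness == "Left") %>%
  select(answer_date, height_cm)

piped
```

Ending each line with %>% tells R that the command continues on the next line.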
The %>% symbol helps us write clear code.
Instead of
we write
The first function input is taken from the pipe
Instead of
we write
We can read %>% as “then”
“Take x, then calculate sine, then square root, then take the smallest of the result and z, and store it in y”
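The sentence above can be sketched as follows. The values of x and z are made up, and this is one possible reading of the sentence, not necessarily the original slide code:

```r
library(magrittr)  # provides %>%; it is also available after library(dplyr)

x <- pi / 2
z <- 2

# Nested calls: we must read from the inside out
y_nested <- min(sqrt(sin(x)), z)

# Piped calls: we read from left to right, "then, then, then"
x %>% sin() %>% sqrt() %>% min(z) -> y

print(y)
```

Each function receives the value coming from the pipe as its first input, so min(z) here means min(value, z).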
The package providing pipes is called magrittr
Why?
Tell me in the next class
(no writing necessary)