December 19, 2019

## This course is easy

### if you do the correct things

• First, you need to understand the question in your own language
• Make a drawing
• Decompose the problem into smaller parts
• You can eat an elephant, piece by piece
• Translate each part into the computer's language
• In this case, awk

## How many countries on each continent?

awk '$2=="americas" {n_americas++}
$2=="africa" {n_africa++}
$2=="asia" {n_asia++}
$2=="europe" {n_europe++}
END {print "americas", n_americas;
print "africa", n_africa;
print "asia", n_asia;
print "europe", n_europe;
}' world2017.txt

## Arrays make this easier

awk '$2=="americas" {n["americas"]++}
$2=="africa" {n["africa"]++}
$2=="asia" {n["asia"]++}
$2=="europe" {n["europe"]++}
END {print "americas", n["americas"];
print "africa", n["africa"];
print "asia", n["asia"];
print "europe", n["europe"];
}' world2017.txt

## $2 is the continent

awk '{n[$2]++}
END {print "americas", n["americas"];
print "africa", n["africa"];
print "asia", n["asia"];
print "europe", n["europe"];
}' world2017.txt

## Repeat commands using for

awk '{n[$2]++}
END {for(continent in n) {
print continent, n[continent]
}
}' world2017.txt

Notice that the output may not be in any particular order: awk does not specify the order in which for (key in array) visits the keys
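If a stable order is needed, one option is to pipe the result through `sort`. The sketch below assumes the continent is in column 2, as above. The sample rows are invented for illustration and stand in for `world2017.txt`, and the function name `count_by_continent` is hypothetical.

```shell
# Count rows per continent, then sort the result alphabetically.
# Reads the files given as arguments, or standard input if there are none.
count_by_continent() {
  awk '{ n[$2]++ }
       END { for (continent in n) print continent, n[continent] }' "$@" | sort
}

# Toy input (made-up rows): country in column 1, continent in column 2.
printf '%s\n' \
  'chile americas' \
  'kenya africa' \
  'peru americas' \
  'japan asia' | count_by_continent
```

On this sample it prints `africa 1`, `americas 2` and `asia 1`, in that order. With the real file the call would be `count_by_continent world2017.txt`.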

## Parts of an array

An array contains several elements

Each element is a pair: a key and a value

We can access a value using []

We write the key inside the []

We can read the value or write a new one

array[key] =  value
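A minimal sketch of reading and writing array elements. The keys and numbers are arbitrary and chosen only for illustration, and `awk_array_demo` is a hypothetical name. The program uses only a BEGIN block, so it needs no input file.

```shell
awk_array_demo() {
  awk 'BEGIN {
    n["asia"] = 10        # write: store the value 10 under the key "asia"
    n["asia"]++           # read the current value, add 1, write it back
    n["europe"]++         # a key never used before starts empty, so ++ gives 1
    print "asia", n["asia"]       # read: prints asia 11
    print "europe", n["europe"]   # read: prints europe 1
  }'
}
awk_array_demo
```

This is why `n[$2]++` works without declaring the continents first: the first time a key appears, its value counts up from zero.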

## Exercise: Frequency table

### histogram

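Using the int() function we can truncate the income per capita to a whole number (int() drops the decimal part; it does not round to the nearest integer)

What is the absolute frequency of each income level (in thousands of dollars)?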

0 5
1 20
2 17
3 6
4 8
5 7
6 4
7 8
8 5
9 3
10 8
11 5
12 3
13 3
14 4
15 5
16 5
17 3
18 1
19 5
20 1
21 2
22 2
23 5
24 1
25 3
26 2
27 1
28 1
29 1
30 2
31 2
32 2
33 1
34 1
37 3
39 1
40 1
41 2
42 3
43 2
44 2
45 2
49 1
51 1
56 1
62 2
74 1
78 1
79 1
91 1
123 1
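One possible way to build such a table, sketched under assumptions. The slides do not say which column of `world2017.txt` holds income per capita, so `col=3` below is a guess to adjust. The sample rows are invented, and `income_histogram` is a hypothetical name. Since `int()` drops the fractional part, `int($col / 1000)` gives the income level in thousands of dollars, and that number is used as the array key.

```shell
income_histogram() {
  # col=3 is an ASSUMED column number for income per capita (in dollars).
  awk -v col=3 '
    { freq[int($col / 1000)]++ }              # e.g. 1999 -> 1, 12500 -> 12
    END { for (k in freq) print k, freq[k] }  # visiting order is unspecified
  ' "$@" | sort -n                            # so sort numerically by level
}

# Toy input (made-up rows): country, continent, income per capita.
printf '%s\n' \
  'aaa africa 450' \
  'bbb asia 1999' \
  'ccc europe 1200' \
  'ddd americas 12500' | income_histogram
```

On this sample it prints `0 1`, `1 2` and `12 1`. Levels with no countries do not appear at all, which matches the gaps in the table above (for example, there is no row for 35).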