September 20th, 2018

## What is a computer?

• Is a general purpose device
• that can read, process and write numbers
• (and things that can be represented by numbers)
• to and from the memory
• following a program stored also in the memory
• in many simple steps
• Changing the program changes the purpose of the machine

## Hardware and Software

Since old times, physical tools have been called hardware

That includes all the physical parts of the computer (what you can kick)

Programs determine the function of the computer, but they are not “physical”.

That is software (what you can only insult)

## Biological analogy

All cell components are hardware

The sequence of the DNA is the software

## CPU

The processor or central processing unit (“CPU”) is the brains of the computer

• does arithmetic,
• moves data around,
• controls the operation of the other parts
• can decide what to do next based on the previous results

The CPU can do only a few things, but it does them very fast
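A minimal sketch of these ideas in Python, not a description of any real CPU. The three instructions (`HALT`, `ADD`, `JUMP_IF_NOT_ZERO`), their numeric codes and the memory layout are invented for illustration. The program and its data live in the same memory, and the machine repeats a few simple steps.

```python
# Hypothetical instruction codes for a toy machine.
HALT, ADD, JUMP_IF_NOT_ZERO = 0, 1, 2

def run(memory):
    """Run the program stored in memory, starting at box 0."""
    pc = 0  # position of the next instruction
    while memory[pc] != HALT:
        op, a, b = memory[pc], memory[pc + 1], memory[pc + 2]
        if op == ADD:
            # arithmetic: add the content of box b to box a
            memory[a] += memory[b]
            pc += 3
        elif op == JUMP_IF_NOT_ZERO:
            # decide what to do next based on a previous result
            pc = b if memory[a] != 0 else pc + 3
        else:
            raise ValueError(f"unknown instruction {op}")
    return memory

memory = [
    ADD, 12, 13,              # boxes 0-2:  result = result + value
    ADD, 14, 15,              # boxes 3-5:  counter = counter - 1
    JUMP_IF_NOT_ZERO, 14, 0,  # boxes 6-8:  repeat while counter is not 0
    HALT, 0, 0,               # boxes 9-11: stop
    0,                        # box 12: result
    6,                        # box 13: value
    4,                        # box 14: counter
    -1,                       # box 15: the constant -1
]

run(memory)
print(memory[12])  # 24, that is, 6 added 4 times
```

Writing different numbers in boxes 0 to 11 gives a different program, so the same machine does a different job.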

## RAM: random access memory

The primary memory or random access memory

• stores information that is in active use
• the data that the CPU is currently working on,
• the instructions that tell the CPU what to do
• its contents can be changed by the CPU

## RAM is volatile

• Its contents disappear if the power is turned off
• and all this currently active information is lost

That’s why it’s prudent to save your work often

Electrical problems can be a real disaster

## Your computer has a finite amount of RAM

You can think of the RAM as

• a large collection of identical little boxes
• numbered from 1 up to 1000000000
• each box can hold a small amount of information.

Capacity is measured in bytes

What is the capacity of your computer?
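The numbered boxes can be pictured with a Python `bytearray`. This is only a rough model: the size of 16 boxes is an arbitrary choice, and Python numbers the boxes from 0 instead of 1.

```python
# A toy model of RAM: 16 identical boxes, each holding one small number.
# Real RAM has billions of boxes; 16 is an arbitrary size for illustration.
ram = bytearray(16)

# Python numbers the boxes 0, 1, ..., 15.
ram[3] = 87          # put the number 87 in box 3
ram[4] = ram[3] + 1  # read box 3, add one, store the result in box 4

capacity_in_bytes = len(ram)
print(list(ram))
print("capacity:", capacity_in_bytes, "bytes")
```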

## One byte = One character

For technical reasons modern computers handle memory in bytes

• memory big enough to hold a single character
• like W or @
• or a small integer number (0 to 255)

Byte = integer between 0 and 255 = one letter
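A short Python sketch of the same idea, using only built-in functions. The example values (`W`, `@`, and the bytes 72, 105, 33) are arbitrary.

```python
# One byte can be read as a small integer or as a character.
number = ord("W")   # character -> number
letter = chr(64)    # number -> character

print(number)       # 87
print(letter)       # @

# A bytes object is a sequence of integers between 0 and 255.
data = bytes([72, 105, 33])
print(list(data))            # [72, 105, 33]
print(data.decode("ascii"))  # Hi!

# Values outside 0..255 do not fit in one byte.
try:
    bytes([256])
except ValueError as error:
    print("error:", error)
```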

## Floating point

Numbers with decimals can be represented using scientific notation $1.8466 \cdot 10^{19}$

In the computer we write 1.8466E19

## Can also represent special values

• Inf: Positive Infinity, 1/0
• -Inf: Negative Infinity, -1/0
• NaN: Not a number, 0/0
• NA: Not Available, missing data
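A sketch of these values in Python. Two differences from the notation above are worth noting: plain Python raises an error for `1/0` instead of returning infinity, and it has no built-in `NA`. Using `None` for missing data is a common convention, assumed here.

```python
import math

# Scientific notation: 1.8466E19 means 1.8466 times 10 to the power 19.
big = 1.8466E19
print(big)

# Special floating point values.
pos_inf = float("inf")
neg_inf = float("-inf")
not_a_number = float("nan")

print(pos_inf > big)                 # True: infinity is larger than any number
print(neg_inf < -big)                # True
print(math.isnan(pos_inf - pos_inf)) # True: infinity minus infinity is undefined
print(not_a_number == not_a_number)  # False: NaN is not equal to anything

# Plain Python refuses to compute 1/0.
try:
    1 / 0
except ZeroDivisionError as error:
    print("error:", error)

# Python has no built-in NA; None is a common stand-in for missing data.
missing = None
```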

## This has some limitations

We have a fixed number of digits

Not all numbers are represented exactly

For example $\frac{1}{3}=0.33333333\cdots$ cannot be represented exactly with 10 digits
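The effect is easy to observe in Python. The sketch below rounds to 10 decimal digits only to mirror the example; the computer itself keeps a fixed number of binary digits.

```python
# Most fractions are stored as a nearby approximation.
third = 1 / 3
print(third)

# Keep only 10 decimal digits of 1/3.
ten_digits = round(third, 10)
print(ten_digits)            # 0.3333333333
print(ten_digits * 3 == 1)   # False: the rounded value is not exactly 1/3

# The same effect in everyday arithmetic.
print(0.1 + 0.2 == 0.3)      # False

# Comparing with a small tolerance is the usual workaround.
close_enough = abs((0.1 + 0.2) - 0.3) < 1e-9
print(close_enough)          # True
```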

## Example: Sound

• Sound is transformed into electricity by a microphone.
• The voltage is measured 44100 times each second
• Each sample is stored as a number on a CD

Two steps: sampling (in time) and discretization (in voltage)
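The two steps can be sketched in Python. There is no microphone here: a 440 Hz sine wave stands in for the voltage, which is an assumption made for illustration, as are the names `sample_tone` and `MAX_LEVEL`. The samples are rounded to 16-bit integers, the sample size used on audio CDs.

```python
import math

SAMPLE_RATE = 44100   # measurements per second
FREQUENCY = 440.0     # assumed test tone, in Hz
MAX_LEVEL = 32767     # largest value of a 16-bit sample

def sample_tone(duration_seconds):
    """Return a list of integer samples of a sine wave."""
    count = round(SAMPLE_RATE * duration_seconds)
    samples = []
    for n in range(count):
        t = n / SAMPLE_RATE                              # sampling: pick a moment in time
        voltage = math.sin(2 * math.pi * FREQUENCY * t)  # a value between -1 and 1
        samples.append(round(voltage * MAX_LEVEL))       # discretization: round to an integer
    return samples

samples = sample_tone(0.01)
print(len(samples))   # 441 samples for one hundredth of a second
print(samples[:5])
```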

## Example: Greyscale Image

• Each “point” has a value between 0 (black) and 255 (white)
• the correct name is pixel (picture element)
• they are stored line by line
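A small Python sketch of line-by-line storage. The 3 by 4 image and the helper `pixel` are made up for illustration.

```python
# A tiny greyscale image with 3 lines of 4 pixels: 0 is black, 255 is white.
image = [
    [0,   64,  128, 255],
    [64,  128, 255, 128],
    [128, 255, 128, 64],
]
height = len(image)
width = len(image[0])

# Store the pixels line by line as one flat sequence of bytes.
flat = bytes(value for row in image for value in row)

def pixel(row, column):
    """Find a pixel in the flat sequence from its row and column."""
    return flat[row * width + column]

print(len(flat))     # 12 bytes, one per pixel
print(pixel(1, 2))   # 255
```

To rebuild the picture from the flat sequence, a program must also know the width of the image.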

RAM forgets everything when it loses power

All data must be stored in secondary memory

Today secondary memory is

• hard disk
• USB stick
• Cloud storage

## Disks and secondary storage

• RAM is expensive, so it is not too big
• its contents disappear when the power is turned off
• Secondary storage holds data even when the power is turned off
• The most common kind is the magnetic disk
• also called the hard disk or hard drive
• data on the disk stays there indefinitely
• even if power fails

## Secondary storage is slow

Data, instructions, and everything else are stored on the disk for the long term

And brought into RAM only for a short time

Disk space is about 100 times cheaper than RAM

But accessing information is much slower.

## Homework: Memory size

1. What is the capacity of the memory of your computer?

2. What is the capacity of the disk?

## Structure of secondary memory

The disks store a huge amount of data

To organize it we use files

To organize the files we use folders, also called directories

## Files

Like the main memory, a file is just a list of bytes

The meaning of the file depends on the context

You can decide to change their meaning

Most of the time, the name of the file suggests a context

For example, an MP3 file is probably audio
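A minimal Python sketch of "same bytes, different meaning". The file name `example.dat` and the four byte values are arbitrary, and the file is created in a temporary folder.

```python
import os
import tempfile

# Write four bytes to a temporary file.
path = os.path.join(tempfile.mkdtemp(), "example.dat")
with open(path, "wb") as f:
    f.write(bytes([78, 111, 116, 101]))

# Read the same file in two different contexts.
with open(path, "rb") as f:
    as_numbers = list(f.read())
with open(path, "r", encoding="ascii") as f:
    as_text = f.read()

print(as_numbers)   # [78, 111, 116, 101]
print(as_text)      # Note
```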

## File attributes

Besides the data itself, files have metadata

That is, data about the data. For example

• Files have a name
• Files have a modification date, maybe other dates too
• Files have a size
• Files have permissions
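Python can show this metadata through `os.stat`. The sketch below creates its own small file, `notes.txt`, in a temporary folder, so the name and content are only examples. The exact permission values depend on the operating system.

```python
import os
import tempfile
import time

# Create a small file to inspect.
path = os.path.join(tempfile.mkdtemp(), "notes.txt")
with open(path, "w", encoding="ascii") as f:
    f.write("hello")

info = os.stat(path)

name = os.path.basename(path)          # the name
size = info.st_size                    # the size, in bytes
modified = time.ctime(info.st_mtime)   # the modification date
permissions = oct(info.st_mode & 0o777)  # the permissions, as an octal number

print(name, size, modified, permissions)
```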

## File names

The names of the files are “words”: a series of letters, numbers and some symbols

Technically, a filename is a string, that is, a list of characters

On most systems the maximum length of a filename is about 255 characters

Avoid /, :, +, |, <, *, >, " and '

Use letters (A-Z, a-z), numbers (0-9), ., -,   and _
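These rules can be checked with a few lines of Python. The function `is_safe_filename` is a hypothetical helper written for this example, and it assumes that only letters, digits, `.`, `-` and `_` are acceptable.

```python
import string

# Assumed set of safe characters: letters, digits, dot, hyphen, underscore.
ALLOWED = set(string.ascii_letters + string.digits + ".-_")

def is_safe_filename(name):
    """Return True if name is not empty and uses only the allowed characters."""
    return len(name) > 0 and all(ch in ALLOWED for ch in name)

print(is_safe_filename("report_2018-09-20.txt"))  # True
print(is_safe_filename("data:final.csv"))         # False, because of the colon
print(is_safe_filename("a/b"))                    # False, because of the slash
```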

## File names

In some systems lowercase and UPPERCASE letters are not equivalent. Be systematic and consistent

If the filename includes ., the text after the last . is called the extension

In Microsoft Windows (c) extensions are usually 3 letters

• EXE, JPG, DOC, XLS, TXT, CSV
• These are hints on how to interpret the file
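In Python, `os.path.splitext` separates the extension from the rest of the name. The file names below are made up.

```python
import os.path

for filename in ["photo.JPG", "table.csv", "archive.tar.gz", "README"]:
    root, extension = os.path.splitext(filename)
    print(filename, "->", repr(root), repr(extension))

# Only the text after the last dot counts as the extension.
print(os.path.splitext("archive.tar.gz"))   # ('archive.tar', '.gz')

# A name without a dot has an empty extension.
print(os.path.splitext("README"))           # ('README', '')
```

The extension is only a hint: renaming a file does not change the bytes inside it.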

## Kinds of file

At a low level there is only one type of file

For us, it is useful to separate them into two kinds:

Text Files:
each byte is a character, we can read it
Binary Files:
bytes are grouped in binary numbers, representing images, sounds, etc.

Among binary files we have EXE files, which are programs for Windows
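The difference can be seen by storing the same number in both ways. This Python sketch assumes a 4-byte little-endian integer for the binary form, which is one common choice among several.

```python
import struct

number = 1000000

# Text form: one byte per digit, readable by humans.
as_text = str(number).encode("ascii")
print(list(as_text))     # [49, 48, 48, 48, 48, 48, 48]

# Binary form: the bytes are grouped into one 4-byte integer.
as_binary = struct.pack("<i", number)
print(list(as_binary))   # [64, 66, 15, 0]

# A program that knows the format can recover the number.
recovered = struct.unpack("<i", as_binary)[0]
print(recovered)         # 1000000
```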

## Representing text

The most natural way to represent a text document is to encode each letter with a single byte

There is a basic standard for English, called ASCII

Each number from 0 to 127 is either a symbol or a special signal

• New Line
• End of Message
• Tab
• Space
• Backspace

## ASCII code

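The code of a character is the column number plus the row number. For example, A is 60 + 5 = 65.

|   | 30 | 40 | 50 | 60 | 70 | 80 | 90 | 100 | 110 | 120 |
|---|----|----|----|----|----|----|----|-----|-----|-----|
| 0 |    | (  | 2  | <  | F  | P  | Z  | d   | n   | x   |
| 1 |    | )  | 3  | =  | G  | Q  | [  | e   | o   | y   |
| 2 | (space) | \* | 4 | > | H | R | \\ | f | p | z |
| 3 | !  | +  | 5  | ?  | I  | S  | ]  | g   | q   | {   |
| 4 | "  | ,  | 6  | @  | J  | T  | ^  | h   | r   | \|  |
| 5 | #  | -  | 7  | A  | K  | U  | \_ | i   | s   | }   |
| 6 | \$ | .  | 8  | B  | L  | V  | \` | j   | t   | ~   |
| 7 | %  | /  | 9  | C  | M  | W  | a  | k   | u   |     |
| 8 | &  | 0  | :  | D  | N  | X  | b  | l   | v   |     |
| 9 | '  | 1  | ;  | E  | O  | Y  | c  | m   | w   |     |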

Non-English languages use numbers between 128 and 255 for symbols like “Ç”, “Ö”, “É”, “Ñ”
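The codes can be checked in Python. For the numbers above 127 the sketch assumes the Latin-1 encoding, which is one of several conventions for that range.

```python
# Each ASCII character corresponds to a number from 0 to 127.
text = "Hi there"
codes = list(text.encode("ascii"))
print(codes)   # [72, 105, 32, 116, 104, 101, 114, 101]

# Special signals have codes too.
print(ord("\n"), ord("\t"), ord("\b"))   # 10 9 8

# "Ñ" is not part of ASCII.
try:
    "Ñ".encode("ascii")
except UnicodeEncodeError:
    print("Ñ has no ASCII code")

# In Latin-1 it is a single byte with a value above 127.
latin1_code = list("Ñ".encode("latin-1"))
print(latin1_code)   # [209]
```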

## Text Files

• are universal
• are easy to read and write from a program
• do not have any style like bold or italic
• are like books without figures

Microsoft Word files (`doc` or `docx`) are NOT text files

You shall not use Word for this course

## Text files are for humans and computers

• Binary files are hard to read
• unless you have the correct program
• Text files can be read by humans
• Each byte is a letter
• Text files can be read by computers
• Data must be recyclable
• The output of one program is the input of another program
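A sketch of this recycling in Python, with both "programs" in one script for brevity. The column names and values are invented, and the text stays in memory instead of going to a real file.

```python
import csv
import io

# "Program" 1 writes its results as plain text, in CSV format.
output = io.StringIO()
writer = csv.writer(output)
writer.writerow(["sample", "weight"])
writer.writerow(["a", 1.5])
writer.writerow(["b", 2.25])
text = output.getvalue()
print(text)   # a human can read this

# "Program" 2 takes that text as its input and keeps working with it.
reader = csv.DictReader(io.StringIO(text))
total = sum(float(row["weight"]) for row in reader)
print(total)  # 3.75
```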

## Are computers helping us?

To many people, computers are not helping. Instead they feel like computers make things harder.

The same happened when electric motors were invented

## Doing the same thing gives the same results

Just changing the technology does not change the world

The real change happens when we do things in a different way

## Computers are not Typewriters

If we only replace typewriters with word processors, nothing changes

Microsoft Word is a technology for the 19th century

We need a new way to use computers

## Next Class

• Structured documents
• Markdown
• RStudio