Class 2

Welcome back

to “Computing for Molecular Biology 1”

Last week we talked about

why computing is important to us
what a computer is
what a computer can do
some parts of a computer
some strategies for effective learning

Can you tell something about this?

NCBI

Information

What can be represented by numbers?

The smallest information piece

The simplest answer to a question is yes or no

When we bet on a tossed coin, what do we know?

This elementary unit of information is called a bit
(binary digit)

It can be represented by on/off, true/false, 0/1, etc.

Binary representation

For technical reasons, modern computers handle bits only in groups of 8

That is called a byte and can represent a number in the range 0 to 255
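As a minimal sketch of why 8 bits give the range 0 to 255, the Python lines below turn a list of bits into a number and back. The function name `bits_to_number` is made up for this illustration.

```python
def bits_to_number(bits):
    """Read a list of 0/1 values as a binary number, most significant bit first."""
    value = 0
    for bit in bits:
        value = value * 2 + bit  # shift what we have so far, then add the new bit
    return value

print(bits_to_number([0, 0, 0, 0, 0, 0, 0, 0]))  # 0, the smallest byte
print(bits_to_number([1, 1, 1, 1, 1, 1, 1, 1]))  # 255, the largest byte
print(format(200, "08b"))                        # 11001000, the 8 bits of 200
```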

Using two bytes we can represent numbers between 0 and 65535

How? If $x$ and $y$ are two bytes, we can evaluate \[x+256 y\]
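The formula can be tried out directly. This is a small sketch in Python; the function names are hypothetical and chosen only for this example.

```python
def combine_two_bytes(x, y):
    """Return the number x + 256*y encoded by the bytes x and y."""
    if not (0 <= x <= 255 and 0 <= y <= 255):
        raise ValueError("each byte must be between 0 and 255")
    return x + 256 * y

def split_into_two_bytes(n):
    """Inverse operation: find the bytes (x, y) such that n == x + 256*y."""
    if not 0 <= n <= 65535:
        raise ValueError("number must be between 0 and 65535")
    return n % 256, n // 256

print(combine_two_bytes(255, 255))  # 65535, the largest two-byte number
print(split_into_two_bytes(1000))   # (232, 3), because 232 + 256*3 == 1000
```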

Bigger numbers

binary representation

The idea can be extended to using 4 bytes

0 to 4 294 967 295

and also to using 8 bytes

From 0 to 18 446 744 073 709 551 615, which is about \[1.8447 \cdot 10^{19}\]
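These limits can be checked with a few lines of Python. Each extra byte multiplies the number of possible values by 256, so the largest value with $n$ bytes is $256^n - 1$. The function name `largest_unsigned` is an assumption of this sketch.

```python
def largest_unsigned(n_bytes):
    """Largest non-negative integer that fits in n_bytes bytes."""
    return 256 ** n_bytes - 1

for n_bytes in (1, 2, 4, 8):
    print(n_bytes, largest_unsigned(n_bytes))

# The 8-byte limit written in scientific notation
print(f"{largest_unsigned(8):.4e}")  # 1.8447e+19
```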

Signed numbers

With a small modification we can also represent negative numbers.

For example, a number $x$ between -128 and 127 can be represented as $x+128$

Note: in practice computers use a similar but different encoding, called two's complement
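The sketch below compares the $x+128$ idea with two's complement, which is assumed here to be the encoding used in practice (it is what most current hardware uses). The function names are hypothetical.

```python
def encode_offset(x):
    """The encoding from the slide: store x as the byte x + 128."""
    if not -128 <= x <= 127:
        raise ValueError("x must be between -128 and 127")
    return x + 128

def decode_offset(byte):
    """Recover x from the byte x + 128."""
    return byte - 128

def encode_twos_complement(x):
    """Two's complement: negative numbers wrap around from 256."""
    if not -128 <= x <= 127:
        raise ValueError("x must be between -128 and 127")
    return x % 256

print(encode_offset(-1))           # 127
print(encode_twos_complement(-1))  # 255

# Python can also decode a two's complement byte by itself
print(int.from_bytes(bytes([255]), "big", signed=True))  # -1
```

Both encodings use all 256 byte values exactly once; they only differ in which byte stands for which number.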

Positive and Negative Integers can be represented in Binary

Example: Sound

Sound is transformed into electricity by a microphone.
The voltage is measured 44100 times each second
Each sample is stored as a number on a CD

Two steps: sampling (in time) and discretization (in voltage)
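The two steps can be imitated in Python. This is only a sketch under assumptions that are not in the slide: the microphone voltage is simulated by a pure sine wave between -1 and 1, and each sample is stored in two bytes (65536 levels).

```python
import math

SAMPLE_RATE = 44100  # measurements per second, as on a CD

def sample_and_discretize(frequency_hz, duration_s, levels=65536):
    """Simulate a sine-wave voltage, sample it in time, and round each sample to a level."""
    n_samples = round(SAMPLE_RATE * duration_s)
    samples = []
    for i in range(n_samples):
        t = i / SAMPLE_RATE                                   # sampling (in time)
        voltage = math.sin(2 * math.pi * frequency_hz * t)    # simulated, between -1 and 1
        level = round((voltage + 1) / 2 * (levels - 1))       # discretization (in voltage)
        samples.append(level)
    return samples

samples = sample_and_discretize(440, 0.01)  # one hundredth of a second of a 440 Hz tone
print(len(samples))                         # 441
print(min(samples), max(samples))           # both between 0 and 65535
```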

Example: Greyscale Image

cats

Example: Greyscale Image

Each “point” has a value between 0 (black) and 255 (white)
The correct name is pixel (picture element)
they are stored line by line
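A tiny example of "line by line" storage, sketched in Python. The 2-by-3 image and the function names are invented for the illustration.

```python
def flatten_rows(image):
    """Store the image line by line, one byte per pixel."""
    return bytes(value for row in image for value in row)

def pixel_at(data, width, row, col):
    """Find a pixel in the stored bytes: skip `row` full lines, then `col` pixels."""
    return data[row * width + col]

image = [
    [0, 128, 255],   # first line: black, grey, white
    [255, 128, 0],   # second line: white, grey, black
]
data = flatten_rows(image)
print(list(data))               # [0, 128, 255, 255, 128, 0]
print(pixel_at(data, 3, 1, 0))  # 255, the first pixel of the second line
```

Note that the stored bytes alone do not say where each line ends, so the width has to be known separately.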

greyscale

Floating point

Using scientific notation we write \[1.8447 \cdot 10^{19}\] Using the same idea, we can use two numbers $x$ and $y$ to represent the value $x\cdot 2^y$

There are two versions: single and double precision

They use 4 and 8 bytes, respectively

Floating point standard

Notice that this approach has some limitations

Not all numbers are represented exactly
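A short Python sketch of these points. It assumes that Python's own numbers with decimals are double precision, which is the case on essentially every current platform; the variable names are chosen for this example.

```python
import math
import struct

# A number split into the two parts x and y of x * 2**y
x, y = math.frexp(6.0)
print(x, y)  # 0.75 3, because 6.0 == 0.75 * 2**3

# Sizes in bytes of single and double precision
single_size = struct.calcsize("<f")
double_size = struct.calcsize("<d")
print(single_size, double_size)  # 4 8

# 0.1, 0.2 and 0.3 cannot be stored exactly, so the sum is slightly off
sum_is_exact = (0.1 + 0.2 == 0.3)
print(sum_is_exact)  # False

# Squeezing 0.1 into single precision loses even more digits
as_single = struct.unpack("<f", struct.pack("<f", 0.1))[0]
print(as_single == 0.1)  # False
```

For this reason, results of calculations are usually compared with a small tolerance, for example with `math.isclose`, instead of `==`.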

Floating point can also represent special values

Inf: Positive Infinity, 1/0
-Inf: Negative Infinity, -1/0
NaN: Not a number, 0/0
NA: Not Available, missing data
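The special values can be explored in Python, with two caveats that are particular to that language: dividing by zero raises an error instead of returning Inf, and there is no built-in NA value. The sketch below therefore creates the values directly.

```python
import math

inf = float("inf")
nan = float("nan")

print(inf > 1e308)           # True: Inf is bigger than any ordinary number
print(-inf < -1e308)         # True
print(math.isnan(inf - inf)) # True: Inf minus Inf is not a number
print(nan == nan)            # False: NaN is not equal to anything, not even itself

try:
    1 / 0
except ZeroDivisionError:
    print("Python refuses to divide by zero")
```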

Two kinds of memory

The computer has memory to store information. It is made of electronic devices

When the computer is turned off, the information is lost

We need to copy the information to secondary storage

Secondary storage examples

hard disk
flash disk (USB stick)
cloud storage
diskette/floppy disk
zip disk
tape
punched cards

Memory size

How much can we store in the computer?

What is the size of the memory of your computer?

What is the size of the disk?

The memory (RAM) is like a desk.
The disk is like a book shelf.

Representing text

The most natural way to represent a text document is to encode each letter with a single byte

There is a basic standard for English, called ASCII

Each number from 0 to 127 is either a symbol or a special signal, such as

New Line
End of Message
Tab
Space
Backspace

ASCII code

	30	40	50	60	70	80	90	100	110	120
0		(	2	<	F	P	Z	d	n	x
1		)	3	=	G	Q	[	e	o	y
2		*	4	>	H	R	\	f	p	z
3	!	+	5	?	I	S	]	g	q	{
4	"	,	6	@	J	T	^	h	r	|
5	#	-	7	A	K	U	_	i	s	}
6	$	.	8	B	L	V	`	j	t	~
7	%	/	9	C	M	W	a	k	u
8	&	0	:	D	N	X	b	l	v
9	'	1	;	E	O	Y	c	m	w
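To read the table, add the column header to the row number: "A" is in column 60, row 5, so its code is 65. The same codes can be looked up in Python with the built-in functions `ord` and `chr`; the word "DNA" is just an example.

```python
text = "DNA"

codes = [ord(letter) for letter in text]  # letter -> number
print(codes)                              # [68, 78, 65]

print(chr(65))                            # A  (number -> letter)

encoded = text.encode("ascii")            # the text as bytes, one byte per letter
print(len(encoded))                       # 3
print(bytes(codes).decode("ascii"))       # DNA
```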

Numbers between 128 and 255 are not used in ASCII

Non-English languages use these values for symbols like “Ç”, “Ö”, “É”, “Ñ”
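As an illustration, the sketch below uses Latin-1, which is only one of several one-byte encodings that extend ASCII; choosing it here is an assumption, since the slide does not name a specific encoding.

```python
letters = ["Ç", "Ö", "É", "Ñ"]

# In Latin-1 each of these letters is one byte with a value above 127
latin1_codes = [letter.encode("latin-1")[0] for letter in letters]
print(latin1_codes)  # [199, 214, 201, 209]

# Plain ASCII has no code for them
try:
    "Ç".encode("ascii")
    ascii_works = True
except UnicodeEncodeError:
    ascii_works = False
print(ascii_works)   # False
```

Because different encodings assign different symbols to the values 128 to 255, a program reading such a file needs to know which encoding was used.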

Text Files

are universal
are easy to read and write from a program
do not have any style like bold or italic
are like books without figures
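A minimal sketch of how little is needed to write and read a text file from a program, here in Python. The file name `sequence.txt`, the function name, and the content are made up; the file is created in a temporary folder that is deleted at the end.

```python
import os
import tempfile

def write_and_read(text):
    """Write text to a file, then return what is read back and the file size in bytes."""
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "sequence.txt")
        with open(path, "w", encoding="ascii", newline="") as output_file:
            output_file.write(text)
        size = os.path.getsize(path)
        with open(path, "r", encoding="ascii", newline="") as input_file:
            content = input_file.read()
    return content, size

content, size = write_and_read("ATGGCC")
print(content)  # ATGGCC
print(size)     # 6, one byte per letter
```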

Microsoft Word files (doc or docx) are NOT text files

Thou shalt not use Word for this course