“*Vertices* connected by *edges*”

or sometimes

“*nodes* connected by *links*”

Let’s say that the graph has \(n\) nodes

We can build an \(n×n\) matrix \(A\) such that

\[A_{ij} = \begin{cases} 1\quad\text{if }i\text{ is connected to }j\\ 0\quad\text{otherwise} \end{cases}\]

This is called the *adjacency matrix*

\[A= \begin{pmatrix} 0 & 1 & 0\\ 1 & 0 & 1\\ 0 & 1 & 0\\ \end{pmatrix} \]

The matrix elements are either 1 or 0
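As a sketch, this matrix can be built directly; here in Python with NumPy (illustration only — the bracketed outputs later in this document look like an R console):

```python
import numpy as np

# Adjacency matrix of the 3-node example above: node 1 -- node 2 -- node 3.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])

# Every entry is 0 or 1, and the matrix is symmetric (undirected graph).
print((A == A.T).all())
```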

\[A= \begin{pmatrix} 0 & 1 & 0 & 0\\ 1 & 0 & 1 & 1\\ 0 & 1 & 0 & 1\\ 0 & 1 & 1 & 0\\ \end{pmatrix} \]

\[A= \begin{pmatrix} 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1\\ 0 & 1 & 0 & 0\\ \end{pmatrix} \]

It is easy to see that the row sums are the number of out-neighbors \[\text{out–degree}(i) = \sum_j A_{ij}\]

Naturally, the column sums are the in-degrees \[\text{in–degree}(j) = \sum_i A_{ij}\]

If the matrix is symmetric, then row sums and column sums are the same \[\text{degree}(i) = \sum_j A_{ij}=\sum_j A_{ji}\]
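Each of these sums is one line of code; a sketch with NumPy, using the directed 4-node example above:

```python
import numpy as np

# Directed 4-node example from above.
A = np.array([[0, 1, 0, 0],
              [0, 0, 1, 0],
              [0, 0, 0, 1],
              [0, 1, 0, 0]])

out_degree = A.sum(axis=1)  # row sums:    [1, 1, 1, 1]
in_degree  = A.sum(axis=0)  # column sums: [0, 2, 1, 1]
```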

When we have a graph, we can always make its adjacency matrix

So we can always calculate the degree of each node

The *degree distribution* is the frequency of each degree

```
A B C D
1 3 2 2
```

`[1] 0.00 0.25 0.50 0.25`
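The table and frequencies above can be reproduced with a short sketch (Python with NumPy here; the bracketed output above appears to come from R). The degrees come from the undirected 4-node example:

```python
import numpy as np

# Undirected 4-node example above; call the nodes A, B, C, D.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 1],
              [0, 1, 0, 1],
              [0, 1, 1, 0]])

degrees = A.sum(axis=1)                     # [1, 3, 2, 2]
counts = np.bincount(degrees, minlength=4)  # how many nodes of degree 0, 1, 2, 3
distribution = counts / len(degrees)        # [0.00, 0.25, 0.50, 0.25]
```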

One way to identify “important” nodes is to look for the ones with the highest degree

These are called *hubs*

`[1] 3`
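Continuing the sketch above, the hub is simply the node with the maximum degree (node B, with degree 3, in this example):

```python
import numpy as np

A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 1],
              [0, 1, 0, 1],
              [0, 1, 1, 0]])

degrees = A.sum(axis=1)
max_degree = int(degrees.max())  # 3, matching the output above
hub = int(np.argmax(degrees))    # index 1, i.e. node B
```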

If \(A_{ij}=1\) then there is a direct way to go from \(i\) to \(j\)

If \(A_{ij}=0\) then we cannot go from \(i\) to \(j\) in one step

If instead we allow an intermediate node \(k\) and go \(i→k→j,\) then there may be several two-step ways to go from \(i\) to \(j\)

The number of two-step paths between \(i\) and \(j\) is \[\sum_k A_{ik}A_{kj}\]

The sum above is exactly the \((i,j)\) entry of \(A^2\), so \(A^2\) represents the number of two-step paths between each pair of nodes

In other words, \((A^2)_{ij}≠0\) if there is at least one two-step path between \(i\) and \(j\)
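As a quick check of this claim, a sketch with the undirected 4-node example from above:

```python
import numpy as np

A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 1],
              [0, 1, 0, 1],
              [0, 1, 1, 0]])

A2 = A @ A  # A2[i, j] = number of two-step paths from i to j
# e.g. A2[0, 2] == 1: the only two-step path A -> C is A -> B -> C,
# and A2[1, 1] == 3: from B there are three round trips of length two
```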

In the same way, \(A^k\) represents the number of \(k\)-step paths between each pair of nodes

We can calculate the *distance* between any pair \((i,j)\) by finding the smallest \(k\) such that \((A^k)_{ij}≠0\)

If \((A^k)_{ij}=0\) for every \(k\) up to \(n\) then there is no path from \(i\) to \(j\) (a shortest path never repeats a node, so it has at most \(n-1\) steps)

Notice that \(A^k\) itself is not the distance; it only counts paths of length exactly \(k\)

We build the distance matrix \(D\) by looking at each power \(A^k\) in turn
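A naive sketch of this construction, fine only for small graphs:

```python
import numpy as np

def distance_matrix(A):
    """D[i, j] = smallest k with (A^k)[i, j] != 0; inf if j is unreachable."""
    n = A.shape[0]
    D = np.full((n, n), np.inf)
    np.fill_diagonal(D, 0.0)  # every node is at distance 0 from itself
    Ak = np.eye(n, dtype=int)
    for k in range(1, n):     # a shortest path uses at most n - 1 steps
        Ak = Ak @ A
        newly_reached = (Ak != 0) & np.isinf(D)
        D[newly_reached] = k
    return D

# Directed 4-node example above: the only route from the first node
# to the last one is 1 -> 2 -> 3 -> 4, so its distance is 3.
A = np.array([[0, 1, 0, 0],
              [0, 0, 1, 0],
              [0, 0, 0, 1],
              [0, 1, 0, 0]])
D = distance_matrix(A)  # D[0, 3] == 3; D[1, 0] == inf (no edge enters node 1)
```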

There are many efficient methods to calculate the distance between nodes

It is interesting to see the distribution of distances

Most of the graphs we study are too big to draw, so we need tools to understand them without looking at them

We will move around the graph choosing random neighbors

Our position is represented by a vector \(𝐮\) with \(n\) elements

\(𝐮_i=1\) if we are at the node \(i\)

We make a new matrix \(P\) by dividing each entry of \(A\) by the *out–degree* of the node the step starts from \[P_{ij} = \frac{A_{ji}}{\sum_k A_{jk}}\]

Here \(P_{ij}\) represents the probability that we arrive at \(i\) if we were at \(j,\) so each column of \(P\) sums to 1
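A sketch of this construction with NumPy, using the undirected 4-node example from above (this assumes every node has at least one outgoing edge, otherwise the division produces 0/0):

```python
import numpy as np

# Undirected 4-node example above; out-degrees are [1, 3, 2, 2].
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 1],
              [0, 1, 0, 1],
              [0, 1, 1, 0]])

out_degree = A.sum(axis=1)
P = (A / out_degree[:, None]).T  # P[i, j] = probability of stepping j -> i
# each column of P sums to 1, so P is column-stochastic
```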

We do not have any initial preference, so we start on any node with the same probability \[𝐮_i =1/n\qquad\text{ for all }i\]

After one step our position will be \(𝐯_{(1)}=P 𝐮\)

After \(k\) steps our position will be \(𝐯_{(k)}=P^k 𝐮\)

When \(k\) is large enough, \(𝐯_{(k)}\) stabilizes (this requires that the walk can reach every node and is not trapped in a periodic cycle)

Let’s call \[𝐯_{(∞)}=\lim_{k\to∞}P^k 𝐮\]

Each element \(i\) of the vector \(𝐯_{(∞)}\) represents the probability of being on node \(i\) in a random walk

This is called *eigenvector centrality* and is another way to define the importance of each node; \(𝐯_{(∞)}\) is an eigenvector of \(P\) with eigenvalue 1, since \(P𝐯_{(∞)}=𝐯_{(∞)}\)
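A sketch of the whole computation by repeated multiplication. For an undirected connected graph the random walk's limit is proportional to the degrees (degree divided by twice the number of edges), which gives a handy check:

```python
import numpy as np

# Undirected 4-node example above; degrees [1, 3, 2, 2], 4 edges in total.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 1],
              [0, 1, 0, 1],
              [0, 1, 1, 0]])

out_degree = A.sum(axis=1)
P = (A / out_degree[:, None]).T  # column-stochastic transition matrix

n = A.shape[0]
v = np.full(n, 1 / n)            # uniform start: no initial preference
for _ in range(1000):            # v_(k) = P^k u, iterated until it stabilizes
    v = P @ v

# Limit for an undirected walk: degree / (2 * number of edges),
# here [1, 3, 2, 2] / 8 = [0.125, 0.375, 0.25, 0.25]
```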