“Vertices connected by edges”
or sometimes
“nodes connected by links”
Let’s say that the graph has \(n\) nodes
We can build an \(n×n\) matrix \(A\) such that
\[A_{ij} = \begin{cases} 1\quad\text{if }i\text{ is connected to }j\\ 0\quad\text{otherwise} \end{cases}\]
This is called the adjacency matrix of the graph
\[A= \begin{pmatrix} 0 & 1 & 0\\ 1 & 0 & 1\\ 0 & 1 & 0\\ \end{pmatrix} \]
The matrix elements are either 1 or 0
\[A= \begin{pmatrix} 0 & 1 & 0 & 0\\ 1 & 0 & 1 & 1\\ 0 & 1 & 0 & 1\\ 0 & 1 & 1 & 0\\ \end{pmatrix} \]
\[A= \begin{pmatrix} 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1\\ 0 & 1 & 0 & 0\\ \end{pmatrix} \]
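The two 4-node matrices above are easy to experiment with; here is a minimal sketch in Python with NumPy (an assumption, since these notes do not fix a language):

```python
import numpy as np

# Undirected 4-node graph: the matrix is symmetric
A_undirected = np.array([[0, 1, 0, 0],
                         [1, 0, 1, 1],
                         [0, 1, 0, 1],
                         [0, 1, 1, 0]])

# Directed 4-node graph: the matrix is not symmetric
A_directed = np.array([[0, 1, 0, 0],
                       [0, 0, 1, 0],
                       [0, 0, 0, 1],
                       [0, 1, 0, 0]])

print(np.array_equal(A_undirected, A_undirected.T))  # True
print(np.array_equal(A_directed, A_directed.T))      # False
```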
It is easy to see that the row sums are the number of out-neighbors \[\text{out–degree}(i) = \sum_j A_{ij}\]
Naturally, the column sums are the in-degrees \[\text{in–degree}(j) = \sum_i A_{ij}\]
If the matrix is symmetric, then row sums and column sums are the same \[\text{degree}(i) = \sum_j A_{ij}=\sum_j A_{ji}\]
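The row- and column-sum formulas above can be checked directly; a sketch with NumPy, using the directed example:

```python
import numpy as np

# Directed 4-node example from above
A = np.array([[0, 1, 0, 0],
              [0, 0, 1, 0],
              [0, 0, 0, 1],
              [0, 1, 0, 0]])

out_degree = A.sum(axis=1)  # row sums: out-degree of each node
in_degree = A.sum(axis=0)   # column sums: in-degree of each node
```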
When we have a graph, we can always make its adjacency matrix
So we can always calculate the degree of each node
The degree distribution is the relative frequency of each degree value
For the undirected 4-node example, the degrees are

Node:    A  B  C  D
Degree:  1  3  2  2

so the degree distribution over the degrees 0, 1, 2, 3 is

0.00  0.25  0.50  0.25
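The distribution can be computed by counting how often each degree value occurs; a sketch with NumPy on the undirected example:

```python
import numpy as np

A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 1],
              [0, 1, 0, 1],
              [0, 1, 1, 0]])

deg = A.sum(axis=1)                # degrees: [1, 3, 2, 2]
# count occurrences of each degree 0..max, divide by number of nodes
dist = np.bincount(deg, minlength=deg.max() + 1) / len(deg)
print(dist)                        # [0.   0.25 0.5  0.25]
```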
One way to determine the “important” nodes is to find the ones with the highest degree
These are called hubs
In our example the maximum degree is 3, attained by node B
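Finding the hub is a one-line lookup once the degrees are computed; a sketch with NumPy (the node labels A–D are the ones used in the table above):

```python
import numpy as np

A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 1],
              [0, 1, 0, 1],
              [0, 1, 1, 0]])
nodes = ["A", "B", "C", "D"]

deg = A.sum(axis=1)
hub = nodes[int(np.argmax(deg))]   # node with the highest degree
print(hub, deg.max())              # B 3
```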
If \(A_{ij}=1\) then there is a direct way to go from \(i\) to \(j\)
If \(A_{ij}=0\) then we cannot go from \(i\) to \(j\) in one step
If instead we allow an intermediate step through some node \(k,\) going \(i→k→j,\) then there may be many ways to go from \(i\) to \(j\)
The number of two-step paths from \(i\) to \(j\) is \[\sum_k A_{ik}A_{kj} = (A^2)_{ij}\]
This sum is exactly the matrix product, so \(A^2\) gives the number of two-step paths between each pair of nodes
In other words, \((A^2)_{ij}≠0\) if there is at least one two-step path between \(i\) and \(j\)
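We can verify the path-counting claim numerically; a sketch with NumPy on the undirected 4-node example:

```python
import numpy as np

A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 1],
              [0, 1, 0, 1],
              [0, 1, 1, 0]])

A2 = A @ A   # (A2)[i][j] counts the two-step paths i -> k -> j
# e.g. node B (index 1) has 3 neighbors, so there are 3 round trips B -> k -> B
print(A2[1][1])   # 3
# C and D (indices 2, 3) are joined via B and directly: C -> B -> C and C -> D -> C
print(A2[2][2])   # 2
```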
In the same way, \(A^k\) represents the number of \(k\)-step paths between each pair of nodes
We can calculate the distance between any pair \((i,j)\) by looking at the smallest \(k\) such that \((A^k)_{ij}≠0\)
If \((A^k)_{ij}=0\) for every \(k\le n-1,\) then there is no path from \(i\) to \(j\); checking \((A^n)_{ij}\) alone is not enough, because the only paths may be shorter than \(n\)
Notice that \(A^k\) itself is not the distance matrix
We build the distance matrix \(D\) by looking at each power \(A^k\)
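This procedure can be sketched with NumPy (real libraries use breadth-first search instead, which is much faster):

```python
import numpy as np

A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 1],
              [0, 1, 0, 1],
              [0, 1, 1, 0]])
n = len(A)

D = np.full((n, n), np.inf)   # distance matrix; inf means "no path found"
np.fill_diagonal(D, 0)        # each node is at distance 0 from itself
Ak = np.eye(n, dtype=int)
for k in range(1, n):         # paths of length up to n-1 are enough
    Ak = Ak @ A               # Ak now holds A to the power k
    # record k as the distance wherever a pair is first connected
    D[(Ak != 0) & (D == np.inf)] = k
```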
There are many efficient methods to calculate the distance between nodes
It is interesting to see the distribution of distances
Most of the graphs we study are too big to draw, so we need tools to understand them without looking at them
We will move around the graph choosing random neighbors
Our position is represented by a vector \(𝐮\) with \(n\) elements
\(𝐮_i=1\) if we are at node \(i,\) and \(𝐮_i=0\) otherwise
We make a new matrix \(P\) dividing each column of \(A\) by its sum (the degree of that node, for undirected graphs) \[P_{ij} = \frac{A_{ij}}{\sum_k A_{kj}}\]
Here \(P_{ij}\) represents the probability that we arrive at \(i\) if we were at \(j,\) so each column of \(P\) sums to 1
We do not have any initial preference, so we start on any node with the same probability \[𝐮_i =1/n\qquad\text{ for all }i\]
After one step our position will be \(𝐯_{(1)}=P 𝐮\)
After \(k\) steps our position will be \(𝐯_{(k)}=P^k 𝐮\)
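The iteration \(𝐯_{(k)}=P^k 𝐮\) is a short loop; a sketch with NumPy on the undirected 4-node example (100 steps is an arbitrary choice, large enough here for convergence):

```python
import numpy as np

A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 1],
              [0, 1, 0, 1],
              [0, 1, 1, 0]], dtype=float)

P = A / A.sum(axis=0)      # divide each column by its sum: columns of P sum to 1
n = len(A)
u = np.full(n, 1.0 / n)    # uniform starting position, u_i = 1/n

v = u.copy()
for _ in range(100):       # v_(k) = P^k u
    v = P @ v
# v stays a probability vector (its entries sum to 1) at every step
```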
When \(k\) is large enough, \(𝐯_{(k)}\) stabilizes (provided the graph is connected and not bipartite)
Let’s call \[𝐯_{(∞)}=\lim_{k\to∞}P^k 𝐮\]
Each element \(i\) of the vector \(𝐯_{(∞)}\) represents the probability of being on node \(i\) in a random walk
Since \(P𝐯_{(∞)}=𝐯_{(∞)},\) the vector \(𝐯_{(∞)}\) is an eigenvector of \(P\) with eigenvalue 1; this is called eigenvector centrality and is another way to define the importance of each node
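Instead of iterating, we can ask directly for the eigenvector of \(P\) with eigenvalue 1; a sketch with NumPy on the undirected example (normalizing the eigenvector so it sums to 1, like a probability vector):

```python
import numpy as np

A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 1],
              [0, 1, 0, 1],
              [0, 1, 1, 0]], dtype=float)
P = A / A.sum(axis=0)            # column-stochastic walk matrix

w, V = np.linalg.eig(P)          # eigenvalues w, eigenvectors in columns of V
i = np.argmin(np.abs(w - 1))     # locate the eigenvalue closest to 1
v_inf = np.real(V[:, i])
v_inf = v_inf / v_inf.sum()      # rescale so the entries sum to 1
# v_inf is the stationary distribution: the most visited node scores highest
```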