Ch. 2 Graph Theory

This chapter largely introduces the basics of network science, and has a bestiary of metrics to describe a network. I’ll just list them here. All the networks are considered undirected.

# make a graph to check our answers for this whole part
n <- 10
p <- 0.5
g <- igraph::erdos.renyi.game(n = n, p.or.m = p, type = "gnp")
plot(g)

Let’s just list a lot of definitions here:

  1. Degree: the degree \(k_{i}\) of node \(i\) is its number of edges. Summing degrees counts each edge twice, so the total number of edges satisfies \(L = \frac{1}{2}\sum_{i}k_{i}\), where \(L :=\) total number of edges in a network.
  2. Average degree: \(\langle k \rangle = \frac{1}{N}\sum_{i}k_{i} = \frac{2L}{N}\); the \(2L\) appears because each edge contributes degree to two nodes, and we divide by \(N\) for the average.
  3. Degree distribution: \(p_{k}\) is simply the probability to observe a node with degree \(k\) (where the node is randomly sampled from the network). Then we can write average degree as \(\langle k \rangle = \sum_{k}kp_{k}\).
  4. Adjacency matrix: \(A_{ij}\) is \(1\) if an edge exists between \(i\) and \(j\) and \(0\) otherwise. We can get node degree by summing over rows (or columns). Also note that \(2L = \sum_{ij}A_{ij}\).
  5. Maximum number of links: \(L_{\mathrm{max}} = {N\choose 2} = \frac{N(N-1)}{2}\), because each of the \(N\) nodes can connect to at most \(N-1\) others, and we divide by \(2\) because each edge would otherwise be counted once from each endpoint.
  6. Shortest (geodesic) path: \(d_{ij}\) is the length of the path between \(i\) and \(j\) that traverses the smallest number of edges. A path never contains loops or intersects itself.
  7. Powers of \(A\): the matrix power \(A^{d}\) is such that \(A^{d}_{ij}\) gives the number of length-\(d\) walks from \(i\) to \(j\) (walks, not paths: they may revisit nodes). Since a shortest route never revisits a node, \(d_{ij}\) is the smallest \(d\) such that \(A^{d}_{ij} > 0\).
  8. Diameter: \(d_{\mathrm{max}}\) is the maximum \(d_{ij}\) (longest shortest path).
  9. Average path length: \(\langle d \rangle = \frac{1}{N(N-1)}\sum_{ij,i\neq j}d_{ij}\).
  10. Clustering coefficient: the local clustering coefficient for node \(i\) is \(C_{i} = \frac{2L_{i}}{k_{i}(k_{i} - 1)}\), where \(L_{i}\) is the number of links between the neighbors of node \(i\). It can be interpreted as a probability of two random neighbors of \(i\) having a link between them.
  11. Average clustering coefficient: as you might guess, we can always average \(\langle C \rangle = \frac{1}{N}\sum_{i}C_{i}\). The average is now the probability that two neighbors of a randomly selected \(i\) have an edge between them.
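A few of these formulas can be checked by hand in base R on a tiny made-up graph (a 4-node path 1-2-3-4 plus the chord 2-4; this toy example is mine, not from the book):

```r
# Toy undirected graph: edges 1-2, 2-3, 3-4, 2-4
A <- matrix(0, 4, 4)
edges <- rbind(c(1, 2), c(2, 3), c(3, 4), c(2, 4))
A[edges] <- 1
A[edges[, 2:1]] <- 1      # symmetrize: A_ij = A_ji for an undirected graph

k <- rowSums(A)           # degrees k_i: 1 3 2 2
L <- sum(A) / 2           # sum_ij A_ij = 2L, so L = 4
kbar <- 2 * L / nrow(A)   # average degree <k> = 2L / N = 2

A2 <- A %*% A             # (A^2)_ij = number of length-2 walks i -> j
# there is no direct edge 1-4, but A2[1, 4] = 1 (the walk 1-2-4), so d_14 = 2
```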

Global clustering coefficient

Wikipedia as usual provides some explanation, and also some help at StackExchange.

The formula in the book is not well explained. The one from Newman’s book is:

\[ C = \frac{6 \times \text{number of triangles}}{\text{number of paths of length 2}} \] The numerator is simply the trace of \(A^{3}\): each triangle contributes \(6\) closed walks of length 3, because we can start from any of its 3 nodes and go around either clockwise or counterclockwise.

library(expm)  # provides the matrix power operator %^%

A <- as.matrix(igraph::as_adj(g))
A2 <- A %^% 2
diag(A2) <- 0  # drop closed walks i -> j -> i; a length-2 path needs distinct endpoints

A3 <- A %^% 3

sum(diag(A3)) / sum(A2)  # tr(A^3) / number of paths of length 2
## [1] 0.5
transitivity(graph = g, type = "global")
## [1] 0.5
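As a deterministic sanity check (my own toy example, not from the book): a triangle with one pendant node has exactly one triangle and ten paths of length 2 (counting both directions), so the formula should give \(C = 6/10 = 0.6\).

```r
# Triangle 1-2-3 plus a pendant node 4 attached to node 1
A <- matrix(0, 4, 4)
edges <- rbind(c(1, 2), c(2, 3), c(1, 3), c(1, 4))
A[edges] <- 1
A[edges[, 2:1]] <- 1

A2 <- A %*% A
diag(A2) <- 0                # a length-2 path needs distinct endpoints
A3 <- A %*% A %*% A

sum(diag(A3)) / sum(A2)      # 6 * 1 triangle / 10 paths
## [1] 0.6
```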

Homework

1. Which of the icons in Figure 2.19 can be drawn without raising your pencil from the paper, and without drawing any line more than once? Why?

As we know, a connected graph has an Eulerian path only if it has at most two nodes of odd degree, so all but b can be drawn without raising the pencil.
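The odd-degree criterion is easy to script. Here is a minimal base-R sketch (my own helper, not from the book; it assumes the graph is connected):

```r
# Count vertices of odd degree; a connected graph has an
# Eulerian path iff this count is 0 or 2
odd_degree_count <- function(A) sum(rowSums(A) %% 2 == 1)
has_eulerian_path <- function(A) odd_degree_count(A) <= 2

# triangle: every degree is 2 (even) -> drawable
tri <- matrix(c(0, 1, 1,
                1, 0, 1,
                1, 1, 0), nrow = 3, byrow = TRUE)
has_eulerian_path(tri)
## [1] TRUE

# star with 3 leaves: four odd-degree vertices -> not drawable
star <- matrix(0, 4, 4)
star[1, 2:4] <- star[2:4, 1] <- 1
has_eulerian_path(star)
## [1] FALSE
```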

2. Let \(A\) be the \(N \times N\) adjacency matrix of an undirected unweighted network, without self-loops. Let \(\mathbf{1}\) be a column vector of \(N\) elements, all equal to 1. In other words \(\mathbf{1} = (1, 1, \ldots, 1)^{T}\), where the superscript \(T\) indicates the transpose operation. Use the matrix formalism (multiplicative constants, multiplication row by column, matrix operations like transpose and trace, etc., but avoid the sum symbol \(\sum\)) to write expressions for:

a. The vector \(\mathbf{k}\) whose elements are the degrees \(k_{i}\) of all nodes \(i = 1, 2,..., N\).

This is a simple matrix vector product \(A\mathbf{1} = \mathbf{k}\).

igraph::degree(g)
##  [1] 4 5 7 3 3 6 4 5 2 7
k <- as.vector(A %*% rep(1,n))
k
##  [1] 4 5 7 3 3 6 4 5 2 7
d. The vector \(\mathbf{k}_{nn}\) whose element \(i\) is the sum of the degrees of node \(i\)’s neighbors.

To get this, note we already know the degree vector \(A\mathbf{1} = \mathbf{k}\). Since \(A\) is symmetric, multiplying by \(\mathbf{k}\) on the left, \(\mathbf{k}^{T}A\), gives the same result as \(A\mathbf{k}\): row \(i\) of \(A\) picks out exactly the neighbors of \(i\), so element \(i\) of \(A\mathbf{k}\) sums the degrees of those neighbors. In other words, \(\mathbf{k}_{nn} = A\mathbf{k} = A^{2}\mathbf{1}\).

# computationally, with igraph
vapply(X = igraph::ego(g, mindist = 1, order = 1), FUN = function(ki) {
    sum(igraph::degree(graph = g, v = as.vector(ki), loops = FALSE))
}, FUN.VALUE = numeric(1))
##  [1] 23 26 33 20 18 27 21 26 13 31
as.vector(k %*% A)
##  [1] 23 26 33 20 18 27 21 26 13 31
e. The vector \(\mathbf{k}_{nnn}\) whose element \(i\) is the sum of the degrees of node \(i\)’s second neighbors.

The second neighbors of \(i\) are those nodes \(j \neq i\) for which \(A_{ij} = 0\) but \(A^{2}_{ij} > 0\), that is, nodes that can only be reached by paths of length \(2\). Let's cheat a bit and use the distances function from igraph.

vapply(X = igraph::ego(g, mindist = 2, order = 2), FUN = function(ki) {
    sum(igraph::degree(graph = g, v = as.vector(ki), loops = FALSE))
}, FUN.VALUE = numeric(1))
##  [1] 19 15  6 23 25 13 21 15 31  8
as.vector(k %*% ifelse(distances(g) == 2, 1, 0))
##  [1] 19 15  6 23 25 13 21 15 31  8

4. Degree, Clustering Coefficient and Components

a. Consider an undirected network of size \(N\) in which each node has degree \(k = 1\). Which condition does \(N\) have to satisfy? What is the degree distribution of this network? How many components does the network have?

Two nodes with \(k=1\) can only have one edge between them and cannot connect to any other nodes, so the network is a collection of disjoint pairs. \(N\) must be even, otherwise some node would be left with \(k=0\) or forced to \(k>1\). The degree distribution is \(p_{1} = 1\) (and \(p_{k} = 0\) for all other \(k\)), and the network has \(N/2\) components.

It’s easy to see if we plot it:

plot(sample_k_regular(10,1))

b. Consider now a network in which each node has degree \(k = 2\) and clustering coefficient \(C = 1\). What does the network look like? What condition does \(N\) satisfy in this case?

Let’s think logically about this. For all \(i\), we have \(k_{i} =2\). So if we only focus on the edges connected to \(i\) we get:

plot(graph.tree(n = 3,children = 2,"undir"))

The clustering coefficient is \(C_{i} = \frac{2L_{i}}{k_{i}(k_{i} - 1)} = \frac{2L_{i}}{2} = L_{i}\), so \(C_{i} = 1\) forces \(L_{i} = 1\): the two neighbors of \(i\) must share an edge.

Thus the network must be made of disconnected triangles, and \(N\) must be a multiple of \(3\).

g <- sample_degseq(c(2,2,2), method = "simple.no.multiple")

transitivity(g)
## [1] 1
degree(g)
## [1] 2 2 2
plot(g)

Ch. 3 Random Networks

Erdős–Rényi random graph model

We use the shorthand \(G(N, p)\) for the model, indicating we specify \(N\) nodes, and the probability \(p\) that each of the possible \(N(N-1)/2\) pairs of nodes has an edge. Then we can work out the probability that a realization of \(G(N, p)\) has \(L\) edges:

\[ p_{L} = {N(N-1)/2 \choose L} p^{L} (1-p)^{N(N-1)/2 - L} \]

It’s a Binomial distribution with a sample space of \(L\) going from \(0, ..., N(N-1)/2\). Note that edges are independent. The mean \(\langle L \rangle = p N(N-1)/2\) is easy from the Binomial distribution.

We can also get the average degree \(\langle k \rangle\) from this too. Note that multiplying \(2\langle L \rangle\) gives us the average total degree of the network (because each edge contributes 2 endpoints). Divide by \(N\) to get the average degree:

\[ \langle k \rangle = \frac{2\langle L \rangle}{N} = p(N-1) \]

We can also arrive at the average degree by noting that each node can make connections to up to \(N-1\) other nodes, each with probability \(p\), so its degree is a Binomial\((N-1, p)\) random variable with mean \(p(N-1)\).
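A quick base-R simulation to check \(\langle k \rangle = p(N-1)\), no igraph needed (the values of N and p here are arbitrary choices of mine): flip one independent Bernoulli(\(p\)) coin per node pair.

```r
set.seed(1)
N <- 1000
p <- 0.01

# one coin flip per pair: fill the upper triangle, then mirror it
A <- matrix(0, N, N)
up <- upper.tri(A)
A[up] <- rbinom(sum(up), size = 1, prob = p)
A <- A + t(A)

mean(rowSums(A))    # should be close to p * (N - 1) = 9.99
```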

Small world

This is a big idea of this chapter. Basically small world networks are those whose average path length only grows like \(\log{N}\), where \(N\) is the number of nodes. It’s the case that the Erdős–Rényi random graph model (the most vanilla random graph) actually exhibits the small world phenomenon despite having a thin tailed (binomial) degree distribution.

I’ll try to go through their somewhat odd derivation here. First they say that, on average, a node has \(\langle k \rangle^1\) nodes at distance \(d=1\), and \(\langle k \rangle^d\) nodes at distance \(d\). Let’s see how this holds up for a big network.

k <- 8
N <- 1e4
p <- k / (N-1)
g <- igraph::erdos.renyi.game(n = N, p.or.m = p, type = "gnp")

table(distances(g,v = 1))
## 
##    0    1    2    3    4    5    6    7  Inf 
##    1    7   43  379 2558 6108  895    7    2
k^(0:6)
## [1]      1      8     64    512   4096  32768 262144

Well, I wouldn’t say this is a great approximation. It fails because it assumes each node’s links all lead to new nodes. That can’t possibly be true: at the very least, even in a tree, every node past the first shell uses one link to connect back to its parent in the preceding shell, leaving only \(\langle k \rangle - 1\) links to fan outwards.

Okay then they say the expected number of nodes up to distance \(d\) away from the starting node is:

\[ N(d) = 1 + \langle k \rangle + \langle k \rangle^2 + \dots + \langle k \rangle^{d} = \frac{\langle k \rangle ^{d+1} - 1}{\langle k \rangle - 1} \] Okay. Sure. Now they say \(N(d) < N\), yes, I agree. But then they say that at the maximum distance (diameter) we should have about:

\[ N(d_{\text{max}}) \approx N \] Hmm. Dunno how I feel about this. Now they just wantonly drop the \(-1\)s since \(\langle k \rangle \gg 1\) to get:

\[ N(d) \approx \frac{\langle k \rangle ^{d+1}}{\langle k \rangle} = \langle k \rangle^d \] Plug into the earlier equation.

\[ N(d_{\text{max}}) = \langle k \rangle^{d_{\text{max}}} = N \\ d_{\text{max}} = \log_{\langle k \rangle}N \\ d_{\text{max}} = \frac{\ln N}{\ln \langle k \rangle} \] I have serious reservations about this but whatever.

Oh, wait, I just realized this all makes “sense” if you let \(N\to\infty\). But they didn’t say that, did they?
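For what it’s worth, the prediction does line up with the bulk of the distance table above. With \(N = 10^4\) and \(\langle k \rangle = 8\):

```r
log(1e4) / log(8)
## [1] 4.429237
```

That sits right where most of the nodes are (distances 4 and 5), even though the measured diameter is 7. So \(\ln N / \ln \langle k \rangle\) reads better as a typical distance than as a strict maximum.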

Homework

4. Cayley trees

cayley.tree <- function(k, t) {
  # Cayley tree: the central node has k neighbors, every other internal
  # node also has degree k, and the leaves sit at distance t from the center
  if (t == 0) return(graph.empty(1))
  if (t == 1) return(graph.tree(k + 1, k, mode = "undir"))
  d <- k - 1
  # build two d-ary trees, of depth t and depth t - 1 ...
  n1 <- (d^(t + 1) - 1) / (d - 1)  # nodes in a d-ary tree of depth t
  n2 <- (d^t - 1) / (d - 1)        # nodes in a d-ary tree of depth t - 1
  g <- graph.tree(n1, d, mode = "undir") %du% graph.tree(n2, d, mode = "undir")
  # ... and join their roots: the first root gains its k-th branch, and
  # the second root's degree rises from d to k
  g <- add.edges(g, c(1, n1 + 1))
  return(g)
}

k <- 3
t <- 5
g <- cayley.tree(k = k, t = t)
plot(g)

a. Calculate the number of nodes reachable in \(t\) steps from the central node.

To get the number of nodes in the outer “shell” \(t\) steps away we have:

\[ k (k-1)^{t-1} \]

This is because on each step after the first, only \(k-1\) of the links “fan out” towards new nodes, 1 link always has to connect back to the root, and after the first step, we step outwards \(t-1\) times.

For the total number of nodes at most \(t\) steps away, we get:

\[ \sum_{i=1}^{t} k (k-1)^{i-1} = k \sum_{i=1}^{t} (k-1)^{i-1} = k \left( \frac{(k-1)^t - 1}{(k-1) - 1} \right) \]

The sum simplifies because it is a geometric sum. In general, say it equals some \(s\):

\[ s = \sum_{n=1}^{N} x^{n-1} = 1 + x + x^2 + \dots + x^{N-1} \\ xs = x + x^2 + \dots + x^{N-1} + x^{N} \\ s - xs = 1 - x^{N} \\ s(1 - x) = 1 - x^{N} \\ s = \frac{1 - x^{N}}{1 - x} = \frac{x^{N} - 1}{x - 1} \] Thus the total number of vertices of a Cayley tree of depth \(t\), in which every internal node has degree \(k\), is:

\[ k \left( \frac{(k-1)^t - 1}{(k-1) - 1} \right) + 1 \] Or in code,

cayley_size <- function(k, t) {
  (k * ((((k - 1)^t) - 1) / ((k - 1) - 1))) + 1
}

# for the one in the text
cayley_size(k = k, t = t)
## [1] 94
igraph::vcount(g)
## [1] 94
b. Calculate the degree distribution of the network.

The outer shell has \(k (k-1)^{t-1}\) vertices with degree 1, and the other \(\left(k \left( \frac{(k-1)^t - 1}{(k-1) - 1} \right) + 1\right) - \left(k (k-1)^{t-1}\right)\) vertices have degree \(k\).

(k*((k-1)^(t-1))) / cayley_size(k = k, t = t)
## [1] 0.5106383
table(degree(g))/cayley_size(k,t)
## 
##         1         3 
## 0.5106383 0.4893617
c. Calculate the diameter \(d_{\mathrm{max}}\).
igraph::diameter(g)
## [1] 10
d. Find an expression for the diameter \(d_{\mathrm{max}}\) in terms of the total number of nodes \(N\).

I’m not sure why we’d do this in terms of the number of nodes \(N\) rather than the depth \(t\), because it’s just \(d_{\mathrm{max}} = 2t\): the farthest-apart nodes are pairs of leaves in different branches, and to get from one leaf to another we first go in to the central node and then out again to the other leaf.

We can do it in terms of \(N\). At distance \(d\) we have \(N(d) = k \left( \frac{(k-1)^d - 1}{(k-1) - 1} \right) + 1\) nodes.

For a depth \(t\), we have \(d_{\mathrm{max}}/2 = t\), so plug in and solve for it:

\[ N = k \left( \frac{(k-1)^{d_{\mathrm{max}}/2} - 1}{(k-1) - 1} \right) + 1 \\ N - 1 = \frac{k}{k-2}\left( (k-1)^{d_{\mathrm{max}}/2} - 1 \right) \\ 1 + \frac{(N - 1)(k-2)}{k} = (k-1)^{d_{\mathrm{max}}/2} \\ \log{\left(1 + \frac{(N - 1)(k-2)}{k}\right)} = ({d_{\mathrm{max}}/2}) \log\left( k-1 \right) \\ d_{\mathrm{max}} = \frac{2}{\log(k-1)}\log{\left(1 + \frac{(N - 1)(k-2)}{k}\right)} \]

I don’t really know where to go from here, so let’s just assume \(N \gg 1\); honestly, this is the kind of thing the authors did in the text without being clear about it, so whatever.

\[ \frac{2}{\log(k-1)}\log{\left(1 + \frac{(N - 1)(k-2)}{k}\right)} \approx \frac{2}{\log(k-1)}\log{\left(1 + \frac{N(k-2)}{k}\right)} \]

Then simplify to:

\[ d_{\mathrm{max}} \approx \frac{2\log(N)}{\log(k-1)} \]

So it’s a small world after all.
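We can check the exact expression against the tree built earlier (\(k = 3\), \(t = 5\), \(N = 94\), measured diameter 10):

```r
k <- 3
N <- 94
(2 / log(k - 1)) * log(1 + (N - 1) * (k - 2) / k)
## [1] 10
```

The \(N \gg 1\) approximation \(2\log(N)/\log(k-1) \approx 13.1\) overshoots at this small size, as you’d expect.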

Ch. 4 The Scale Free Property

The scale-free property seems to describe any network whose degree distribution is sufficiently heavy-tailed that some moments may not be finite. In this case, node degrees can have arbitrarily large fluctuations around the mean.

One big idea here is that we can characterize a network as scale-free or not by how the expected size of the biggest hub changes with increasing \(N\). To do this, call \(k_{\text{max}}\) the degree of the largest hub. Let’s approximate the degree distribution with a density function such that the probability of observing a node of that degree or larger is \(1/N\):

\[ \int_{k_{\text{max}}}^{\infty} p(k)\,dk = 1/N \] Then we find \(k_{\text{max}}\) such that the equation holds. For light-tailed degree distributions, such as Exponential and Poisson, \(k_{\text{max}}\) only grows like \(\log(N)\), so quite slowly. Scale-free networks show a polynomial dependence, so the size of the largest hub grows quickly with \(N\).
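Solving the tail condition in closed form for two illustrative densities (the parameter choices below are mine; \(k_{\min}\) is the smallest degree): an exponential tail \(p(k) \propto e^{-\lambda k}\) gives \(k_{\text{max}} = k_{\min} + \ln(N)/\lambda\), while a pure power law \(p(k) \propto k^{-\gamma}\) gives \(k_{\text{max}} = k_{\min} N^{1/(\gamma - 1)}\).

```r
kmax_exponential <- function(N, kmin = 1, lambda = 1) kmin + log(N) / lambda
kmax_power_law   <- function(N, kmin = 1, gamma = 2.5) kmin * N^(1 / (gamma - 1))

N <- 10^(2:8)
# logarithmic vs polynomial growth of the biggest hub
rbind(exponential = kmax_exponential(N),
      power_law   = kmax_power_law(N))
```

For \(N = 10^8\) the exponential tail predicts a biggest hub of degree about 19, while the \(\gamma = 2.5\) power law predicts one of degree around \(2 \times 10^{5}\).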

Homework

1.

a.