Let’s don’t rush into the coding part. Firstly, take a look at the network analysis.
1. What is a graph?
In graph theory, a graph is a representation of a set of objects where some pairs of objects are connected by links.It is a set of entities (aka vertices or nodes) and the connections (aka edges, links, or ties) among them. There may be an edge between every pair of nodes, no edges at all, or anything in between. You could refer to wikipedia of graph or this lecture note.
2. Network analysis: Centrality
Let’s Assume a graph of social web connections. People are nodes in the graph and they connect to each other by relationships. Nodes have weights, as people are influential to others. Therefore, we are concerned about the importance of nodes. There are several ways to calculate the importance of nodes, for example, Degree Centrality, Eigenvector Centrality, Katz Centrality, PageRank, Closeness Centrality, Betweenness Centrality, Transitivity. In the project, we mainly use degree centrality, eigenvector centrality and betweenness centrality to represent the importance of nodes. Degree centrality focuses on individual nodes – it simply counts the number of edges that a node has. High betweenness centrality nodes usually are those that occur on many shortest paths between other nodes in a graph. The eigenvector centrality of a node is proportional to the sum of the centrality scores of its neighbors. For more informations, you could refer to this lecture note concerning social and technological network analysis from University of Cambridge or wikipedia explaining the concept of centrality.
As we are more concerned about eigenvector centrality. Now let’s give a look at its algorithm.(The following comes mainly from wikipedia)
An eigenvector of a square matrix is a non-zero vector that, when the matrix multiples , yields a constant multiple of , the latter multiplier being commonly denoted by . That is:(for more informations, please look at wiki or this note)
Eigenvector centrality is a measure of the influence of a node in a network. It assigns relative scores to all nodes in the network based on the concept that connections to high-scoring nodes contribute more to the score of the node in question than equal connections to low-scoring nodes. We could use eigenvector adjacent matrix to find eigenvector centrality. For a given graph with number of vertices let be the adjacency matrix, i.e. if vertex is linked to vertex , and otherwise. The centrality score of vertex can be defined as:
where is a set of the neighbors of and is a constant. With a small rearrangement this can be rewritten in vector notation as the eigenvector equation
In general, there will be many different eigenvalues for which an eigenvector solution exists. However, the additional requirement that all the entries in the eigenvector be positive implies (by the Perron–Frobenius theorem) that only the greatest eigenvalue results in the desired centrality measure. The component of the related eigenvector then gives the centrality score of the vertex in the network. To compute, we use this algorithm:
- Calculate the eigendecomposition of the pairwise adjacency matrix of the graph.
- Select the eigenvector associated with largest eigenvalue.
- Element i in the eigenvector gives the centrality of the i-th node.
In my other blog posts, we will look at its applications in graph visualization, specially, simple graph visualizations on
brain visualizations on