A digraph is a graph with directed arcs between nodes. The directed arcs are signified by arrowheads as shown in the graph below. In many practical applications, the need to indicate a direction becomes apparent.
In a digraph, a node's in-degree and out-degree is the count of the number of arcs that are coming into or out of the node respectively. So for example, in the above digraph, the in-degree of node A is 1, and its out-degree is 2. Similarly, for node D, the in-degree and out-degree are 2 and 0 respectively.
A direct link between nodes (for example from C to D or from C to A) is often referred to as a one-stage path. A two-stage path is a path between two nodes via some other node. For example, there is a two-stage path from C to D, via A, and there is a two-stage path from C to B via A, etc.
We can generalise this to n-stage paths. In the digraph above you will not find any 3-stage paths, but many other digraphs have them. In such instances, it is even possible that an n-stage path revisits the starting node. You might ask why would you ever wish to consider a path that revisits a node. Such a scenario might be important if you think about the digraph as a graph of viral contamination links.
In general, any digraph can have any number of n-stage pathways. For example, there might be two or more one-stage paths from C to D. We can also have a one-stage path from a particular node directly back to itself. This is often referred to as a digraph loop.
A digraph is a graph with directed arcs (arrowheads) between nodes.
A node's in-degree is the number of arcs coming into the node while its out-degree is the number of arcs coming out of the node.
An n-stage path is the link between nodes.
The information in a digraph can be summarised using a dominance matrix M.
Consider the following table showing the one-stage paths between every node and every other node for the graph above. The starting nodes are listed in the left-most column and the finishing nodes are listed in the first row. If a one-stage path exists between two nodes, we put a 1 in the corresponding element of the matrix, otherwise we put a 0.
A | B | C | D | |
---|---|---|---|---|
A | 0 | 1 | 0 | 1 |
B | 0 | 0 | 0 | 0 |
C | 1 | 1 | 0 | 1 |
D | 0 | 0 | 0 | 0 |
So, for example, there is a one-stage path from A to B, and A to D. There are no one-stage paths between B and any other node (including itself). There are 3 one-stage paths from C, one each to A,B, and D. Finally, there are no one-stage paths between D and any other node (including itself).
Provided the labelling convention is understood, we can represent the table as the matrix M given as:
M = \begin{bmatrix} 0 & 1 & 0 & 1\\ 0 & 0 & 0 & 0 \\ 1 & 1 & 0 & 1 \\ 0 & 0 & 0 & 0 \end {bmatrix}
Note that the sum of any row represents the out-degree of the corresponding node, and the sum of any column represents the in-degree of the corresponding node. For example, the out-degree of A is 2 and the out-degree of C is 3, and the in-degree of A is 1 and that of B is 2.
Dominance matrix can be used to summarise the information in a digraph putting the number of stage path between nodes as elements.
The sum of rows represents the out-degree while the sum of columns represents the in-degree of their corresponding nodes.
In any digraph, the matrix M^2 becomes a matrix of two-stage paths.
This is a remarkable result and the observation can be extended to the power matrix M^n which reveals the n-stage paths of the digraph.
Using a graphic calculator, we can easily derive:
M^2= \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ \end{bmatrix}, M^3= \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ \end{bmatrix}
Hence, looking back at the digraph, we see that only two 2-stage paths exist (both starting from C), and, as stated earlier, no 3-stage paths exist at all.
If, in our example, the digraph represented the results of a tennis competition, where five games were played, then the five 1's in matrix M would represent these game results. The 1's in M would indicate either a loss or that no game was played.
Hence the row sums would show C as the tennis player having the most wins. Furthermore, the two-stage paths from C set up a sort of hierarchy of talent. C beat A, who in turn beat both B and D. So even if C had not played B or D, we can be reasonably confident that C is a better player than both of them.
It also shows that B and D are inseparable as players with the poorest results. The fact that they both have the same two-stage matrix column sum of 1, and the fact that M^3 is the zero matrix, means that we have no evidence, direct or otherwise, that one of them is a better player than the other.
In many digraphs however, examining the two-stage paths or higher might make it possible to rank these players. For example, two players may have won equal numbers of games, but, used as a tie-breaker, the two-stage path might show that one player is slightly more worthy of a ranking higher than the other.
To generate the matrix with two-stage or three-stage paths, we can take the power matrix M^2 or M^3 of matrix M using technology.
Generating the sum of rows of the dominance power matrices can be used to interpret the results of a game for example.
Suppose, in a certain round-robin tournament where everyone plays everyone else once, we summarise the results in a digraph and matrix M.
The 1-stage paths, expressed by M, can be thought of as direct information about rankings between any two players. For example, we might claim that A is better than C, because A beats C in a game.
The 2-stage paths, expressed by M^2, can be thought of as indirect ranking information. So for example we might say that A is most likely to be better than C because A beat B, and consequently B beat C.
The 3-stage paths, expressed by M^3, also contain indirect ranking information. A might possibly be better than C because A beat B, and B beat D who subsequently beat C.
Now one possible ranking strategy for the entire group of players might be to initially rank players according to the row sums of M. Then in an attempt to split tied results, check the row sums in the matrix sum M+M^2 of the tied players. If some players are still tied, check the row sums of those players with the matrix M+M^2+M^3. Continue in this manner to see if all ties can be eliminated.
Another alternative strategy is to form the rank by weighting the indirect information in some arbitrary way. For example, before the competition starts, alert all players that the ranking will be formed according to the row sums of the matrix R=M+0.5M^2+0.25M^3. The rationale for doing so, is that indirect information is not as conclusive as direct match results. The size of the weightings is again arbitrary.
Consider the network diagram below.
Create the dominance matrix, M, for this network.
Find the second stage dominance matrix M^2 using technology or otherwise.
To rank the players, find T=M+M^2 using technology or otherwise.
Determine the dominant node for the network according to T.
We can add the resulting dominant matrices to predict split tied results in competitions for example.
The dominant node has the highest sum rows of added dominance matrices.