Construct a mixing matrix for a graph, based on a specific vertex attribute
make_mixingmatrix(x, attrname, by_edge = FALSE, loops = has_loops(x))
A table (if by_edge is FALSE) or a list of two tables (if by_edge is TRUE).
A network mixing matrix is, traditionally, a two-dimensional cross-classification of edges by the values of a specific vertex attribute. It is an important tool for assessing network homophily or segregation and is often useful when subsequently constructing explanatory statistical models of the network.
Each cell ($i$, $j$) in the mixing matrix reports the number of edges in the graph where the sender has value $i$ on the vertex attribute and the receiver has value $j$ on that attribute. In an undirected graph, each edge with $i \neq j$ is counted twice, because an undirected edge between MALE and FEMALE is both an edge from MALE to FEMALE and an edge from FEMALE to MALE.
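The cross-classification described above can be sketched in base R. This is only an illustration of the counting rule, not the implementation used by make_mixingmatrix; the toy attribute vector sex and the edge list edges are hypothetical.

```r
# Hypothetical toy data: 5 vertices with a "sex" attribute and 6 edges.
sex <- c("F", "M", "F", "M", "M")
edges <- data.frame(from = c(1, 1, 2, 3, 4, 5),
                    to   = c(2, 3, 3, 4, 5, 1))
lev <- sort(unique(sex))

# Directed reading: cross-classify each edge by the attribute value
# of its sender (rows) and its receiver (columns).
mm_directed <- unclass(table(from = factor(sex[edges$from], levels = lev),
                             to   = factor(sex[edges$to],   levels = lev)))

# Undirected reading of the same edge list: an edge between values i != j
# is counted in cell (i, j) and in cell (j, i); edges within one value
# (the diagonal) are counted once.
mm_undirected <- mm_directed + t(mm_directed)
diag(mm_undirected) <- diag(mm_directed)

mm_directed
mm_undirected
```

In this toy example the undirected matrix is symmetric, and the number of edges (6) is recovered as the upper triangle plus the diagonal.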
If the argument by_edge is set to TRUE, a list of two mixing matrices is returned: the first contains the traditional mixing matrix and the second contains the mixing matrix of edges that do not occur in the graph. The two matrices are named "edge_present" and "no_edge_present", respectively.
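One way to think about the second matrix is as the complement of the first: classify every possible pair of vertices by attribute value and subtract the pairs that are connected. The base R sketch below works under that assumption for a directed graph without loops; the toy data and the helper classify are hypothetical, and this is not necessarily how the package computes it.

```r
# Hypothetical toy data: 5 vertices with a "sex" attribute and 6 directed edges.
sex <- c("F", "M", "F", "M", "M")
edges <- data.frame(from = c(1, 1, 2, 3, 4, 5),
                    to   = c(2, 3, 3, 4, 5, 1))
lev <- sort(unique(sex))

# Hypothetical helper: cross-classify vertex pairs by sender and receiver value.
classify <- function(from, to) {
  unclass(table(from = factor(sex[from], levels = lev),
                to   = factor(sex[to],   levels = lev)))
}

edge_present <- classify(edges$from, edges$to)

# All ordered pairs of distinct vertices (loops excluded).
dyads <- expand.grid(from = seq_along(sex), to = seq_along(sex))
dyads <- dyads[dyads$from != dyads$to, ]

# Pairs without an edge = all pairs minus pairs with an edge.
no_edge_present <- classify(dyads$from, dyads$to) - edge_present

list(edge_present = edge_present, no_edge_present = no_edge_present)
```

Under this reading, the two grand totals add up to the number of ordered pairs of distinct vertices. The directed LakePomona example on this page shows the same pattern: 148 + 232 = 380, which is consistent with 20 × 19 ordered pairs.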
The mixing matrix includes row and column margins. Note that these can be somewhat misleading when the mixing matrix is constructed for an undirected graph, because every off-diagonal entry occurs twice. The reported overall sum of edges corrects for this and will therefore not equal the grand total one would expect from simply adding up the row margins or the column margins. For an undirected graph, the correct grand total equals the sum of the elements in the upper (or lower) triangle plus the sum of the diagonal.
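As a worked check of this rule, the cell counts of the undirected judge_net "color" example from this page can be entered by hand. The matrix below is typed in from that output (margins omitted), so nothing here calls the package itself.

```r
# Cell counts copied from the judge_net "color" example output (margins omitted).
lev <- c("lightskyblue", "pink", "<NA>")
mm <- matrix(c(21, 43, 2,
               43, 25, 3,
                2,  3, 0),
             nrow = 3, byrow = TRUE,
             dimnames = list(to = lev, from = lev))

# Adding up the row margins counts every off-diagonal edge twice.
naive_total <- sum(rowSums(mm))

# Correct total for an undirected graph: upper triangle plus diagonal.
grand_total <- sum(mm[upper.tri(mm)]) + sum(diag(mm))

naive_total  # 142
grand_total  # 94
```

The value 94 matches the overall Sum reported in that example, whereas the row margins (66, 71 and 5) add up to 142.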
The argument loops can be set to TRUE if edges from a vertex to itself should be included. By default, loops are included only if the graph itself already contains loops; otherwise, including them generally makes little sense.
The network package has a mixingmatrix function that works only on network objects, but it has specific functionality for bipartite networks.
data(emon, package = "network")
is_directed(emon$LakePomona) # TRUE
#> [1] TRUE
network::mixingmatrix(emon$LakePomona, "Sponsorship")
#>          To
#> From      City County Federal Private State Sum
#>   City       2      9       3       9    10  33
#>   County     3     10       4       9    13  39
#>   Federal    0      3       0       2     4   9
#>   Private    3     11       4       5    11  34
#>   State      2     13       4       7     7  33
#>   Sum       10     46      15      32    45 148
g <- emon$LakePomona
make_mixingmatrix(g, attrname = "Sponsorship")
#>          to
#> from      City County Federal Private State Sum
#>   City       2      9       3       9    10  33
#>   County     3     10       4       9    13  39
#>   Federal    0      3       0       2     4   9
#>   Private    3     11       4       5    11  34
#>   State      2     13       4       7     7  33
#>   Sum       10     46      15      32    45 148
make_mixingmatrix(g, attrname = "Sponsorship", by_edge = TRUE)
#> $edge_present
#>          to
#> from      City County Federal Private State Sum
#>   City       2      9       3       9    10  33
#>   County     3     10       4       9    13  39
#>   Federal    0      3       0       2     4   9
#>   Private    3     11       4       5    11  34
#>   State      2     13       4       7     7  33
#>   Sum       10     46      15      32    45 148
#>
#> $no_edge_present
#>          to
#> from      City County Federal Private State Sum
#>   City      10     11       5      11     6  43
#>   County    17     10       6      16     7  56
#>   Federal    8      7       2       8     4  29
#>   Private   17     14       6      15     9  61
#>   State     14      7       4      13     5  43
#>   Sum       66     49      23      63    31 232
#>
g <- snafun::to_igraph(emon$LakePomona)
make_mixingmatrix(g, attrname = "Sponsorship")
#>          to
#> from      City County Federal Private State Sum
#>   City       2      9       3       9    10  33
#>   County     3     10       4       9    13  39
#>   Federal    0      3       0       2     4   9
#>   Private    3     11       4       5    11  34
#>   State      2     13       4       7     7  33
#>   Sum       10     46      15      32    45 148
make_mixingmatrix(g, attrname = "Sponsorship", by_edge = TRUE)
#> $edge_present
#>          to
#> from      City County Federal Private State Sum
#>   City       2      9       3       9    10  33
#>   County     3     10       4       9    13  39
#>   Federal    0      3       0       2     4   9
#>   Private    3     11       4       5    11  34
#>   State      2     13       4       7     7  33
#>   Sum       10     46      15      32    45 148
#>
#> $no_edge_present
#>          to
#> from      City County Federal Private State Sum
#>   City      10     11       5      11     6  43
#>   County    17     10       6      16     7  56
#>   Federal    8      7       2       8     4  29
#>   Private   17     14       6      15     9  61
#>   State     14      7       4      13     5  43
#>   Sum       66     49      23      63    31 232
#>
data("judge_net", package = "snafun")
is_directed(judge_net) # FALSE
#> [1] FALSE
make_mixingmatrix(judge_net, attrname = "color")
#>
#> Note: Marginal totals can be misleading for undirected mixing matrices.
#>               from
#> to             lightskyblue pink <NA> Sum
#>   lightskyblue           21   43    2  66
#>   pink                   43   25    3  71
#>   <NA>                    2    3    0   5
#>   Sum                    66   71    5  94
make_mixingmatrix(judge_net, attrname = "JudgeSex")
#>
#> Note: Marginal totals can be misleading for undirected mixing matrices.
#>       from
#> to      F  M <NA> Sum
#>   F    25 43    3  71
#>   M    43 21    2  66
#>   <NA>  3  2    0   5
#>   Sum  71 66    5  94
g <- suppressWarnings(snafun::to_network(judge_net))
make_mixingmatrix(g, attrname = "color")
#>
#> Note: Marginal totals can be misleading for undirected mixing matrices.
#>               from
#> to             lightskyblue pink <NA> Sum
#>   lightskyblue           21   43    2  66
#>   pink                   43   25    3  71
#>   <NA>                    2    3    0   5
#>   Sum                    66   71    5  94
make_mixingmatrix(g, attrname = "JudgeSex")
#>
#> Note: Marginal totals can be misleading for undirected mixing matrices.
#>       from
#> to      F  M <NA> Sum
#>   F    25 43    3  71
#>   M    43 21    2  66
#>   <NA>  3  2    0   5
#>   Sum  71 66    5  94
make_mixingmatrix(judge_net, attrname = "color", by_edge = TRUE)
#>
#> Note: Marginal totals can be misleading for undirected mixing matrices.
#> $edge_present
#>               from
#> to             lightskyblue pink <NA> Sum
#>   lightskyblue           21   43    2  66
#>   pink                   43   25    3  71
#>   <NA>                    2    3    0   5
#>   Sum                    66   71    5  94
#>
#> $no_edge_present
#>               from
#> to             lightskyblue pink <NA> Sum
#>   lightskyblue           84  317   13 414
#>   pink                  317  251   21 589
#>   <NA>                   13   21    0  34
#>   Sum                   414  589   34 686
#>
make_mixingmatrix(judge_net, attrname = "JudgeSex", by_edge = TRUE)
#>
#> Note: Marginal totals can be misleading for undirected mixing matrices.
#> $edge_present
#>       from
#> to      F  M <NA> Sum
#>   F    25 43    3  71
#>   M    43 21    2  66
#>   <NA>  3  2    0   5
#>   Sum  71 66    5  94
#>
#> $no_edge_present
#>       from
#> to       F   M <NA> Sum
#>   F    251 317   21 589
#>   M    317  84   13 414
#>   <NA>  21  13    0  34
#>   Sum  589 414   34 686
#>