Construct a mixing matrix for a graph, based on a specific vertex attribute

make_mixingmatrix(x, attrname, by_edge = FALSE, loops = has_loops(x))

Arguments

x

graph of class network or igraph

attrname

character, name of the vertex attribute

by_edge

logical; if TRUE, two mixing matrices are returned: one for the edges that are present in the graph and one for the edges that are absent

loops

logical, are loops allowed? By default, this is TRUE if the graph itself already has at least one loop.

Value

a table (if by_edge is FALSE) or a list with two tables (if by_edge is TRUE)

Details

A network mixing matrix is, traditionally, a two-dimensional cross-classification of edges by the values of a specific vertex attribute. It is an important tool for assessing network homophily or segregation and is often very useful for the subsequent construction of explanatory statistical models of the network.

Each cell ($i$, $j$) in the mixing matrix reports the number of edges in the graph where the sender has value $i$ on the vertex attribute and the receiver has value $j$ on that vertex attribute. For an undirected graph, each edge is counted twice when $i \neq j$, since an undirected edge from MALE to FEMALE means that there is also an undirected edge from FEMALE to MALE.
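The counting rule above can be illustrated with a minimal base R sketch. This is not the implementation used by make_mixingmatrix; the attribute vector `sex` and the edge list `edges` are hypothetical toy data made up for this illustration.

```r
# Hypothetical toy data: 4 vertices with an attribute and 4 edges.
sex   <- c("F", "M", "F", "M")
edges <- rbind(c(1, 2), c(2, 3), c(3, 1), c(4, 2))  # columns: sender, receiver
lev   <- sort(unique(sex))

# Directed case: cross-classify each edge by sender and receiver value.
mm <- table(from = factor(sex[edges[, 1]], levels = lev),
            to   = factor(sex[edges[, 2]], levels = lev))
mm

# Undirected case: every edge is also read in the reverse direction, so an
# F-M edge appears in both cell (F, M) and cell (M, F). Within-group edges
# would then be counted twice on the diagonal, hence the division by 2.
both   <- rbind(edges, edges[, 2:1])
mm_und <- table(from = factor(sex[both[, 1]], levels = lev),
                to   = factor(sex[both[, 2]], levels = lev))
diag(mm_und) <- diag(mm_und) / 2
mm_und
```

In this toy graph the two F-M edges show up as 2 in both off-diagonal cells of `mm_und`, while each within-group edge is counted once on the diagonal.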

If the argument by_edge is set to TRUE, a list of two mixing matrices is returned: the first contains the traditional mixing matrix and the second contains the mixing matrix of edges that do not occur in the graph. The two matrices are named "edge_present" and "no_edge_present", respectively.
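One way to think about the matrix of absent edges is as a cross-classification of all ordered vertex pairs that are not connected. The base R sketch below follows that reading for a directed graph without loops; it is an assumption about the logic, not the package's own code, and the objects `sex`, `edges` and `adj` are hypothetical toy data. The reading is at least consistent with the directed example in the Examples section, where the two grand totals add up to 148 + 232 = 380, the number of ordered pairs one would expect for 20 vertices without loops.

```r
# Hypothetical toy data: 4 vertices, 4 directed edges, no loops.
sex   <- c("F", "M", "F", "M")
edges <- rbind(c(1, 2), c(2, 3), c(3, 1), c(4, 2))  # columns: sender, receiver
lev   <- sort(unique(sex))
n     <- length(sex)

# Adjacency matrix: adj[i, j] is 1 if there is an edge from i to j.
adj <- matrix(0, n, n)
adj[edges] <- 1

# Edges that are present, cross-classified by attribute value.
present <- table(from = factor(sex[edges[, 1]], levels = lev),
                 to   = factor(sex[edges[, 2]], levels = lev))

# Ordered pairs without an edge; the diagonal (loops) is left out.
absent <- adj == 0
diag(absent) <- FALSE
idx <- which(absent, arr.ind = TRUE)
no_edge <- table(from = factor(sex[idx[, 1]], levels = lev),
                 to   = factor(sex[idx[, 2]], levels = lev))

list(edge_present = present, no_edge_present = no_edge)
```

Together, the two tables account for all n * (n - 1) = 12 ordered pairs of the toy graph.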

The mixing matrix includes row and column margins. Note that these can be somewhat misleading when a mixing matrix is constructed for an undirected graph, as the off-diagonal entries of the mixing matrix occur twice. The overall sum of edges corrects for this and will therefore not equal the grand total one would expect by simply adding up the row margins or the column margins. In fact, for an undirected graph, the correct grand total equals the sum of the elements in the upper (or lower) triangle plus the summed diagonal.
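The difference between the two totals can be checked by hand. The sketch below copies the cell counts from the undirected judge_net "color" output in the Examples section into a plain matrix (the margins are left out and "<NA>" is used as an ordinary label); the object names are chosen for this illustration only.

```r
# Cell counts copied from the undirected judge_net "color" example.
lab <- c("lightskyblue", "pink", "<NA>")
m <- matrix(c(21, 43, 2,
              43, 25, 3,
               2,  3, 0),
            nrow = 3, byrow = TRUE,
            dimnames = list(to = lab, from = lab))

# Adding up the row margins counts every off-diagonal edge twice.
naive_total <- sum(rowSums(m))

# Upper triangle plus diagonal counts every edge exactly once.
correct_total <- sum(m[upper.tri(m, diag = TRUE)])

c(naive = naive_total, correct = correct_total)
```

Adding up the row margins gives 142, whereas the upper triangle plus the diagonal gives 94, the grand total reported in the example output.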

The argument loops can be set to TRUE if edges from a vertex to itself should be included. By default, loops are included only if the graph itself already contains at least one loop. For a graph without loops, including them generally makes little sense.

The network package has a mixingmatrix function that works only on network objects, but has specific functionality for bipartite networks.

Examples

data(emon, package = "network")
is_directed(emon$LakePomona)   # TRUE
#> [1] TRUE
network::mixingmatrix(emon$LakePomona, "Sponsorship")
#>          To
#> From      City County Federal Private State Sum
#>   City       2      9       3       9    10  33
#>   County     3     10       4       9    13  39
#>   Federal    0      3       0       2     4   9
#>   Private    3     11       4       5    11  34
#>   State      2     13       4       7     7  33
#>   Sum       10     46      15      32    45 148
g <- emon$LakePomona
make_mixingmatrix(g, attrname = "Sponsorship")
#>          to
#> from      City County Federal Private State Sum
#>   City       2      9       3       9    10  33
#>   County     3     10       4       9    13  39
#>   Federal    0      3       0       2     4   9
#>   Private    3     11       4       5    11  34
#>   State      2     13       4       7     7  33
#>   Sum       10     46      15      32    45 148
make_mixingmatrix(g, attrname = "Sponsorship", by_edge = TRUE)
#> $edge_present
#>          to
#> from      City County Federal Private State Sum
#>   City       2      9       3       9    10  33
#>   County     3     10       4       9    13  39
#>   Federal    0      3       0       2     4   9
#>   Private    3     11       4       5    11  34
#>   State      2     13       4       7     7  33
#>   Sum       10     46      15      32    45 148
#> 
#> $no_edge_present
#>          to
#> from      City County Federal Private State Sum
#>   City      10     11       5      11     6  43
#>   County    17     10       6      16     7  56
#>   Federal    8      7       2       8     4  29
#>   Private   17     14       6      15     9  61
#>   State     14      7       4      13     5  43
#>   Sum       66     49      23      63    31 232
#> 
g <- snafun::to_igraph(emon$LakePomona)
make_mixingmatrix(g, attrname = "Sponsorship")
#>          to
#> from      City County Federal Private State Sum
#>   City       2      9       3       9    10  33
#>   County     3     10       4       9    13  39
#>   Federal    0      3       0       2     4   9
#>   Private    3     11       4       5    11  34
#>   State      2     13       4       7     7  33
#>   Sum       10     46      15      32    45 148
make_mixingmatrix(g, attrname = "Sponsorship", by_edge = TRUE)
#> $edge_present
#>          to
#> from      City County Federal Private State Sum
#>   City       2      9       3       9    10  33
#>   County     3     10       4       9    13  39
#>   Federal    0      3       0       2     4   9
#>   Private    3     11       4       5    11  34
#>   State      2     13       4       7     7  33
#>   Sum       10     46      15      32    45 148
#> 
#> $no_edge_present
#>          to
#> from      City County Federal Private State Sum
#>   City      10     11       5      11     6  43
#>   County    17     10       6      16     7  56
#>   Federal    8      7       2       8     4  29
#>   Private   17     14       6      15     9  61
#>   State     14      7       4      13     5  43
#>   Sum       66     49      23      63    31 232
#> 

data("judge_net", package = "snafun")
is_directed(judge_net)   # FALSE
#> [1] FALSE
make_mixingmatrix(judge_net, attrname = "color")
#> 
#> Note:  Marginal totals can be misleading for undirected mixing matrices.
#>               from
#> to             lightskyblue pink <NA> Sum
#>   lightskyblue           21   43    2  66
#>   pink                   43   25    3  71
#>   <NA>                    2    3    0   5
#>   Sum                    66   71    5  94
make_mixingmatrix(judge_net, attrname = "JudgeSex")
#> 
#> Note:  Marginal totals can be misleading for undirected mixing matrices.
#>       from
#> to      F  M <NA> Sum
#>   F    25 43    3  71
#>   M    43 21    2  66
#>   <NA>  3  2    0   5
#>   Sum  71 66    5  94
g <- suppressWarnings(snafun::to_network(judge_net))
make_mixingmatrix(g, attrname = "color")
#> 
#> Note:  Marginal totals can be misleading for undirected mixing matrices.
#>               from
#> to             lightskyblue pink <NA> Sum
#>   lightskyblue           21   43    2  66
#>   pink                   43   25    3  71
#>   <NA>                    2    3    0   5
#>   Sum                    66   71    5  94
make_mixingmatrix(g, attrname = "JudgeSex")
#> 
#> Note:  Marginal totals can be misleading for undirected mixing matrices.
#>       from
#> to      F  M <NA> Sum
#>   F    25 43    3  71
#>   M    43 21    2  66
#>   <NA>  3  2    0   5
#>   Sum  71 66    5  94
make_mixingmatrix(judge_net, attrname = "color", by_edge = TRUE)
#> 
#> Note:  Marginal totals can be misleading for undirected mixing matrices.
#> $edge_present
#>               from
#> to             lightskyblue pink <NA> Sum
#>   lightskyblue           21   43    2  66
#>   pink                   43   25    3  71
#>   <NA>                    2    3    0   5
#>   Sum                    66   71    5  94
#> 
#> $no_edge_present
#>               from
#> to             lightskyblue pink <NA> Sum
#>   lightskyblue           84  317   13 414
#>   pink                  317  251   21 589
#>   <NA>                   13   21    0  34
#>   Sum                   414  589   34 686
#> 
make_mixingmatrix(judge_net, attrname = "JudgeSex", by_edge = TRUE)
#> 
#> Note:  Marginal totals can be misleading for undirected mixing matrices.
#> $edge_present
#>       from
#> to      F  M <NA> Sum
#>   F    25 43    3  71
#>   M    43 21    2  66
#>   <NA>  3  2    0   5
#>   Sum  71 66    5  94
#> 
#> $no_edge_present
#>       from
#> to       F   M <NA> Sum
#>   F    251 317   21 589
#>   M    317  84   13 414
#>   <NA>  21  13    0  34
#>   Sum  589 414   34 686
#>