This is the code used to produce the “schools of magic” dendrogram/heatmap plot that I posted on social media. As always, I’ll start by loading the packages:
library(dplyr)
library(tidyr)
library(readr)
library(tibble)
library(ggplot2)
library(stringr)
library(legendry)
The data wrangling for this one is slightly more elaborate than for the spell dice plot, because we need data suitable for the heatmap and also data suitable for producing the dendrograms on each axis. We start by loading the spells data:
spells <- read_csv("./data/spells.csv", show_col_types = FALSE)
To produce data for the heatmap, we select the relevant columns: i.e., those corresponding to the character classes, the school variable that denotes the school of magic for the spell, and the name variable because I like having an id column in my data. We then use pivot_longer() to arrange this data set in long form:
spells_long <- spells |>
  select(name, school, bard:wizard) |>
  pivot_longer(
    cols = bard:wizard,
    names_to = "class",
    values_to = "castable"
  )
print(spells_long)
#> # A tibble: 2,512 × 4
#> name school class castable
#> <chr> <chr> <chr> <lgl>
#> 1 Acid Splash evocation bard FALSE
#> 2 Acid Splash evocation cleric FALSE
#> 3 Acid Splash evocation druid FALSE
#> 4 Acid Splash evocation paladin FALSE
#> 5 Acid Splash evocation ranger FALSE
#> 6 Acid Splash evocation sorcerer TRUE
#> 7 Acid Splash evocation warlock FALSE
#> 8 Acid Splash evocation wizard TRUE
#> 9 Aid abjuration bard TRUE
#> 10 Aid abjuration cleric TRUE
#> # ℹ 2,502 more rows
Now we have a tidy data set with one row per “observation”, in the sense that it specifies whether a spell of a specific name (which belongs to a specific school) is in fact castable by members of a particular character class. We can summarise this by aggregating over the specific spells and counting the number of castable spells for each combination of magic school and character class:
dat <- spells_long |>
  summarise(
    count = sum(castable),
    .by = c("school", "class")
  ) |>
  mutate(
    school = str_to_title(school),
    class = str_to_title(class)
  )
print(dat)
#> # A tibble: 64 × 3
#> school class count
#> <chr> <chr> <int>
#> 1 Evocation Bard 7
#> 2 Evocation Cleric 12
#> 3 Evocation Druid 17
#> 4 Evocation Paladin 3
#> 5 Evocation Ranger 3
#> 6 Evocation Sorcerer 30
#> 7 Evocation Warlock 4
#> 8 Evocation Wizard 30
#> 9 Abjuration Bard 16
#> 10 Abjuration Cleric 33
#> # ℹ 54 more rows
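A small aside on why the summary uses sum(castable) rather than filtering and then counting. The sketch below uses a made-up toy data set (the spell names and values are hypothetical, not taken from spells.csv) to show that summing a logical column keeps school/class combinations with zero castable spells, whereas filter() followed by count() drops them. That matters here because the heatmap needs a tile for every cell, including the empty ones.

```r
library(dplyr)
library(tibble)

# Hypothetical toy data: two illusion spells, castable by bards only
toy <- tibble(
  name     = c("Spell A", "Spell A", "Spell B", "Spell B"),
  school   = c("illusion", "illusion", "illusion", "illusion"),
  class    = c("bard", "paladin", "bard", "paladin"),
  castable = c(TRUE, FALSE, TRUE, FALSE)
)

# sum() over a logical counts the TRUEs, so the zero-count cell survives
by_sum <- toy |>
  summarise(count = sum(castable), .by = c(school, class))

# filter() + count() silently drops the paladin/illusion combination
by_count <- toy |>
  filter(castable) |>
  count(school, class, name = "count")

print(by_sum)
print(by_count)
```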
This dat data frame is suitable for plotting as a heat map with geom_tile(), so let’s now move to stage two of the data wrangling.
The data structure that we need at this step is slightly more complicated, because what we want to display on each axis is a hierarchical clustering, of the sort typically produced by hclust(). In a distant, distant past I actually wrote my PhD thesis on clustering and scaling tools used to represent item (dis)similarities, and as such I’m acutely aware that these tools are extremely sensitive to the way you define similarity (or dissimilarity, or distance, or association, or whatever…). So I’ll be a little careful here, because if you do this in a thoughtless way you get stupid answers.
Let’s start by reorganising the dat data frame into a matrix form. The mat matrix below contains the exact same information as the data frame: each cell in the matrix represents the number of castable spells for a specific combination of class and school.
# Helper for compact printing: shortens row/column labels to 6 characters
# (including the trailing ".") and rounds values to 3 decimal places
print_truncated <- function(x) {
  if (inherits(x, "matrix")) {
    rownames(x) <- str_trunc(rownames(x), width = 6, ellipsis = ".")
    colnames(x) <- str_trunc(colnames(x), width = 6, ellipsis = ".")
  }
  if (inherits(x, "dist")) {
    attr(x, "Labels") <- str_trunc(
      attr(x, "Labels"),
      width = 6,
      ellipsis = "."
    )
  }
  print(round(x, digits = 3))
}
mat <- dat |>
  pivot_wider(
    names_from = "school",
    values_from = "count"
  ) |>
  as.data.frame()
rownames(mat) <- mat$class
mat$class <- NULL
mat <- as.matrix(mat)
print_truncated(mat)
#> Evoca. Abjur. Trans. Encha. Necro. Divin. Illus. Conju.
#> Bard 7 16 18 28 5 18 22 8
#> Cleric 12 33 13 8 14 17 1 11
#> Druid 17 17 33 9 7 14 2 21
#> Palad. 3 16 3 5 3 5 0 2
#> Ranger 3 11 13 3 1 9 1 7
#> Sorce. 30 7 33 13 9 8 14 19
#> Warlo. 4 8 6 12 10 9 11 9
#> Wizard 30 22 41 15 18 19 26 24
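As an aside, the tibble package has a column_to_rownames() helper that does the row-name shuffle in one step. The sketch below is an alternative rather than the code used for this post. It rebuilds a tiny two-class, two-school slice of dat from the printout above, and the toy_dat and toy_mat names are mine:

```r
library(tidyr)
library(tibble)

# A small slice of `dat`, with counts copied from the printout above
toy_dat <- tibble(
  school = c("Evocation", "Evocation", "Abjuration", "Abjuration"),
  class  = c("Bard", "Cleric", "Bard", "Cleric"),
  count  = c(7L, 12L, 16L, 33L)
)

# Pivot to wide form, move `class` into the row names, convert to a matrix
toy_mat <- toy_dat |>
  pivot_wider(names_from = "school", values_from = "count") |>
  column_to_rownames("class") |>
  as.matrix()

print(toy_mat)
```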
In this matrix we have a measure of “affinity”, in the sense that larger values indicate a higher affinity between a class and a school. The tricky part here is that some classes are simply better at spellwork than others: clerics and wizards can both cast lots of spells; paladins and rangers cannot cast many. The kind of similarity that I have in mind here is not the boring “clerics and wizards are similar because they can both cast lots of spells” kind. What I really want to say is something like “paladins and clerics are similar because abjuration is the strongest school for both classes”. The same applies when thinking about the schools of magic: there are lots of transmutation spells and lots of abjuration spells. That doesn’t really make those schools similar, not in the sense I care about.
What all this amounts to is an acknowledgement that we need to correct for overall prevalence, or – to frame it in probabilistic terms – to describe classes in terms of a distribution over schools and describe schools in terms of a distribution over classes. That gives us the following two matrices:
class_distro <- mat / replicate(ncol(mat), rowSums(mat))
school_distro <- t(mat) / (replicate(nrow(mat), colSums(mat)))
The class_distro matrix is the one that describes classes as a distribution over schools, and you can see in the printout here that when described in this fashion the paladin row and the cleric row do look rather similar to each other:
print_truncated(class_distro)
#> Evoca. Abjur. Trans. Encha. Necro. Divin. Illus. Conju.
#> Bard 0.057 0.131 0.148 0.230 0.041 0.148 0.180 0.066
#> Cleric 0.110 0.303 0.119 0.073 0.128 0.156 0.009 0.101
#> Druid 0.142 0.142 0.275 0.075 0.058 0.117 0.017 0.175
#> Palad. 0.081 0.432 0.081 0.135 0.081 0.135 0.000 0.054
#> Ranger 0.062 0.229 0.271 0.062 0.021 0.188 0.021 0.146
#> Sorce. 0.226 0.053 0.248 0.098 0.068 0.060 0.105 0.143
#> Warlo. 0.058 0.116 0.087 0.174 0.145 0.130 0.159 0.130
#> Wizard 0.154 0.113 0.210 0.077 0.092 0.097 0.133 0.123
A similar phenomenon is observed in the school_distro matrix, where you can see that the rows for abjuration and divination are quite similar despite the fact that there are a lot more abjuration spells than divination spells:
print_truncated(school_distro)
#> Bard Cleric Druid Palad. Ranger Sorce. Warlo. Wizard
#> Evoca. 0.066 0.113 0.160 0.028 0.028 0.283 0.038 0.283
#> Abjur. 0.123 0.254 0.131 0.123 0.085 0.054 0.062 0.169
#> Trans. 0.112 0.081 0.206 0.019 0.081 0.206 0.038 0.256
#> Encha. 0.301 0.086 0.097 0.054 0.032 0.140 0.129 0.161
#> Necro. 0.075 0.209 0.104 0.045 0.015 0.134 0.149 0.269
#> Divin. 0.182 0.172 0.141 0.051 0.091 0.081 0.091 0.192
#> Illus. 0.286 0.013 0.026 0.000 0.013 0.182 0.143 0.338
#> Conju. 0.079 0.109 0.208 0.020 0.069 0.188 0.089 0.238
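If the replicate() trick used to build these matrices looks a bit opaque, base R offers a couple of equivalent ways to write the same row normalisation. The sketch below is only a cross-check, not the code used for the post. It uses a three-class, three-school slice of the counts printed earlier (so the proportions differ from the full tables above), and the small, by_replicate, by_prop and by_sweep names are mine:

```r
# Three classes and three schools, with counts copied from the printed `mat`
small <- matrix(
  c( 7, 16, 18,
    12, 33, 13,
     3, 16,  3),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    c("Bard", "Cleric", "Paladin"),
    c("Evocation", "Abjuration", "Transmutation")
  )
)

# The replicate() approach: divide each cell by its row total
by_replicate <- small / replicate(ncol(small), rowSums(small))

# prop.table() with margin = 1 normalises each row to sum to 1
by_prop <- prop.table(small, margin = 1)

# sweep() makes the row-wise division explicit
by_sweep <- sweep(small, 1, rowSums(small), "/")

print(round(by_prop, 3))
```

All three give the same matrix. Using margin = 2 on the original counts gives the column-normalised version, which is the transpose of school_distro.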
We are now in a position to convert both of these to distance (dissimilarity) matrices. Notwithstanding the fact that it’s probably not the ideal way to describe similarity between distributions, I’ll call dist() using the default Euclidean distance measure. I mean, sure, I could probably do something fancy with Jensen-Shannon divergence here, but in my experience the metric you use to measure distributional similarity is far less important than the manner in which you construct the distributions from raw features in the first place, so I’m not going to sweat this one. Here’s our measure of class dissimilarity:
class_dissim <- dist(class_distro)
print_truncated(class_dissim)
#> Bard Cleric Druid Palad. Ranger Sorce. Warlo.
#> Cleric 0.309
#> Druid 0.296 0.251
#> Palad. 0.373 0.167 0.381
#> Ranger 0.294 0.213 0.146 0.313
#> Sorce. 0.286 0.342 0.168 0.468 0.292
#> Warlo. 0.151 0.270 0.288 0.371 0.312 0.279
#> Wizard 0.218 0.259 0.152 0.389 0.228 0.118 0.196
Here’s our measure of school dissimilarity:
school_dissim <- dist(school_distro)
print_truncated(school_dissim)
#> Evoca. Abjur. Trans. Encha. Necro. Divin. Illus.
#> Abjur. 0.320
#> Trans. 0.122 0.279
#> Encha. 0.323 0.284 0.270
#> Necro. 0.218 0.200 0.226 0.281
#> Divin. 0.271 0.133 0.203 0.181 0.179
#> Illus. 0.319 0.409 0.301 0.217 0.313 0.303
#> Conju. 0.134 0.251 0.073 0.273 0.178 0.184 0.319
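For the curious, here is roughly what the Jensen-Shannon option might look like. To be clear, this is a sketch of the road not taken and plays no part in the plot. The js_divergence() and js_dist() functions are my own illustrative helpers (not from any package), and the example uses only three classes, with counts copied from the printed mat above:

```r
# Jensen-Shannon divergence between two probability vectors (natural log).
# Entries where a == 0 contribute nothing, per the 0 * log(0) = 0 convention.
js_divergence <- function(p, q) {
  m <- (p + q) / 2
  kl <- function(a, b) {
    keep <- a > 0
    sum(a[keep] * log(a[keep] / b[keep]))
  }
  (kl(p, m) + kl(q, m)) / 2
}

# Pairwise Jensen-Shannon distance (square root of the divergence)
# between the rows of a matrix of distributions, returned as a "dist"
js_dist <- function(x) {
  n <- nrow(x)
  out <- matrix(0, n, n, dimnames = list(rownames(x), rownames(x)))
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      # max() guards against tiny negative values from rounding error
      out[i, j] <- sqrt(max(0, js_divergence(x[i, ], x[j, ])))
    }
  }
  as.dist(out)
}

# Counts for three classes across all eight schools
counts <- matrix(
  c(12, 33, 13,  8, 14, 17,  1, 11,
     3, 16,  3,  5,  3,  5,  0,  2,
    30,  7, 33, 13,  9,  8, 14, 19),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("Cleric", "Paladin", "Sorcerer"), NULL)
)

js_d <- js_dist(prop.table(counts, margin = 1))
print(round(js_d, 3))
```

Because the result is an ordinary dist object, it could be handed to hclust() in place of the Euclidean version.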
After all that effort in constructing the dissimilarity matrices, the hierarchical clustering is something of an anticlimax. The only substantive choice we need to make here is whether to use single-link, complete-link, average-link, or some other method for agglomeration. This does matter somewhat, at least in my experience, but I’m also feeling lazy so I’m going to go with average-link because it feels appropriate to me in this context:
clusters <- list(
  class = hclust(class_dissim, method = "average"),
  school = hclust(school_dissim, method = "average")
)
print(clusters)
#> $class
#>
#> Call:
#> hclust(d = class_dissim, method = "average")
#>
#> Cluster method : average
#> Distance : euclidean
#> Number of objects: 8
#>
#>
#> $school
#>
#> Call:
#> hclust(d = school_dissim, method = "average")
#>
#> Cluster method : average
#> Distance : euclidean
#> Number of objects: 8
#>
#>
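If you wanted something a little less lazy than “it feels appropriate”, one common heuristic is the cophenetic correlation: how well the distances implied by the tree match the original dissimilarities. The sketch below is not something I relied on for the plot. It rebuilds the class dissimilarities from the counts printed earlier (using prop.table() for the row normalisation), and the counts, methods and coph_cor names are mine:

```r
# Class-by-school counts, copied from the printed `mat` above
counts <- matrix(
  c( 7, 16, 18, 28,  5, 18, 22,  8,
    12, 33, 13,  8, 14, 17,  1, 11,
    17, 17, 33,  9,  7, 14,  2, 21,
     3, 16,  3,  5,  3,  5,  0,  2,
     3, 11, 13,  3,  1,  9,  1,  7,
    30,  7, 33, 13,  9,  8, 14, 19,
     4,  8,  6, 12, 10,  9, 11,  9,
    30, 22, 41, 15, 18, 19, 26, 24),
  nrow = 8, byrow = TRUE,
  dimnames = list(
    c("Bard", "Cleric", "Druid", "Paladin",
      "Ranger", "Sorcerer", "Warlock", "Wizard"),
    c("Evocation", "Abjuration", "Transmutation", "Enchantment",
      "Necromancy", "Divination", "Illusion", "Conjuration")
  )
)

# Row-normalise and take Euclidean distances between classes
class_dissim <- dist(prop.table(counts, margin = 1))

# Correlation between the original and tree-implied (cophenetic) distances
methods <- c("single", "complete", "average")
coph_cor <- sapply(methods, function(m) {
  hc <- hclust(class_dissim, method = m)
  cor(class_dissim, cophenetic(hc))
})

print(round(coph_cor, 3))
```

A higher value means the tree distorts the dissimilarities less, though it is only one criterion and says nothing about whether the clusters are interpretable.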
Constructing the plot can also be considered a two-part process. In the first stage, we construct a base plot object that uses geom_tile() to display the class/school affinities data (i.e., dat), and add various stylistic features to make it look pretty:
base <- ggplot(dat, aes(school, class, fill = count)) +
  geom_tile() +
  scale_fill_distiller(palette = "RdPu") +
  labs(
    x = "The Schools of Magic",
    y = "The Classes of Character",
    fill = "Number of Learnable Spells"
  ) +
  coord_equal() +
  theme(
    plot.background = element_rect(
      fill = "#222",
      color = "#222"
    ),
    plot.margin = unit(c(2, 2, 2, 2), units = "cm"),
    text = element_text(color = "#ccc", size = 14),
    axis.text = element_text(color = "#ccc"),
    axis.title = element_text(color = "#ccc"),
    axis.ticks = element_line(color = "#ccc"),
    legend.position = "bottom",
    legend.background = element_rect(
      fill = "#222",
      color = "#222"
    )
  )
plot(base)
In this form, though, you can’t really see which schools are similar to each other, nor can you see how the classes are related in terms of their spell-casting affinities. What we really want to do is reorder the rows and columns so that the most similar schools are placed in adjacent columns, and the most similar classes are placed in adjacent rows. Until recently I’d never found a tool for doing this in R that I found satisfying, but with the release of the legendry package by Teun van den Brand (which has a lot of tools for working with plot legends and axes that I’m slowly learning…) this has changed. If we pass a hierarchical clustering to the scale_*_dendro() functions, the rows/columns are reordered appropriately, and the dendrograms themselves are shown alongside the axes:
pic <- base +
  scale_x_dendro(
    clust = clusters$school,
    guide = guide_axis_dendro(n.dodge = 2),
    expand = expansion(0, 0),
    position = "top"
  ) +
  scale_y_dendro(
    clust = clusters$class,
    expand = expansion(0, 0)
  )
plot(pic)
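If you only want the reordering and not the dendrograms, the leaf order stored in an hclust object is enough to do it by hand. The sketch below is my own illustration of that idea and makes no claims about how legendry works internally. It uses a four-class subset of the counts printed earlier, and the mat4, leaf_order and classes names are mine:

```r
# Counts for four classes across all eight schools, copied from the printed `mat`
mat4 <- matrix(
  c(12, 33, 13,  8, 14, 17,  1, 11,
     3, 16,  3,  5,  3,  5,  0,  2,
    30,  7, 33, 13,  9,  8, 14, 19,
    30, 22, 41, 15, 18, 19, 26, 24),
  nrow = 4, byrow = TRUE,
  dimnames = list(
    c("Cleric", "Paladin", "Sorcerer", "Wizard"),
    c("Evocation", "Abjuration", "Transmutation", "Enchantment",
      "Necromancy", "Divination", "Illusion", "Conjuration")
  )
)

# Row-normalise, take Euclidean distances, cluster with average linkage
hc <- hclust(dist(prop.table(mat4, margin = 1)), method = "average")

# Leaf order of the dendrogram, expressed as class labels
leaf_order <- hc$labels[hc$order]

# Using that order as factor levels reorders a discrete ggplot2 axis
classes <- factor(rownames(mat4), levels = leaf_order)
print(levels(classes))
```

Mapping a factor like classes to an axis in place of the raw character column puts similar classes in adjacent rows, just without the tree drawn alongside.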
So much nicer!
To any D&D player, the plot is immediately interpretable: wizards and sorcerers are very similar spellcasting classes, and the spellcasting abilities of paladins are basically “clerics, but not very good at it”. The same dynamic is in play with regard to druids and rangers, in the sense that they’re both nature-focused spellcasters but rangers aren’t very good at it. The grouping of bards and warlocks surprised me a little, until it was pointed out to me that they both rely heavily on charisma in their spellcasting, so there is a kind of connection there.
On the schools side, the plot is similarly interpretable: enchantment and illusion are closely related schools, as are abjuration and divination. Necromancy feels a little bit like the darker cousin of abjuration so yeah, that tracks too. Transmutation, conjuration, and evocation are all kinda related, so you get a clustering there too.
There are some limitations to hierarchical clustering, of course, and you can see a little bit of that coming through in the plot. By design, I constructed the dissimilarities so that they’d ignore the “primary spellcaster vs secondary spellcaster” distinction, so the overall brightness of adjacent rows and columns varies wildly. But to capture that in a clustering solution while also capturing the “stylistic” similarities I’ve plotted here, you’d need to use an overlapping clustering tool rather than a hierarchical one. Those are inherently trickier to work with, and I wouldn’t be able to draw the pretty dendrograms either!