Exploring the Russia Report using R

The report by the UK’s Intelligence and Security Committee into Russian activity in the UK was finally released a few days ago.

Here’s my exploration of redactions in the report, using R. Some highlights below.

One of the best predictors of whether a sentence will have a redaction is what organisations are mentioned in the sentence:

 

According to a sentiment analysis, the angriest sentences are on page 11 (PDF page 18):

 

Here’s a word cloud of sentences with a redaction, against the organisation(s) mentioned…

 
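For readers who want a feel for how redactions might be related to the organisations mentioned, here is a minimal base R sketch. It is not the code behind the plots above. The sentences are invented for illustration, the list of organisations is an assumption, and so is the use of "***" as the redaction marker.

```r
# Toy sentences, invented for illustration; these are NOT quotes from the report
sentences <- c(
  "GCHQ told us that *** was a priority.",
  "The Committee thanks the witnesses.",
  "MI5 and SIS provided *** in evidence.",
  "The report was delayed.",
  "GCHQ is based in Cheltenham."
)

# Assumed list of organisation names to look for
orgs <- c("GCHQ", "MI5", "SIS")

# TRUE where a sentence contains the assumed redaction marker "***"
redacted <- grepl("***", sentences, fixed = TRUE)

# One logical column per organisation: is it mentioned in the sentence?
mentions <- sapply(orgs, function(o) grepl(paste0("\\b", o, "\\b"), sentences))

# Share of the sentences mentioning each organisation that contain a redaction
redaction_rate <- apply(mentions, 2, function(m) mean(redacted[m]))
redaction_rate
```

With real text, the logical columns in mentions could serve as predictors in a model of redacted.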

Choropleths in R – example using the 2020 Russian constitutional referendum


Choropleth maps use shading to represent quantities and are common in the press. I gave them a go in R, using the rvest package to scrape the results of the 2020 Russian constitutional referendum and the raster package piped through tidyverse tools to map them.

The code is on my GitHub repo.

Some of the fun I encountered along the way (details in the repo):

  • The CRAN version of raster didn’t work, but the latest on GitHub was fine and it’s easy to install this directly from R.
  • The names of Russian regions in the raster map of Russia didn’t always match those in the Wikipedia article. I tried fuzzy matching by edit distance, which did a pretty good job, but I still had to match some manually (e.g., “Sakha” and “Yakutia” are different names for the same place and a long edit distance from each other). I suspect it would have been easier just to sort both lists alphabetically and match manually from the start!
  • This warning is a worry: “support for gpclib will be withdrawn from maptools at the next major release” – I hope something comes along to replace it.
  • Lots of the examples of maps online are for the US and one basic problem is what projection to use. The mapproj package is fab for this.
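
The fuzzy matching by edit distance mentioned above can be sketched with base R's adist(). This is a minimal illustration rather than the code in the repo. The two name lists are hypothetical stand-ins for the map's region names and the scraped table, and the threshold for flagging a match for manual checking is an arbitrary assumption.

```r
# Hypothetical name lists, standing in for the map's regions and the scraped table
map_names  <- c("Adygey", "Buryat", "Sakha", "Tatarstan")
wiki_names <- c("Adygea", "Buryatia", "Yakutia", "Tatarstan")

# Matrix of Levenshtein edit distances: rows = map names, columns = Wikipedia names
d <- adist(map_names, wiki_names)

# For each map name, the index of the closest Wikipedia name
best <- apply(d, 1, which.min)

matches <- data.frame(
  map      = map_names,
  wiki     = wiki_names[best],
  distance = d[cbind(seq_along(map_names), best)],
  stringsAsFactors = FALSE
)

# Flag matches whose distance is large relative to the name length
# (the 0.5 cut-off is an arbitrary assumption) so they can be checked by hand
matches$check <- matches$distance > 0.5 * nchar(matches$map)
matches
```

Pairs like “Sakha” and “Yakutia” are the ones a threshold like this should send to manual review, since the nearest name by edit distance is not guaranteed to be the right one.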

Simple cluster bargraphs with error bars in R

Here you go. (Edited to add: These days you should use ggplot2.)

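# library() stops with an error if sciplot is missing; require() only warns
library(sciplot)
data(ToothGrowth)

# Clustered bar graph: mean growth by dose, grouped by supplement,
# with 95% confidence intervals taken from a one-sample t-test
bargraph.CI(dose, len, group = supp, data = ToothGrowth,
            xlab = "Dose", ylab = "Growth", x.leg = 1,
            col = c("white", "grey"), legend = TRUE,
            ci.fun = function(x) t.test(x)$conf.int)

# The same summary drawn as a line plot
lineplot.CI(dose, len, group = supp, data = ToothGrowth,
            xlab = "Dose", ylab = "Growth", x.leg = 1,
            legend = TRUE,
            ci.fun = function(x) t.test(x)$conf.int)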
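
Since ggplot2 is the recommendation these days, here is a rough equivalent of the bar graph using stat_summary(). It is a sketch, not a drop-in replacement. The helper mean_ci() is a name made up for this example, and it reuses the same t-test interval as the sciplot calls.

```r
library(ggplot2)

# Hypothetical helper: mean and 95% t-based confidence interval,
# returned in the y/ymin/ymax shape that stat_summary(fun.data = ...) expects
mean_ci <- function(x) {
  ci <- t.test(x)$conf.int
  data.frame(y = mean(x), ymin = ci[1], ymax = ci[2])
}

data(ToothGrowth)

p <- ggplot(ToothGrowth, aes(x = factor(dose), y = len, fill = supp)) +
  # Bars show the group means, dodged side by side within each dose
  stat_summary(fun = mean, geom = "bar", colour = "black",
               position = position_dodge(width = 0.9)) +
  # Error bars show the confidence interval around each mean
  stat_summary(fun.data = mean_ci, geom = "errorbar", width = 0.2,
               position = position_dodge(width = 0.9)) +
  scale_fill_manual(values = c("white", "grey")) +
  labs(x = "Dose", y = "Growth", fill = "Supplement")

print(p)
```

Swapping the "bar" layer for stat_summary() layers with geom = "line" and geom = "point" (plus group = supp in aes()) should give a counterpart to the line plot.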