Wordfish is a Poisson scaling model that estimates one-dimensional document positions (Slapin and Proksch 2008). Like Wordscores, it scales documents, but it does not require reference scores or reference texts. It is an unsupervised method: it estimates document positions solely from the observed word frequencies.
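To make the model concrete: following Slapin and Proksch (2008), the count of word j in document i is treated as Poisson distributed with log rate alpha_i + psi_j + beta_j * theta_i, where theta_i is the document position, alpha_i a document fixed effect, psi_j a word fixed effect, and beta_j the word's weight. The base R sketch below simulates counts from this model. All parameter values and object names are invented for illustration and are not part of quanteda.

```r
# Illustrative simulation of the Wordfish model; all values are made up.
set.seed(1)

theta <- c(-1.5, -0.5, 0.5, 1.5)            # document positions
alpha <- c(0.0, 0.2, -0.1, 0.3)             # document fixed effects
psi   <- c(3.0, 2.5, 2.0, 1.0, 1.0, 0.5)    # word fixed effects (overall frequency)
beta  <- c(0.0, 0.1, -0.1, 1.0, -1.0, 0.5)  # word weights (how strongly a word discriminates)

# log(lambda_ij) = alpha_i + psi_j + beta_j * theta_i
lambda <- exp(outer(alpha, psi, "+") + outer(theta, beta))

# Draw word counts: one row per document, one column per word
counts <- matrix(rpois(length(lambda), lambda),
                 nrow = length(theta),
                 dimnames = list(paste0("doc", seq_along(theta)),
                                 paste0("word", seq_along(psi))))
counts
```

In this toy setup, a word with a positive weight (word4) becomes more frequent as theta increases, while a word with weight zero (word1) carries no information about position. Estimation runs the other way: given only a count matrix, Wordfish recovers the parameters.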
require(quanteda)
In this example, we show how to apply Wordfish to the Irish budget speeches from 2010. First, we create a document-feature matrix (dfm); afterwards, we run Wordfish.
irish_dfm <- dfm(data_corpus_irishbudget2010, remove_punct = TRUE)
wf <- textmodel_wordfish(irish_dfm, dir = c(6,5))
summary(wf)
##
## Call:
## textmodel_wordfish.dfm(x = irish_dfm, dir = c(6, 5))
##
## Estimated Document Positions:
## theta se
## 2010_BUDGET_01_Brian_Lenihan_FF 1.79444 0.02010
## 2010_BUDGET_02_Richard_Bruton_FG -0.61774 0.02844
## 2010_BUDGET_03_Joan_Burton_LAB -1.14741 0.01561
## 2010_BUDGET_04_Arthur_Morgan_SF -0.08380 0.02900
## 2010_BUDGET_05_Brian_Cowen_FF 1.77416 0.02333
## 2010_BUDGET_06_Enda_Kenny_FG -0.75762 0.02642
## 2010_BUDGET_07_Kieran_ODonnell_FG -0.48645 0.04310
## 2010_BUDGET_08_Eamon_Gilmore_LAB -0.59455 0.02991
## 2010_BUDGET_09_Michael_Higgins_LAB -0.99302 0.04021
## 2010_BUDGET_10_Ruairi_Quinn_LAB -0.90657 0.04267
## 2010_BUDGET_11_John_Gormley_Green 1.18326 0.07235
## 2010_BUDGET_12_Eamon_Ryan_Green 0.17248 0.06336
## 2010_BUDGET_13_Ciaran_Cuffe_Green 0.72229 0.07269
## 2010_BUDGET_14_Caoimhghin_OCaolain_SF -0.05949 0.03873
##
## Estimated Feature Scores:
##         when      i presented    the supplementary  budget     to  this
## beta -0.1558 0.3217    0.3582 0.1945         1.077 0.03563 0.3097 0.249
## psi   1.6246 2.7253   -1.7925 5.3324        -1.128 2.71082 4.5208 3.462
##       house   last   april    said     we   could   work    our    way
## beta 0.1461 0.2416 -0.1554 -0.8301 0.4193 -0.6067 0.5262 0.6918 0.2772
## psi  1.0407 0.9874 -0.5725 -0.4533 3.5140  1.0865 1.1164 2.5301 1.4208
##      through  period     of severe economic distress  today    can  report
## beta  0.6125  0.4989 0.2792  1.229   0.4245    1.804 0.0922 0.3057  0.6257
## psi   1.1636 -0.1747 4.4675 -2.007   1.5741   -4.457 0.8399 1.5663 -0.2466
##        that notwithstanding difficulties   past
## beta 0.0192           1.804        1.176 0.4777
## psi  3.8389          -4.457       -1.352 0.9339
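The `dir = c(6, 5)` argument fixes the direction of the scale. According to the quanteda documentation, the estimated position of the first indexed document is constrained to lie below that of the second. The base R sketch below checks this against the summary above and adds approximate 95% intervals. The values are copied by hand from the output, the data frame and its column names are our own, and the normal approximation for the intervals is an assumption of the sketch.

```r
# Positions and standard errors copied by hand from the summary output above
pos <- data.frame(
  speaker = c("Lenihan", "Bruton", "Burton", "Morgan", "Cowen", "Kenny", "ODonnell",
              "Gilmore", "Higgins", "Quinn", "Gormley", "Ryan", "Cuffe", "OCaolain"),
  party   = c("FF", "FG", "LAB", "SF", "FF", "FG", "FG",
              "LAB", "LAB", "LAB", "Green", "Green", "Green", "SF"),
  theta   = c(1.79444, -0.61774, -1.14741, -0.08380, 1.77416, -0.75762, -0.48645,
              -0.59455, -0.99302, -0.90657, 1.18326, 0.17248, 0.72229, -0.05949),
  se      = c(0.02010, 0.02844, 0.01561, 0.02900, 0.02333, 0.02642, 0.04310,
              0.02991, 0.04021, 0.04267, 0.07235, 0.06336, 0.07269, 0.03873)
)

# dir = c(6, 5): document 6 (Kenny) should be placed below document 5 (Cowen)
pos$theta[6] < pos$theta[5]

# Approximate 95% intervals (normal approximation, an assumption of this sketch)
pos$lower <- pos$theta - 1.96 * pos$se
pos$upper <- pos$theta + 1.96 * pos$se

# Documents ordered from the lowest to the highest position
pos[order(pos$theta), c("speaker", "party", "theta", "lower", "upper")]
```

Only the ordering is pinned down by `dir`; swapping the two indices would mirror the scale without changing the fit.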
We can plot the results of a fitted scaling model using textplot_scale1d().
# create nicer labels for speakers
doclab <- paste(docvars(irish_dfm, "name"), docvars(irish_dfm, "party"))
textplot_scale1d(wf, doclabels = doclab)
The function can also plot scores by a grouping variable, in this case the party affiliation of the speakers.
textplot_scale1d(wf, doclabels = doclab, groups = docvars(irish_dfm, "party"))
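As a numeric complement to the grouped plot, the following base R sketch averages the estimated positions within each party. This is not what `textplot_scale1d()` computes; it is a hand-rolled summary using theta values copied from the summary output, with vector names of our own choosing.

```r
# theta values and party labels copied by hand from the summary output
theta <- c(1.79444, -0.61774, -1.14741, -0.08380, 1.77416, -0.75762, -0.48645,
           -0.59455, -0.99302, -0.90657, 1.18326, 0.17248, 0.72229, -0.05949)
party <- c("FF", "FG", "LAB", "SF", "FF", "FG", "FG",
           "LAB", "LAB", "LAB", "Green", "Green", "Green", "SF")

# Mean position per party, sorted from the lowest to the highest mean
party_means <- sort(tapply(theta, party, mean))
round(party_means, 3)
```

With these estimates, the two parties that were in government at the time (FF and Green) have positive means, and the opposition parties (LAB, FG, SF) have negative ones.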
Finally, we can plot the estimated word positions and highlight certain features.
textplot_scale1d(wf, margin = "features",
                 highlighted = c("government", "global", "children",
                                 "bank", "economy", "the", "citizenship",
                                 "productivity", "deficit"))
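To read the feature plot, it helps to look at a few estimated feature scores directly. As far as we can tell from the plot's axis labels, the horizontal axis shows beta (the word weight) and the vertical axis shows psi (the word fixed effect). The base R sketch below uses a hand-picked subset of the feature scores printed in the summary; the data frame and the choice of words are ours.

```r
# A subset of feature scores copied by hand from the summary output
feat <- data.frame(
  feature = c("the", "to", "of", "that", "we", "budget",
              "said", "april", "supplementary", "severe", "distress", "notwithstanding"),
  beta    = c(0.1945, 0.3097, 0.2792, 0.0192, 0.4193, 0.03563,
              -0.8301, -0.1554, 1.077, 1.229, 1.804, 1.804),
  psi     = c(5.3324, 4.5208, 4.4675, 3.8389, 3.5140, 2.71082,
              -0.4533, -0.5725, -1.128, -2.007, -4.457, -4.457)
)

# Frequent words first: high psi goes with beta close to zero
feat[order(-feat$psi), ]

# In this subset, the size of the weight shrinks as word frequency grows
cor(abs(feat$beta), feat$psi)
```

Very frequent function words such as "the" have a high psi and a beta near zero, so they sit at the top centre of the plot. Rare words such as "distress" have a low psi and a large beta, so they spread out toward the bottom edges.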
If you want to learn more about Wordfish, see:
Slapin, Jonathan and Sven-Oliver Proksch. 2008. “A Scaling Model for Estimating Time-Series Party Positions from Texts.” American Journal of Political Science 52(3): 705-722.