Install packages


quanteda runs solely on base R, but RStudio makes it easier to write your code and inspect your objects. You will need to have base R installed, and we also recommend installing the latest version of RStudio.


First, you need to have quanteda installed. You can do this from inside RStudio via Tools > Install Packages, or by running a command in the console.
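As a minimal sketch of the command route, assuming you want the released version from CRAN and have a working internet connection:

```r
# Install the released version of quanteda from CRAN.
install.packages("quanteda")
```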


If you are feeling adventurous, you can install the latest build of quanteda from its GitHub code page.
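One common way to do this is sketched below. It assumes the remotes package as the installer and "quanteda/quanteda" as the repository path; neither is stated in the text above, so check the GitHub page if the command fails.

```r
# Assumption: the 'remotes' package is used to install from GitHub.
install.packages("remotes")

# Assumption: the development version lives in the "quanteda/quanteda" repository.
remotes::install_github("quanteda/quanteda")
```

Installing from GitHub compiles the package from source, so the build tools for your platform need to be in place first.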

Note that on Windows, we also recommend installing the Rtools suite, and on macOS, installing Xcode from the App Store.

If you use the quanteda package in your research, please cite:
Benoit, Kenneth, Kohei Watanabe, Haiyan Wang, Paul Nulty, Adam Obeng, Stefan Müller, and Akitaka Matsuo. 2018. “quanteda: An R package for the quantitative analysis of textual data.” Journal of Open Source Software 3(30), 774.

Other packages

We will use the readtext package to read in different types of text data in these tutorials. Again, you can install it using the RStudio menu (Tools > Install Packages), or by executing the following command.
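A minimal sketch, assuming the CRAN release of readtext:

```r
# Install readtext from CRAN.
install.packages("readtext")
```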


We will also use extra datasets in tutorials that are available in quanteda.corpora.
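The sketch below assumes that quanteda.corpora is installed from GitHub with the remotes package, and that its repository path is "quanteda/quanteda.corpora". Both are assumptions not stated in the text above.

```r
# Assumption: 'remotes' is available (install.packages("remotes") if not).
# Assumption: the package is hosted at "quanteda/quanteda.corpora" on GitHub.
remotes::install_github("quanteda/quanteda.corpora")
```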


If you already have quanteda and the other packages installed, run Tools > Check for Package Updates to install the latest versions. We recommend updating all packages using update.packages() to avoid errors caused by outdated dependencies.

Extra packages

The tutorials do not cover syntactical analysis, but you should install spacyr for part-of-speech tagging, entity recognition, and dependency parsing. It provides an interface to the spaCy library and works well with quanteda. Note that you need to have Python installed to use the spacyr package. See the package description for more information.
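A minimal sketch of the installation, assuming the CRAN release of spacyr. The spacy_install() helper shown in the comment is one way to set up spaCy from within R; its behaviour differs between spacyr versions, so treat it as optional and consult the package documentation first.

```r
# Install the R interface from CRAN.
install.packages("spacyr")

# Optional, and an assumption about your setup: let spacyr install spaCy
# and a language model into its own Python environment.
# spacyr::spacy_install()
```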


Finally, we show how to use newsmap to classify documents based on “seed words” in dictionaries. You can download the package from CRAN.
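For instance, a minimal sketch of the CRAN installation:

```r
# Install newsmap from CRAN.
install.packages("newsmap")
```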


To sum up, you need to load the following packages to run all examples:
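The list below is a sketch that covers only the packages named in this section; the tutorials themselves may load further packages.

```r
# Core package and helpers used throughout the tutorials.
library(quanteda)
library(readtext)
library(quanteda.corpora)

# Extra packages: only needed for the sections that use them.
library(spacyr)
library(newsmap)
```

library() stops with an error if a package is missing, which makes an incomplete installation easy to spot.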


With quanteda_options() you can get or set global options that affect functions across quanteda. One very useful feature is changing the number of threads used in parallelised functions. By default, quanteda uses two threads, but depending on the number of CPU cores available on your machine, you can use more than two.

For instance, quanteda_options("threads" = 10) will use ten threads, which can substantially reduce the time needed to execute the parallelised functions.
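Rather than hard-coding a number, you could derive the thread count from the machine. The sketch below is one possible approach, not a recommendation from the quanteda authors: the variable names n_cores and n_threads are illustrative, and leaving one core free is an assumption about what keeps the machine responsive.

```r
# Count the logical cores with the 'parallel' package that ships with R.
n_cores <- parallel::detectCores()
if (is.na(n_cores)) n_cores <- 2L   # fall back to two if the count is unknown

# Assumption: keep one core free for the rest of the system.
n_threads <- max(1L, n_cores - 1L)

# Apply the setting only if quanteda is installed, then read it back.
if (requireNamespace("quanteda", quietly = TRUE)) {
  quanteda::quanteda_options(threads = n_threads)
  print(quanteda::quanteda_options("threads"))
}
```

Calling quanteda_options() with only an option name, as in the last line, returns the current value instead of changing it.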