require(quanteda)
require(readtext)
First, we will show you how to import pre-formatted files that come in a “spreadsheet format”. path_data is the location of the sample files on your computer that come with the readtext package.
path_data <- system.file("extdata/", package = "readtext")
If your text data is stored in a pre-formatted file where one column contains the text and additional columns store document-level variables (e.g. year, author, or language), you can use read.csv() to import it.
dat_inaug <- read.csv(paste0(path_data, "/csv/inaugCorpus.csv"))
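After importing, it is worth checking that the text column arrived as character data and that the document-level variables were kept. The following is a minimal sketch using only base R; the toy data frame, its column names (text, year, author), and the temporary file are hypothetical stand-ins for your own file, not part of the readtext sample data.

```r
# Hypothetical toy data standing in for a real spreadsheet-style file
toy <- data.frame(
  text = c("First example document.", "Second example document."),
  year = c(2001, 2005),
  author = c("A", "B"),
  stringsAsFactors = FALSE
)

# Write the toy data to a temporary CSV file, then read it back in
path_csv <- tempfile(fileext = ".csv")
write.csv(toy, path_csv, row.names = FALSE)
dat_toy <- read.csv(path_csv, stringsAsFactors = FALSE)

# Inspect the column types: the text column should be character
str(dat_toy)

# List the columns: one holds the text, the rest are document-level variables
names(dat_toy)
```

Setting stringsAsFactors = FALSE explicitly keeps the text column as character in older versions of R, where read.csv() converted strings to factors by default.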
Alternatively, you can use the readtext package to import comma- or tab-separated value files. readtext reads files containing text, along with any associated document-level variables. The text_field argument names the column that holds the text.
dat_dail <- readtext(paste0(path_data, "/tsv/dailsample.tsv"), text_field = "speech")
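To see what readtext() returns, you can inspect the imported object. This is a short sketch that assumes the readtext package and its sample files are installed; the variable name path_tsv is introduced here for illustration only.

```r
library(readtext)

# Locate the sample TSV file that ships with the readtext package
path_data <- system.file("extdata/", package = "readtext")
path_tsv <- paste0(path_data, "/tsv/dailsample.tsv")

# Import the file, using the "speech" column as the text
dat_dail <- readtext(path_tsv, text_field = "speech")

# The result is a data frame with a doc_id column, a text column,
# and the remaining columns kept as document-level variables
class(dat_dail)
names(dat_dail)
```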
The most common problems when loading data into R are misspecified locations of files or directories. If a path is relative, check where you are using getwd() and set the root directory of your project using setwd(). On Windows, you also have to replace every \ in a path with / (or escape it as \\).
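The checks described above can be sketched in base R as follows. The file and directory names are hypothetical and serve only to illustrate the calls.

```r
# Show the current working directory, against which relative paths are resolved
getwd()

# Hypothetical relative path, used only for illustration
path_rel <- file.path("data", "my_texts.csv")

# file.exists() returns FALSE when no file is found at that path
file.exists(path_rel)

# normalizePath() shows the absolute path that the relative path points to
normalizePath(path_rel, winslash = "/", mustWork = FALSE)

# A Windows-style path: backslashes must be doubled inside an R string ...
path_win <- "C:\\Users\\me\\data\\my_texts.csv"

# ... or replaced with forward slashes
path_fwd <- gsub("\\", "/", path_win, fixed = TRUE)
path_fwd
```

Building paths with file.path() instead of pasting strings together avoids separator mistakes, because R inserts the separator for you.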
If you have more than a few R files in a project, you should create an RStudio Project to better manage files and settings. You can create an RStudio project from the menu (File > New Project).