Each term is a unigram (a single word), except for common phrases, which are kept as single terms. Stop words and rare words were removed.

lcm_dfm2

Format

A `quanteda` `dfm` object

Source

See the `lcm_text_mining.Rmd` file in the `data-raw` directory of the [GitHub repo of this package](https://www.github.com/pachterlab/museumst) for how this matrix was generated, including which phrases and stop words were used and what counts as a rare term.
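
The kind of processing described here (phrases joined into single terms, stop words and rare terms dropped) can be sketched with `quanteda` roughly as follows. This is only an illustration on made-up text: the toy documents, the phrase `"laser capture microdissection"`, the English stop word list, and the frequency threshold of 2 are all assumptions, not the settings used in `lcm_text_mining.Rmd`.

```r
library(quanteda)

# Toy documents standing in for the real corpus (hypothetical text).
txt <- c(
  d1 = "Laser capture microdissection was used to isolate the cells.",
  d2 = "We used laser capture microdissection and RNA sequencing of the cells.",
  d3 = "RNA sequencing of the isolated tissue."
)

# Tokenize into lowercase unigrams, dropping punctuation.
toks <- tokens(txt, remove_punct = TRUE)
toks <- tokens_tolower(toks)

# Join an assumed common phrase into one term before removing stop words,
# so that the phrase is matched on the original word sequence.
toks <- tokens_compound(toks, pattern = phrase("laser capture microdissection"))

# Remove stop words (the built-in English list is an assumption here).
toks <- tokens_remove(toks, pattern = stopwords("en"))

# Build the document-feature matrix and drop rare terms.
# The threshold of 2 is arbitrary and chosen only for this toy example.
toy_dfm <- dfm(toks)
toy_dfm <- dfm_trim(toy_dfm, min_termfreq = 2)

# Inspect the remaining terms and their counts.
featnames(toy_dfm)
topfeatures(toy_dfm)
```

The same accessor functions, such as `featnames()`, `topfeatures()`, `ndoc()`, and `nfeat()`, should also work on `lcm_dfm2` itself, since it is a `dfm` object.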