Package: textmineR 3.0.6.999

textmineR: Functions for Text Mining and Topic Modeling

An aid for text mining in R, with a syntax that should be familiar to experienced R users. Provides wrappers for several topic models that take similarly formatted input and give similarly formatted output. Includes additional functions for analyzing and diagnosing topic models.

Authors: Tommy Jones [aut, cre], William Doane [ctb], Mattias Attbom [ctb]

# Install 'textmineR' in R:
install.packages('textmineR', repos = c('https://tommyjones.r-universe.dev', 'https://cloud.r-project.org'))
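
Once the package is installed, a quick smoke test can be sketched with the bundled nih_sample_dtm dataset and three of the exported functions (FitLdaModel, GetTopTerms, TermDocFreq). This is a minimal sketch, not an official example: the argument names and the phi/theta elements of the returned model are assumed from the CRAN 3.0.x documentation and may differ in this development build, and the values of k, iterations, and burnin are arbitrary illustrative choices.

```r
library(textmineR)

# Load the document term matrix that ships with the package
data(nih_sample_dtm)

# Gibbs sampling is stochastic; fix the seed for repeatable runs
set.seed(12345)

# Fit a small LDA model (k, iterations, burnin are illustrative values only)
model <- FitLdaModel(
  dtm = nih_sample_dtm,
  k = 5,
  iterations = 50,
  burnin = 25
)

# Assumed layout: phi has one row per topic, theta has one row per document
top_terms <- GetTopTerms(phi = model$phi, M = 5)
print(top_terms)

# Term frequencies and document frequencies for the same matrix
tf <- TermDocFreq(dtm = nih_sample_dtm)
head(tf)
```

A small k and a short chain keep the run fast; a real analysis would use more iterations and choose k deliberately.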

Bug tracker: https://github.com/tommyjones/textminer/issues

Pkgdown/docs site: https://www.rtextminer.com

Uses libs:
  • c++ - GNU Standard C++ Library v3
Datasets:
  • nih_sample - Abstracts and metadata from NIH research grants awarded in 2014
  • nih_sample_dtm - Abstracts and metadata from NIH research grants awarded in 2014
  • nih_sample_topic_model - Abstracts and metadata from NIH research grants awarded in 2014

11.34 score | 107 stars | 7 packages | 472 scripts | 1.5k downloads | 5 mentions | 33 exports | 28 dependencies

Last updated from: f4ae837460. Checks: 13 OK. Indexed: yes.

Target | Result | Time
linux-devel-arm64 | OK | 257
linux-devel-x86_64 | OK | 272
source / vignettes | OK | 422
linux-release-arm64 | OK | 253
linux-release-x86_64 | OK | 220
macos-release-arm64 | OK | 167
macos-release-x86_64 | OK | 298
macos-oldrel-arm64 | OK | 149
macos-oldrel-x86_64 | OK | 265
windows-devel | OK | 243
windows-release | OK | 224
windows-oldrel | OK | 224
wasm-release | OK | 193

Exports: CalcGamma, CalcHellingerDist, CalcJSDivergence, CalcLikelihood, CalcLikelihoodC, CalcProbCoherence, CalcSumSquares, CalcTopicModelR2, Cluster2TopicModel, CreateDtm, CreateTcm, dtm_to_lexicon_c, Dtm2Docs, Dtm2DocsC, Dtm2Lexicon, Dtm2Tcm, fit_lda_c, FitCtmModel, FitLdaModel, FitLsaModel, GetProbableTerms, GetTopTerms, Hellinger_cpp, HellingerMat, JSD_cpp, JSDmat, LabelTopics, posterior, predict_lda_c, SummarizeTopics, TermDocFreq, TmParallelApply, update

Dependencies: cli, data.table, digest, float, glue, gtools, ISOcodes, lattice, lgr, lifecycle, magrittr, Matrix, MatrixExtra, mlapi, R6, Rcpp, RcppArmadillo, RcppEigen, RcppProgress, RhpcBLASctl, rlang, rsparse, RSpectra, stopwords, stringi, stringr, text2vec, vctrs

Start here
Why textmineR? | Corpus management | Creating a DTM | Basic corpus statistics

Last update: 2025-10-17
Started: 2018-02-10

Document clustering
Document clustering

Last update: 2025-10-17
Started: 2018-02-10

Topic modeling
Topic modeling | LDA Example | LSA Example | Other topic models | Extensions | Document clustering is just a special topic model | Choosing the number of topics | Using topic models from other packages

Last update: 2025-10-17
Started: 2018-02-10

Text embeddings
Text embeddings | Create a term co-occurrence matrix | Fitting a model | Interpretation of $\Phi$ and $\Theta$ | Evaluating the model | Embedding documents under the model | Where to next?

Last update: 2025-10-17
Started: 2018-02-10

Document summarization
Getting started | Building a basic document summarizer | Pulling it all together

Last update: 2025-10-17
Started: 2018-02-10

Using tidytext with textmineR
Using tidytext with textmineR

Last update: 2025-09-23
Started: 2018-12-23

Readme and manuals

Help Manual

Help page | Topics
Calculate a matrix whose rows represent P(topic_i|tokens) | CalcGamma
Calculate Hellinger Distance | CalcHellingerDist
Calculate Jensen-Shannon Divergence | CalcJSDivergence
Calculate the log likelihood of a document term matrix given a topic model | CalcLikelihood
Probabilistic coherence of topics | CalcProbCoherence
Calculate the R-squared of a topic model. | CalcTopicModelR2
Represent a document clustering as a topic model | Cluster2TopicModel
Convert a character vector to a document term matrix. | CreateDtm
Convert a character vector to a term co-occurrence matrix. | CreateTcm
Convert a DTM to a Character Vector of documents | Dtm2Docs
Turn a document term matrix into a list for LDA Gibbs sampling | Dtm2Lexicon
Turn a document term matrix into a term co-occurrence matrix | Dtm2Tcm
Fit a Correlated Topic Model | FitCtmModel
Fit a Latent Dirichlet Allocation topic model | FitLdaModel
Fit a topic model using Latent Semantic Analysis | FitLsaModel
Get cluster labels using a "more probable" method of terms | GetProbableTerms
Get Top Terms for each topic from a topic model | GetTopTerms
Internal helper functions for 'textmineR' | CalcLikelihoodC, CalcSumSquares, Dtm2DocsC, dtm_to_lexicon_c, fit_lda_c, HellingerMat, Hellinger_cpp, JSDmat, JSD_cpp, predict_lda_c
Get some topic labels using a "more probable" method of terms | LabelTopics
Abstracts and metadata from NIH research grants awarded in 2014 | nih, nih_sample, nih_sample_dtm, nih_sample_topic_model
Posterior methods for topic models | posterior
Draw from the posterior of an LDA topic model | posterior.lda_topic_model
Predict method for Correlated topic models (CTM) | predict.ctm_topic_model
Get predictions from a Latent Dirichlet Allocation model | predict.lda_topic_model
Predict method for LSA topic models | predict.lsa_topic_model
Summarize topics in a topic model | SummarizeTopics
Get term frequencies and document frequencies from a document term matrix. | TermDocFreq
textmineR | textmineR-package, textmineR
An OS-independent parallel version of 'lapply' | TmParallelApply
Update methods for topic models | update
Update a Latent Dirichlet Allocation topic model with new data | update.lda_topic_model