Package: textmineR 3.0.5.999

textmineR: Functions for Text Mining and Topic Modeling

An aid for text mining in R, with a syntax that should be familiar to experienced R users. Provides a wrapper for several topic models that take similarly formatted input and give similarly formatted output. Also provides functions for analyzing and diagnosing topic models.

Authors: Tommy Jones [aut, cre], William Doane [ctb], Mattias Attbom [ctb]

textmineR_3.0.5.999.tar.gz
textmineR_3.0.5.999.zip (r-4.5), textmineR_3.0.5.999.zip (r-4.4), textmineR_3.0.5.999.zip (r-4.3)
textmineR_3.0.5.999.tgz (r-4.4-x86_64), textmineR_3.0.5.999.tgz (r-4.4-arm64), textmineR_3.0.5.999.tgz (r-4.3-x86_64), textmineR_3.0.5.999.tgz (r-4.3-arm64)
textmineR_3.0.5.999.tar.gz (r-4.5-noble), textmineR_3.0.5.999.tar.gz (r-4.4-noble)
textmineR_3.0.5.999.tgz (r-4.4-emscripten), textmineR_3.0.5.999.tgz (r-4.3-emscripten)
textmineR.pdf | textmineR.html
textmineR/json (API)
NEWS

# Install 'textmineR' in R:
install.packages('textmineR', repos = c('https://tommyjones.r-universe.dev', 'https://cloud.r-project.org'))
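
Once the package is installed, a typical session builds a document term matrix and fits a topic model to it. The following is a minimal sketch, not the package's canonical workflow: it assumes the bundled `nih_sample` data frame has the columns `ABSTRACT_TEXT` and `APPLICATION_ID`, and the values chosen for `k`, `iterations`, and `burnin` are illustrative only.

```r
library(textmineR)

# Bundled sample of NIH grant abstracts (listed under Datasets).
# Assumption: it has columns ABSTRACT_TEXT and APPLICATION_ID.
data(nih_sample)

# Build a sparse document term matrix (documents x terms) from the abstracts.
dtm <- CreateDtm(
  doc_vec = nih_sample$ABSTRACT_TEXT,
  doc_names = nih_sample$APPLICATION_ID,
  ngram_window = c(1, 1),  # unigrams only
  verbose = FALSE,
  cpus = 1
)

# Fit a small LDA model. k, iterations, and burnin are illustrative values,
# kept low so the example runs quickly; they are not tuned.
set.seed(1)
model <- FitLdaModel(
  dtm = dtm,
  k = 5,
  iterations = 50,
  burnin = 25
)

# model$phi is topics x terms; model$theta is documents x topics.
# GetTopTerms returns the M highest-probability terms for each topic.
top_terms <- GetTopTerms(phi = model$phi, M = 5)
print(top_terms)
```

The other model-fitting functions in the exports list (`FitCtmModel`, `FitLsaModel`) take a document term matrix in the same way, which is what the package description means by similarly formatted input and output.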

Bug tracker: https://github.com/tommyjones/textminer/issues

Pkgdown site: https://www.rtextminer.com

Uses libs:
  • c++ - GNU Standard C++ Library v3
Datasets:
  • nih_sample - Abstracts and metadata from NIH research grants awarded in 2014
  • nih_sample_dtm - Abstracts and metadata from NIH research grants awarded in 2014
  • nih_sample_topic_model - Abstracts and metadata from NIH research grants awarded in 2014

cpp

10.81 score, 106 stars, 7 packages, 310 scripts, 3.1k downloads, 5 mentions, 33 exports, 28 dependencies

Last updated 2 years ago from: 03b109d6e0. Checks: 1 OK, 8 NOTE. Indexed: yes.

Target | Result | Latest binary
Doc / Vignettes | OK | Jan 04 2025
R-4.5-win-x86_64 | NOTE | Jan 04 2025
R-4.5-linux-x86_64 | NOTE | Jan 04 2025
R-4.4-win-x86_64 | NOTE | Jan 04 2025
R-4.4-mac-x86_64 | NOTE | Jan 04 2025
R-4.4-mac-aarch64 | NOTE | Jan 04 2025
R-4.3-win-x86_64 | NOTE | Jan 04 2025
R-4.3-mac-x86_64 | NOTE | Jan 04 2025
R-4.3-mac-aarch64 | NOTE | Jan 04 2025

Exports: CalcGamma, CalcHellingerDist, CalcJSDivergence, CalcLikelihood, CalcLikelihoodC, CalcProbCoherence, CalcSumSquares, CalcTopicModelR2, Cluster2TopicModel, CreateDtm, CreateTcm, dtm_to_lexicon_c, Dtm2Docs, Dtm2DocsC, Dtm2Lexicon, Dtm2Tcm, fit_lda_c, FitCtmModel, FitLdaModel, FitLsaModel, GetProbableTerms, GetTopTerms, Hellinger_cpp, HellingerMat, JSD_cpp, JSDmat, LabelTopics, posterior, predict_lda_c, SummarizeTopics, TermDocFreq, TmParallelApply, update

Dependencies: cli, data.table, digest, float, glue, gtools, ISOcodes, lattice, lgr, lifecycle, magrittr, Matrix, MatrixExtra, mlapi, R6, Rcpp, RcppArmadillo, RcppEigen, RcppProgress, RhpcBLASctl, rlang, rsparse, RSpectra, stopwords, stringi, stringr, text2vec, vctrs

Start here

Rendered from a_start_here.Rmd using knitr::rmarkdown on Jan 04 2025.

Last update: 2021-06-27
Started: 2018-02-10

Document clustering

Rendered from b_document_clustering.Rmd using knitr::rmarkdown on Jan 04 2025.

Last update: 2021-06-27
Started: 2018-02-10

Topic modeling

Rendered from c_topic_modeling.Rmd using knitr::rmarkdown on Jan 04 2025.

Last update: 2022-05-11
Started: 2018-02-10

Text embeddings

Rendered from d_text_embeddings.Rmd using knitr::rmarkdown on Jan 04 2025.

Last update: 2021-06-27
Started: 2018-02-10

Document summarization

Rendered from e_doc_summarization.Rmd using knitr::rmarkdown on Jan 04 2025.

Last update: 2019-01-04
Started: 2018-02-10

Using tidytext with textmineR

Rendered from f_tidytext_example.Rmd using knitr::rmarkdown on Jan 04 2025.

Last update: 2019-01-09
Started: 2018-12-23

Readme and manuals

Help Manual

Help page | Topics
Calculate a matrix whose rows represent P(topic_i|tokens) | CalcGamma
Calculate Hellinger Distance | CalcHellingerDist
Calculate Jensen-Shannon Divergence | CalcJSDivergence
Calculate the log likelihood of a document term matrix given a topic model | CalcLikelihood
Probabilistic coherence of topics | CalcProbCoherence
Calculate the R-squared of a topic model. | CalcTopicModelR2
Represent a document clustering as a topic model | Cluster2TopicModel
Convert a character vector to a document term matrix. | CreateDtm
Convert a character vector to a term co-occurrence matrix. | CreateTcm
Convert a DTM to a Character Vector of documents | Dtm2Docs
Turn a document term matrix into a list for LDA Gibbs sampling | Dtm2Lexicon
Turn a document term matrix into a term co-occurrence matrix | Dtm2Tcm
Fit a Correlated Topic Model | FitCtmModel
Fit a Latent Dirichlet Allocation topic model | FitLdaModel
Fit a topic model using Latent Semantic Analysis | FitLsaModel
Get cluster labels using a "more probable" method of terms | GetProbableTerms
Get Top Terms for each topic from a topic model | GetTopTerms
Internal helper functions for 'textmineR' | CalcLikelihoodC CalcSumSquares Dtm2DocsC dtm_to_lexicon_c fit_lda_c HellingerMat Hellinger_cpp JSDmat JSD_cpp predict_lda_c
Get some topic labels using a "more probable" method of terms | LabelTopics
Abstracts and metadata from NIH research grants awarded in 2014 | nih nih_sample nih_sample_dtm nih_sample_topic_model
Posterior methods for topic models | posterior
Draw from the posterior of an LDA topic model | posterior.lda_topic_model
Predict method for Correlated topic models (CTM) | predict.ctm_topic_model
Get predictions from a Latent Dirichlet Allocation model | predict.lda_topic_model
Predict method for LSA topic models | predict.lsa_topic_model
Summarize topics in a topic model | SummarizeTopics
Get term frequencies and document frequencies from a document term matrix. | TermDocFreq
textmineR | textmineR
An OS-independent parallel version of 'lapply' | TmParallelApply
Update methods for topic models | update
Update a Latent Dirichlet Allocation topic model with new data | update.lda_topic_model