* textTopicsWordCloud().
* top_frequent = NULL and ngram_select = "estimate".

Performance & Robustness

* topicsGrams() speed-up: Rebuilt the n-gram and per-document frequency computation using a single sparse-matrix pass with quanteda, replacing the slow per-n-gram regex counting loop (a major runtime improvement on medium and large datasets); see the sketch after this list.
* Memory-safe output: freq_per_user now avoids accidental sparse → dense coercion (the “allocating GiB” warning). It supports automatic wide/long output, returning long format when wide would be too large.
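The speed-up above comes from counting all n-grams in one sparse document-feature matrix instead of looping over n-grams with regexes. A minimal sketch of that idea with quanteda, assuming illustrative data and variable names (this is not the package's internal code):

```r
library(quanteda)

texts <- c(doc1 = "a lack of time", doc2 = "time heals a lack of sleep")

# Tokenize once, then build 1-grams and 2-grams in a single pass.
toks <- tokens(texts, remove_punct = TRUE) |>
  tokens_tolower()

ngram_dfm <- tokens_ngrams(toks, n = 1:2, concatenator = " ") |>
  dfm()

total_freq   <- colSums(ngram_dfm)  # corpus-wide n-gram frequencies
freq_per_doc <- ngram_dfm           # per-document counts, kept sparse

head(sort(total_freq, decreasing = TRUE))
```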
Harmonization with topicsDtm()

* Aligned settings & reproducibility: topicsGrams() now mirrors topicsDtm() preprocessing controls (e.g., lower, punctuation/numbers removal, removalword, shuffle, seed, threads, optional stemming/lemmatization hook) and returns a saved settings list in the output.
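Because the two functions now share preprocessing controls, the same settings can in principle be reused for both calls. A hedged sketch, assuming the argument names listed above, an input argument called data, and a settings element in the returned list (all assumptions; exact signatures may differ):

```r
my_texts <- c("first example document", "second example document")

shared_settings <- list(
  lower       = TRUE,  # lowercase the text
  removalword = "",    # word(s) to strip before counting
  shuffle     = TRUE,  # shuffle documents
  seed        = 42,    # reproducibility
  threads     = 1      # number of threads
)

ngrams <- do.call(topicsGrams, c(list(data = my_texts), shared_settings))
dtm    <- do.call(topicsDtm,   c(list(data = my_texts), shared_settings))

# topicsGrams() also returns the settings it was run with (element name assumed):
# ngrams$settings
```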
* topicsTutorialData(): New utility function to download and prepare long-text essay data directly from Hugging Face. Supports custom sample_size, min_word_count, max_word_count, and seed.
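A hedged usage sketch of the tutorial-data helper; the four arguments come from the entry above, while the values and the assumption that a data frame of essays is returned are illustrative:

```r
essays <- topicsTutorialData(
  sample_size    = 500,   # number of texts to sample
  min_word_count = 100,   # drop very short essays
  max_word_count = 1000,  # drop very long essays
  seed           = 42     # reproducible sampling
)

dim(essays)
```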
* topicsPlotOverview(): Introduced a high-level plotting function for structured overviews. Supports side-by-side comparisons (ngrams), 1-D layouts, and 2-D 3x3 grids with a central distribution plot.
topicsTest()

* x_variable and y_variable now fully support factors and character vectors.
* test_method is now assigned per variable. The package automatically detects binary data (0/1 or 2-level factors) to apply logistic_regression, while using linear_regression for continuous data; see the sketch below.
* Added a logistic_level string in the output list to clarify the Baseline (0) vs. Target (1) mapping.
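The per-variable detection described above can be illustrated with a small helper; this is a sketch of the documented behaviour, not the package's internal code:

```r
pick_test_method <- function(x) {
  x <- x[!is.na(x)]
  is_binary <- (is.numeric(x) && all(x %in% c(0, 1))) ||
    ((is.factor(x) || is.character(x)) && length(unique(x)) == 2)
  if (is_binary) "logistic_regression" else "linear_regression"
}

pick_test_method(c(0, 1, 1, 0))                    # "logistic_regression"
pick_test_method(factor(c("low", "high", "low")))  # "logistic_regression"
pick_test_method(c(2.3, 5.1, 4.4))                 # "linear_regression"
```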
* topicsPreds() can now be accessed via the descriptive aliases topicsPredict(), topicsAssess(), and topicsClassify().
* topicsPlot() for better aesthetic consistency.
* text-package.
* topicsGrams() now uses exact word boundary matching for n-grams (e.g., “lack” is matched as a standalone word, excluding partial matches like “black” or “lacking”); see the example below.
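Word-boundary matching of this kind can be expressed with a \b regex; a generic base-R illustration of the behaviour described above (not the package's internal matching code):

```r
grepl("\\black\\b", c("a lack of sleep", "black coffee", "lacking detail"))
#> [1]  TRUE FALSE FALSE
```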
* topicsTest().
* creat_plot help function.
* rJava is now in Suggests to enable compatibility with the text-package.
* Added scatter_legend_dots_alpha and scatter_legend_bg_dots_alpha parameters for the topicsPlot() function.
* logistic_regression.
* Added occurance_rate to topicsGrams().
* Added removal_mode, removal_rate_most and removal_rate_least to topicsGrams().
* ngram_window = c(1) is now supported by topicsDtm().
* topicsPlot() with ngrams.
* The size in the dot legend will be based on prevalence if scatter_legend_dot_size = “prevalence”, and the popouts are not transparent.
* generate_scatter_plot.
* highlight_topic_words is set to NULL in the topicsPlot() function.
* Changes to topicsGrams(), including removing top_n and treating n-gram types differently.
* Added stopwords function to topicsGrams().
* pmi calculation; see the sketch below.
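PMI here refers to pointwise mutual information, commonly used to score how strongly the words of an n-gram co-occur. The package's exact PMI variant is not shown in the changelog, so the following is the standard bigram definition as a generic reference:

```r
# PMI(w1, w2) = log2( p(w1, w2) / (p(w1) * p(w2)) )
pmi_bigram <- function(count_w1w2, count_w1, count_w2, n_tokens) {
  p_w1w2 <- count_w1w2 / n_tokens
  p_w1   <- count_w1   / n_tokens
  p_w2   <- count_w2   / n_tokens
  log2(p_w1w2 / (p_w1 * p_w2))
}

# "new york" appearing 50 times in 10,000 tokens, "new" 120 times, "york" 60 times:
pmi_bigram(50, 120, 60, 10000)  # ~ 6.1, a strongly associated pair
```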
* Added the ngrams_max parameter in topicsPlot().
* Added allowed_word_overlap in topicsPlot() for plotting the most prevalent topics.
* Added the highlight_topic_words parameter to add different colours for a word list.
* Stopwords removal for topicsGrams().
* Added ngrams_max functionality to topicsPlot().
* Removed save_dir and load_dir from all functions; only topicsPlot() now has save_dir as an option.
* prevalence.
* Added p_adjust_method to topicsPlot().
* Added scatter_show_axis_values to topicsPlot().
* n_most_prevalent_topics.
* Default to linear_regression unless the variable only contains 0s and 1s; i.e., different tests can now be applied to different axes.
* The dtm is returned for downstream use in other functions.
* topicsPreds() function parameters, including num_iteration, sampling_interval and burn_in; see the hedged sketch below.
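A hedged sketch of the sampler settings listed above. Only num_iteration, sampling_interval and burn_in come from the changelog; the model and data argument names and all values are hypothetical:

```r
preds <- topicsPreds(
  model             = fitted_model,  # a previously fitted topic model (placeholder)
  data              = new_texts,     # new documents to score (placeholder)
  num_iteration     = 200,
  sampling_interval = 10,
  burn_in           = 50
)
```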
* Added create_new_dtm for creating a new dtm for new data.
* Topics dimension for training using textTrainRegression().
* topicsTest() incl. x_variable, y_variable and controls.
* Added pmi_threshold (experimental) to topicsDtm().
* Split procedure in topicsDtm().
* topicsDtm().
* Renamed p_threshold to p_alpha.
* Moved p_alpha from the topicsTest() function to the topicsPlot() function.
* topicsTest().
* text-package.
* topicsPlot().
* topicsTest().