align_lda_topics() estimates LDA (latent Dirichlet allocation) models for a range of topic numbers
and aligns the topics across models using the product and transport methods.
Usage
align_lda_topics(
ta,
k_range = 2:10,
seed = 42,
min_prevalence = 0.05,
taxon = taxon_id,
sample = sample_id,
...
)
Arguments
- ta
A tidytacos object.
- k_range
The range of topic numbers (k) for which to estimate models.
- seed
A seed for reproducibility.
- min_prevalence
The minimum proportion (0-1) of samples in which a taxon must be present for it to be used in model estimation.
- taxon
The column name in the taxa table with taxa identifiers.
- sample
The column name in the sample table with sample identifiers.
- ...
Arguments passed on to alto::run_lda_models
  - data
  (required) a matrix, data.frame or slam::simple_triplet_matrix containing the counts (integers) of each feature (e.g. words) and each sample (or document). If data is provided as matrix or data.frame, each row is a sample, each column is a feature.
  - lda_varying_params_lists
  (required) a list specifying the parameters for each model that needs to be run. Currently, supported parameters are "k" (the number of topics), "method" ("VEM" or "Gibbs"), and "control", a list of type LDAcontrol. See topicmodels::LDA for details and below for examples.
  - lda_fixed_params_list
  (optional) a list specifying the parameters common to all models to be fitted. Values provided by lda_fixed_params_list are overwritten by those provided by lda_varying_params_lists.
  - dir
  (optional) a character specifying the directory in which individual LDA models should be stored. If not specified, individual LDA models are not stored. This option is especially useful for data exploration, as it saves execution time when adding models to an existing model list. (see examples)
  - reset
  (optional, default = FALSE) Should any cached models in the save directory be cleared?
  - verbose
  (optional, default = FALSE) Print verbose output while running models?
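
The lda_varying_params_lists argument expects one parameter list per model. As a minimal sketch, a k_range such as 2:10 could be turned into that structure as shown here. The element names ("k2", "k3", ...) are an illustrative choice, and whether align_lda_topics() builds the list in exactly this way internally is an assumption.

```r
# Sketch: build one parameter list per topic number in k_range.
# The "k<number>" names are hypothetical labels chosen for readability.
k_range <- 2:10

lda_varying_params_lists <- setNames(
  lapply(k_range, function(k) list(k = k)),
  paste0("k", k_range)
)

# Inspect the first two entries: each is a list holding a single "k" value.
str(lda_varying_params_lists[1:2])
```

Each element could carry further per-model settings such as "method" or "control", while settings shared by all models belong in lda_fixed_params_list.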

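
The effect of min_prevalence can be illustrated in base R on a toy count matrix. This sketch assumes that prevalence means the fraction of samples with a non-zero count and that taxa at or above the threshold are kept; the exact comparison used by align_lda_topics() is not stated above, and the data are made up for illustration.

```r
# Toy count matrix (hypothetical data): rows are samples, columns are taxa.
counts <- matrix(
  c(5, 0, 0,
    3, 0, 1,
    0, 0, 2,
    7, 0, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("s", 1:4), paste0("t", 1:3))
)

min_prevalence <- 0.5

# Fraction of samples in which each taxon has a non-zero count.
prevalence <- colMeans(counts > 0)

# Assumption: taxa with prevalence >= min_prevalence are retained.
keep <- prevalence >= min_prevalence
filtered <- counts[, keep, drop = FALSE]

colnames(filtered)
```

With the default of 0.05, only taxa present in fewer than 5% of samples would be dropped under this assumption.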