Dimensionality reduction #2

This is great! How can one incorporate dimensionality reduction into the pipeline? For substantive and speed reasons, I'd like to exclude the most and least common words:

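import scattertext as st

# df and nlp are assumed to be defined earlier (a DataFrame and a parser callable)
corpus = st.CorpusFromPandas(df,
                             category_col='country',
                             text_col='text',
                             nlp=nlp,
                             # can we discard 1st and 99th percentile of words here?
                             ).build()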
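To make the question concrete, here is a rough sketch of the filtering I have in mind, written in plain Python so it does not depend on any particular scattertext hook. The helper name `terms_outside_percentiles` and the nearest-rank percentile rule are my own assumptions, not part of the library. The commented lines at the bottom show how the result might be applied after `.build()`, assuming `get_term_freq_df()` and `remove_terms()` behave as I understand them.

```python
import math


def terms_outside_percentiles(term_counts, lower_pct=1.0, upper_pct=99.0):
    """Return terms whose count is below the lower or above the upper percentile.

    term_counts maps each term to its total frequency across the corpus.
    Hypothetical helper for illustration; not a scattertext function.
    """
    counts = sorted(term_counts.values())
    if not counts:
        return []

    def percentile(p):
        # Nearest-rank percentile over the per-term counts.
        rank = max(1, math.ceil(p / 100 * len(counts)))
        return counts[rank - 1]

    low, high = percentile(lower_pct), percentile(upper_pct)
    # Strict comparisons: terms sitting exactly on a cutoff are kept.
    return sorted(t for t, c in term_counts.items() if c < low or c > high)


# Toy example with made-up counts.
toy_counts = {"the": 6, "cat": 2, "dog": 2, "bird": 1}
print(terms_outside_percentiles(toy_counts, lower_pct=50, upper_pct=75))

# Possible use with the corpus built above (assumption, untested here):
# totals = corpus.get_term_freq_df().sum(axis=1).to_dict()
# corpus = corpus.remove_terms(terms_outside_percentiles(totals))
```

One caveat with this approach: word frequencies are heavily skewed, so the 1st percentile count is usually 1 and the strict lower cutoff removes nothing. A fixed minimum count may be the more useful filter for rare words.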



