-
Notifications
You must be signed in to change notification settings - Fork 53
Open
Description
This is great! How can one incorporate dimensionality reduction into the pipeline? For substantive and speed reasons, I'd like to exclude the most and least common words:
corpus = st.CorpusFromPandas(df,
category_col='country',
text_col='text',
nlp=nlp,
# can we discard 1st and 99th percentile of words here?
).build()
Metadata
Metadata
Assignees
Labels
No labels