Step Functions - character to tokenlist

step_tokenize() tidy(<step_tokenize>)

Tokenization of character variables

Step Functions - tokenlist to character

step_untokenize() tidy(<step_untokenize>)

Untokenization of tokenlist variables
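
As an illustration of how the two steps above relate, here is a minimal sketch of a round trip from character to tokenlist and back. It assumes the recipes and textrecipes packages are installed; the data frame `docs` and its contents are made up for this example, and the default tokenizer settings are assumed.

```r
library(recipes)
library(textrecipes)

# Hypothetical toy data, invented for illustration only
docs <- data.frame(
  text = c("Hello, World!", "Tokenize this sentence."),
  stringsAsFactors = FALSE
)

rec <- recipe(~ text, data = docs) %>%
  step_tokenize(text) %>%            # character -> tokenlist
  step_untokenize(text, sep = " ")   # tokenlist -> text, tokens joined by sep

out <- bake(prep(rec), new_data = NULL)
out
```

The round trip is not lossless: whatever the tokenizer discards or normalizes is not restored when the tokens are pasted back together.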

Step Functions - tokenlist to tokenlist

step_stem() tidy(<step_stem>)

Stemming of tokenlist variables

step_stopwords() tidy(<step_stopwords>)

Filtering of stopwords from a tokenlist variable

step_lemma() tidy(<step_lemma>)

Lemmatization of tokenlist variables

step_tokenfilter() tidy(<step_tokenfilter>)

Filter the tokens based on term frequency

step_tokenmerge() tidy(<step_tokenmerge>)

Combine multiple tokenlist variables into one

step_pos_filter() tidy(<step_pos_filter>)

Part of speech filtering of tokenlist variables

step_ngram() tidy(<step_ngram>)

Generate ngrams from tokenlist

Step Functions - tokenlist to numeric

step_tfidf() tidy(<step_tfidf>)

Term frequency-inverse document frequency of tokens

step_tf() tidy(<step_tf>)

Term frequency of tokens

step_texthash() tidy(<step_texthash>)

Feature hashing of tokens

step_word_embeddings() tidy(<step_word_embeddings>)

Pretrained word embeddings of tokens
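
The categories in this index describe the input and output type of each step, so steps are normally chained: character to tokenlist, then tokenlist to tokenlist, then tokenlist to numeric. The following is a minimal sketch of such a chain, assuming the recipes and textrecipes packages are installed; the data frame `reviews` and the choice of `max_tokens = 5` are made up for illustration.

```r
library(recipes)
library(textrecipes)

# Hypothetical toy data, invented for illustration only
reviews <- data.frame(
  text = c("the cat sat on the mat",
           "the dog chased the cat",
           "a bird sang"),
  stringsAsFactors = FALSE
)

rec <- recipe(~ text, data = reviews) %>%
  step_tokenize(text) %>%                    # character -> tokenlist
  step_tokenfilter(text, max_tokens = 5) %>% # tokenlist -> tokenlist
  step_tfidf(text)                           # tokenlist -> numeric

prepped <- prep(rec)                         # estimate the steps on the training data
features <- bake(prepped, new_data = NULL)   # one numeric column per retained token

# Each step has a tidy() method; `number` selects the step by position
tidy(prepped, number = 2)
```

Swapping `step_tfidf()` for `step_tf()` or `step_texthash()` changes only the final numeric representation; the earlier steps stay the same.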

Step Functions - character to numeric

step_textfeature() tidy(<step_textfeature>)

Generate the basic set of text features

step_sequence_onehot() tidy(<step_sequence_onehot>)

Positional one-hot encoding of tokens

step_lda() tidy(<step_lda>)

Calculate LDA dimension estimates

Step Functions - character to character

step_text_normalization() tidy(<step_text_normalization>)

Normalization of character variables

Tokenlist functions


Create tokenlist object