Package index
step_tokenize() - Tokenization of Character Variables
step_tokenize_bpe() - BPE Tokenization of Character Variables
step_tokenize_sentencepiece() - SentencePiece Tokenization of Character Variables
step_tokenize_wordpiece() - WordPiece Tokenization of Character Variables
step_untokenize() - Untokenization of Token Variables
step_lemma() - Lemmatization of Token Variables
step_ngram() - Generate n-grams From Token Variables
step_pos_filter() - Part of Speech Filtering of Token Variables
step_stem() - Stemming of Token Variables
step_stopwords() - Filtering of Stop Words for Token Variables
step_tokenfilter() - Filter Tokens Based on Term Frequency
step_tokenmerge() - Combine Multiple Token Variables Into One
step_lda() - Calculate LDA Dimension Estimates of Tokens
step_texthash() - Feature Hashing of Tokens
step_tf() - Term Frequency of Tokens
step_tfidf() - Term Frequency-Inverse Document Frequency of Tokens
step_word_embeddings() - Pretrained Word Embeddings of Tokens
step_dummy_hash() - Indicator Variables via Feature Hashing
step_sequence_onehot() - Positional One-Hot Encoding of Tokens
step_textfeature() - Calculate Set of Text Features
step_text_normalization() - Normalization of Character Variables
step_clean_levels() - Clean Categorical Levels
step_clean_names() - Clean Variable Names
tokenlist() - Create Token Object
show_tokens() - Show Token Output of a Recipe
all_tokenized(), all_tokenized_predictors() - Role Selection
count_functions - List of All Feature-Counting Functions
emoji_samples - Sample Sentences With Emojis
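A typical workflow chains several of the steps above inside a recipes pipeline: tokenize a character column, filter the vocabulary, then weight the surviving tokens. The sketch below is illustrative only; the toy data frame and the `max_tokens` value are assumptions, and the step functions (`step_tokenize()`, `step_tokenfilter()`, `step_tfidf()`, `show_tokens()`) are used as listed in this index.

```r
library(recipes)
library(textrecipes)

# Toy data frame; in practice this would be your own character column.
df <- data.frame(
  text = c("I really love this package",
           "this package tokenizes text",
           "love text and tokens")
)

rec <- recipe(~ text, data = df) |>
  step_tokenize(text) |>                      # character -> token variable
  step_tokenfilter(text, max_tokens = 10) |>  # keep the most frequent tokens
  step_tfidf(text)                            # tokens -> tf-idf columns

# Inspect the token output of the recipe up to the tokenization step
show_tokens(rec, text)

# Estimate the steps on the training data and apply them
baked <- bake(prep(rec), new_data = NULL)
```

The resulting columns are named after the originating variable and token (e.g. `tfidf_text_love`), so they slot directly into downstream tidymodels workflows.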