Challenge: Stemming
Task
You are given some text in the text variable. Your task is to tokenize this text, remove the stop words, and apply stemming to the remaining tokens. To do this:
- Import the Porter Stemmer.
- Convert text to lowercase and save the result in text_lower.
- Tokenize the text_lower string and save the result in tokens.
- Load the English stop words, convert them to a set, and save it in stop_words.
- Filter out the stop words using a list comprehension and save the result in filtered_tokens.
- Create a Porter Stemmer and save it in stemmer.
- Stem the filtered tokens using a list comprehension and save the result in stemmed_tokens.
Solution