Challenge: Stop Words
Task
You are given some text in the text variable. Your task is to tokenize it and remove the stop words. To do this:
- Import the necessary components.
- Convert text to lowercase and save the result in text_lower.
- Load the English stop words list from nltk, convert it to a set, and save it in stop_words.
- Tokenize the text_lower string using the word_tokenize() function and save the result in tokens.
- Filter out the stop words from tokens using a list comprehension and save the result in tokens_clean.
Solution
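The steps in the task can be sketched as follows. This is one possible approach, not necessarily the course's reference answer. The helper names load_nltk_components and remove_stop_words are hypothetical and were introduced only for illustration; the variable names text_lower, stop_words, tokens, and tokens_clean come from the task. The sketch assumes the nltk package is installed and that its data files (the stop word list and the tokenizer models) can be downloaded on first use.

```python
def load_nltk_components():
    """Return (stop_words, word_tokenize) using NLTK.

    Hypothetical helper: assumes nltk is installed and that its data
    files can be downloaded on first use.
    """
    # Import the necessary components (lazily, so the filtering function
    # below can also be tried without NLTK installed)
    import nltk
    from nltk.corpus import stopwords
    from nltk.tokenize import word_tokenize

    # Fetch the stop word list and tokenizer models if they are missing;
    # newer NLTK releases use "punkt_tab", older ones use "punkt"
    for resource in ("stopwords", "punkt", "punkt_tab"):
        nltk.download(resource, quiet=True)

    # Load the English stop words and convert them to a set,
    # which gives fast membership checks during filtering
    stop_words = set(stopwords.words("english"))
    return stop_words, word_tokenize


def remove_stop_words(text, stop_words, tokenize):
    """Lowercase text, tokenize it, and drop every token in stop_words."""
    # Convert the text to lowercase
    text_lower = text.lower()
    # Tokenize the lowercased string
    tokens = tokenize(text_lower)
    # Filter out the stop words using a list comprehension
    tokens_clean = [token for token in tokens if token not in stop_words]
    return tokens_clean
```

With NLTK available, calling load_nltk_components() and passing its two results to remove_stop_words(text, stop_words, word_tokenize) yields the cleaned token list. Lowercasing before filtering matters because the NLTK stop word list is lowercase, so a capitalized "The" would otherwise slip through.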