Challenge: Clean Messy Reviews
You are given a list of customer review texts in the variable reviews.
The reviews may contain emojis, hashtags, repeated characters, noise words, punctuation, and informal expressions.
Your goal is to create a normalized version of each review using several NLP cleaning steps.
Follow these steps:
- Convert each review to lowercase.
- Remove emojis, hashtags, and mentions using a regular expression.
- Normalize repeated characters: any character repeated 3 or more times should be reduced to a single instance (loooove→love).
- Tokenize each review using nltk.word_tokenize().
- Remove stopwords using the provided stopwords list.
- Apply stemming to the remaining tokens using PorterStemmer.
- Store each cleaned review (joined back with spaces) in a list named cleaned_reviews.
Make sure the variable cleaned_reviews is declared and contains all normalized reviews in the correct order.
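The steps above can be sketched roughly as follows. This is one possible approach, not the reference solution: the regex patterns (in particular the emoji code point ranges) and the helper names pre_clean and clean_reviews are assumptions made for illustration, since the challenge does not specify them. The stopword list is passed in as an argument, and the tokenizer step needs NLTK's Punkt tokenizer data to be downloaded.

```python
import re

# Assumed patterns; the challenge does not say which regexes to use.
# Rough emoji coverage: pictographs, misc symbols/dingbats, flags, variation selector.
EMOJI_RE = re.compile(
    "[\U0001F300-\U0001FAFF\u2600-\u27BF\U0001F1E6-\U0001F1FF\uFE0F]"
)
TAG_RE = re.compile(r"[#@]\w+")      # hashtags (#word) and mentions (@word)
REPEAT_RE = re.compile(r"(.)\1{2,}")  # any character repeated 3 or more times


def pre_clean(text):
    """Lowercase, strip emojis/hashtags/mentions, collapse repeated characters."""
    text = text.lower()
    text = EMOJI_RE.sub(" ", text)
    text = TAG_RE.sub(" ", text)
    text = REPEAT_RE.sub(r"\1", text)  # keep a single instance of the run
    return " ".join(text.split())      # tidy leftover whitespace


def clean_reviews(reviews, stop_words):
    """Run the full pipeline and return the cleaned reviews in input order."""
    # Imported here so pre_clean stays usable without NLTK installed.
    from nltk import word_tokenize
    from nltk.stem import PorterStemmer

    stemmer = PorterStemmer()
    stop_set = set(stop_words)
    cleaned = []
    for review in reviews:
        tokens = word_tokenize(pre_clean(review))
        kept = [stemmer.stem(tok) for tok in tokens if tok not in stop_set]
        cleaned.append(" ".join(kept))
    return cleaned
```

With the variables from the task, the result would be assigned as cleaned_reviews = clean_reviews(reviews, stopwords), assuming the provided stopword list is available under the name stopwords. Punctuation tokens are left in place here because the listed steps do not ask for their removal.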