Challenge: Clean Messy Reviews
You are given a list of customer review texts in the variable reviews.
The reviews may contain emojis, hashtags, repeated characters, noise words, punctuation, and informal expressions.
Your goal is to create a normalized version of each review using several NLP cleaning steps.
Follow these steps:
- Convert each review to lowercase.
- Remove emojis, hashtags, and mentions using a regular expression.
- Normalize repeated characters: any character repeated 3 or more times should be reduced to a single instance (coooool→cool).
- Tokenize each review using nltk.word_tokenize().
- Remove stopwords using the provided stopwords list.
- Apply stemming to the remaining tokens using PorterStemmer.
- Store each cleaned review (joined back with spaces) in a list named cleaned_reviews.
Make sure the variable cleaned_reviews is declared and contains all normalized reviews in the correct order.
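The steps above can be sketched roughly as follows. This is a minimal illustration, not the challenge's reference solution: the regular expressions and the helper names pre_clean and clean_review are assumptions, and the emoji pattern covers only the most common Unicode blocks. The tokenizer and stemmer are passed in as arguments so the regex steps can be tried without NLTK installed. The sketch follows the written rule literally, so a run of three or more identical characters collapses to one and coooool becomes col; replacing r"\1" with r"\1\1" would keep two characters and give cool, as in the example.

```python
import re

# Assumed patterns; the challenge does not prescribe specific expressions.
TAG_OR_MENTION = re.compile(r"[#@]\w+")  # "#hashtag" or "@mention"
# Rough emoji coverage: pictograph blocks, misc symbols and dingbats,
# regional-indicator flags, plus the variation selector and zero-width joiner.
EMOJI = re.compile(
    "[\U0001F300-\U0001FAFF\U00002600-\U000027BF\U0001F1E6-\U0001F1FF\uFE0F\u200D]+"
)
REPEATS = re.compile(r"(.)\1{2,}")  # the same character 3+ times in a row


def pre_clean(text):
    """Lowercase, drop hashtags/mentions/emojis, collapse repeated characters."""
    text = text.lower()
    text = TAG_OR_MENTION.sub(" ", text)
    text = EMOJI.sub(" ", text)
    return REPEATS.sub(r"\1", text)  # keep a single instance of each long run


def clean_review(text, stop_words, tokenize, stem):
    """Tokenize the pre-cleaned text, drop stopwords, stem, and re-join."""
    tokens = tokenize(pre_clean(text))
    return " ".join(stem(t) for t in tokens if t not in stop_words)


# With NLTK installed and its tokenizer data downloaded, the pieces would be
# wired together like this (reviews and stopwords come from the challenge):
#   from nltk import word_tokenize
#   from nltk.stem import PorterStemmer
#   stemmer = PorterStemmer()
#   cleaned_reviews = [
#       clean_review(r, stopwords, word_tokenize, stemmer.stem) for r in reviews
#   ]
```

Building cleaned_reviews with a list comprehension over reviews keeps the results in the original order, which the task requires.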