Challenge: Tokenization with Regex | Text Preprocessing Fundamentals
Introduction to NLP

Challenge: Tokenization with Regex

Task

You are given a message in the message variable. Tokenize it into words using a regular expression. To do this:

  1. Import the necessary class.
  2. Convert message to lowercase and save the result in message_lower.
  3. Create a Regexp Tokenizer with the correct pattern and save it in word_tokenizer.
  4. Tokenize message_lower into words using word_tokenizer.
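
The four steps above can be sketched roughly as follows. This is a minimal illustration, not the official solution: it assumes the class in question is NLTK's RegexpTokenizer, the sample message is hypothetical, and the small fallback class is only an assumed stand-in so the sketch still runs where NLTK is not installed.

```python
import re

# Step 1: import the tokenizer class (assumed to be NLTK's RegexpTokenizer).
try:
    from nltk.tokenize import RegexpTokenizer
except ImportError:
    # Hypothetical stand-in with the same two calls used below,
    # so the sketch runs even without NLTK.
    class RegexpTokenizer:
        def __init__(self, pattern):
            self._regex = re.compile(pattern)

        def tokenize(self, text):
            return self._regex.findall(text)

# Hypothetical input; in the challenge, message is already provided.
message = "Join us at #NLPConference_20! Tickets cost $99."

# Step 2: convert the message to lowercase.
message_lower = message.lower()

# Step 3: create a tokenizer whose pattern matches runs of word characters.
word_tokenizer = RegexpTokenizer(r"\w+")

# Step 4: tokenize the lowercased message into words.
words = word_tokenizer.tokenize(message_lower)
print(words)
```

Lowercasing before tokenizing means that tokens such as "Join" and "join" end up identical, which is usually what later preprocessing steps expect.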

A word is a sequence of alphanumeric characters and underscores. '#NLPConference_20!', for example, contains one word: NLPConference_20.
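
To see why this definition of a word lines up with a regex pattern, the example string can be checked with Python's built-in re module. This is only a sketch: the names WORD_PATTERN and ASCII_WORD_PATTERN are illustrative, and the choice of `\w+` as the pattern is an assumption based on the definition above.

```python
import re

# \w matches a letter, a digit, or an underscore; + takes the longest run.
WORD_PATTERN = re.compile(r"\w+")

# An explicit ASCII-only character class, shown for comparison.
ASCII_WORD_PATTERN = re.compile(r"[A-Za-z0-9_]+")

example = "#NLPConference_20!"

# '#' and '!' are not word characters, so they are left out of the match.
tokens = WORD_PATTERN.findall(example)
ascii_tokens = ASCII_WORD_PATTERN.findall(example)
print(tokens)
```

For this input both patterns return the single word NLPConference_20. On text with non-ASCII letters they can differ, because `\w` in Python 3 string patterns also matches Unicode letters and digits.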

Section 1. Chapter 6