Learn Challenge: Tokenization with Regex | Text Preprocessing Fundamentals
Introduction to NLP

Challenge: Tokenization with Regex

Task

You are given a message in the message variable. Tokenize it into words using a regular expression. To do this:

  1. Import the necessary class.
  2. Convert message to lowercase and save the result in message_lower.
  3. Create a Regexp Tokenizer with the correct pattern and save it in word_tokenizer.
  4. Tokenize message_lower into words using word_tokenizer.

A word is a sequence of alphanumeric characters and underscores. For example, '#NLPConference_20!' contains one word: NLPConference_20.
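
The word definition above corresponds to the regex character class \w repeated one or more times. The following is a minimal sketch using only Python's standard re module, not the tokenizer class the task asks for; the names WORD_PATTERN and sample are chosen here for illustration.

```python
import re

# \w matches a letter, digit, or underscore; + requires one or more in a row.
# Note: for str patterns in Python 3, \w also matches non-ASCII letters and digits.
WORD_PATTERN = re.compile(r"\w+")

# The example string from the task description.
sample = "#NLPConference_20!"

# findall returns every non-overlapping match, left to right.
words = WORD_PATTERN.findall(sample)
print(words)  # ['NLPConference_20']
```

The leading '#' and trailing '!' are not word characters, so they act as boundaries and are dropped from the result.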

Solution
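
The page does not show the solution itself. One possible solution is sketched below, assuming that the "Regexp Tokenizer" in the task refers to NLTK's RegexpTokenizer class and that \w+ is the intended pattern. The message text is a hypothetical stand-in, since the real value is supplied by the exercise. The except branch is not part of the task; it is a small stand-in so the sketch still runs where NLTK is not installed.

```python
import re

# Step 1: import the tokenizer class (assumed to be NLTK's RegexpTokenizer).
try:
    from nltk.tokenize import RegexpTokenizer
except ImportError:
    # Stand-in for environments without NLTK: same constructor argument
    # and tokenize() method, covering only what this sketch needs.
    class RegexpTokenizer:
        def __init__(self, pattern):
            self._regex = re.compile(pattern)

        def tokenize(self, text):
            return self._regex.findall(text)

# Hypothetical message; the exercise provides its own.
message = "Join us at #NLPConference_20! It's going to be great."

# Step 2: lowercase the message.
message_lower = message.lower()

# Step 3: create the tokenizer; \w+ matches runs of letters, digits, and underscores.
word_tokenizer = RegexpTokenizer(r"\w+")

# Step 4: tokenize the lowercased message into words.
words = word_tokenizer.tokenize(message_lower)
print(words)
# ['join', 'us', 'at', 'nlpconference_20', 'it', 's', 'going', 'to', 'be', 'great']
```

With this pattern the apostrophe in "it's" is not a word character, so the contraction is split into 'it' and 's'.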


Section 1. Chapter 6