Challenge: Tokenizing Using Regex | Text Preprocessing Fundamentals
Introduction to NLP

Course Content

1. Text Preprocessing Fundamentals
2. Stemming and Lemmatization
3. Basic Text Models
4. Word Embeddings

Challenge: Tokenizing Using Regex

Task

Given a string named message, convert it to lowercase, then tokenize it into words using regular expression tokenization and the corresponding nltk class. A word is a sequence of alphanumeric characters only (letters and digits). For example, '#Conference2023!' contains one word: Conference2023.
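
One possible approach is sketched below. It is not the course's reference solution. It assumes that "the corresponding nltk class" is RegexpTokenizer from nltk.tokenize, and that "alphanumeric" means ASCII letters and digits, so the pattern [a-z0-9]+ is used rather than \w+ (which would also match underscores). The helper name tokenize_words is hypothetical and introduced only for illustration.

```python
from nltk.tokenize import RegexpTokenizer


def tokenize_words(message):
    """Lowercase the message and return its alphanumeric tokens.

    Hypothetical helper, not part of the original task statement.
    """
    # Assumption: words consist of ASCII letters and digits only.
    # The text is lowercased first, so a-z is enough for the letters.
    tokenizer = RegexpTokenizer(r'[a-z0-9]+')
    return tokenizer.tokenize(message.lower())


# Sample value taken from the task description.
message = '#Conference2023!'
tokens = tokenize_words(message)
print(tokens)  # ['conference2023']
```

Because the pattern describes the tokens themselves rather than the gaps between them, punctuation such as '#' and '!' is simply never matched and so never appears in the result.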


Section 1. Chapter 6