Challenge: Creating Word Embeddings
Task
You have a text corpus stored in the corpus variable. Your task is to train a Word2Vec model to generate word embeddings for this corpus. To do this:
- Import the class for creating a Word2Vec model.
- Tokenize each sentence in the 'Document' column of corpus by splitting it into words on whitespace. Store the result in the sentences variable.
- Initialize the Word2Vec model by passing sentences as the first argument and setting the following parameters:
  - embedding size: 50;
  - context window size: 2;
  - minimum frequency of words to include in the model: 1;
  - model: skip-gram.
- Print the top-3 most similar words to the word 'bowl'.
Solution