I was going through the word2vec materials from Andrew Ng's course, and below is what I understood.
- A matrix of shape embedding_size*number_of_unique_words is created and populated with random decimal values
- For each word, a one-hot encoded array of size number_of_unique_words*1 is created.
- We also pull the embedding_size*1 column for that word from the embedding matrix created in the first step.
- The one-hot array is multiplied, through matrix multiplication, with the embedding matrix created in the previous steps. This gives an array of shape embedding_size*1.
- For each target word, the final embedding created in step 2 is extracted for the context words and fed into the hidden layer, and in the output layer we predict the word.
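
To make the steps above concrete, here is a minimal NumPy sketch of the lookup as I understand it. The four-word vocabulary, the embedding size of 3, and all variable names are made up for illustration; they are not from the course.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setup, kept tiny so the shapes are easy to read.
vocab = ["i", "am", "happy", "today"]
embedding_size = 3
num_words = len(vocab)

# Embedding matrix of shape (embedding_size, number_of_unique_words),
# filled with random values before any training happens.
E = rng.normal(size=(embedding_size, num_words))

# One-hot column vector of shape (number_of_unique_words, 1) for "happy".
word_index = vocab.index("happy")
one_hot = np.zeros((num_words, 1))
one_hot[word_index, 0] = 1.0

# Matrix multiplication gives an array of shape (embedding_size, 1).
embedding = E @ one_hot

# The product is exactly the column of E that belongs to "happy".
assert embedding.shape == (embedding_size, 1)
assert np.allclose(embedding[:, 0], E[:, word_index])
```

This only covers the forward lookup: multiplying by a one-hot array just selects one column of the matrix. It does not show how the values in that column change during training.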
The challenge I am facing is that I cannot figure out how the embedding matrix is updated. For example, if there are 10000 unique words and the embedding size is 200, then after the training process this embedding matrix will have some decimal value in each cell. So for a particular word, let's say happy, we get an array of size 200 holding some decimal values.
So my question is: how is this matrix created?