How does Word2Vec work?



Hi Friends,

I am trying to understand the mechanism behind Word2Vec for word embeddings. I have gone through the following links:

But I still don't have a clear idea of what's behind Word2Vec or how it works.

I understand the following steps:

  1. It takes sentences and splits them into words.
  2. It builds a vocabulary of all those words.
  3. But when I do model['word in vocabulary'], it gives a numerical vector.
    1) What is this vector representation?
    2) What does each numerical value represent?
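For reference, the first two steps I listed can be sketched in plain Python (the sentences below are just made-up examples, not from any real corpus):

```python
# Toy sketch of steps 1 and 2: tokenize sentences and build a vocabulary.
sentences = ["the quick brown fox", "the lazy dog"]

# Step 1: split each sentence into words (tokens).
tokenized = [s.split() for s in sentences]

# Step 2: build the vocabulary -- each unique word gets an integer index.
vocab = {}
for tokens in tokenized:
    for word in tokens:
        if word not in vocab:
            vocab[word] = len(vocab)

print(vocab)
# e.g. {'the': 0, 'quick': 1, 'brown': 2, 'fox': 3, 'lazy': 4, 'dog': 5}
```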

I am confused about this.

If anyone has a tutorial or a link from which I can understand this better, it would be very helpful.

Thank you so much in advance.


Let me address your queries one by one.

Given below is a figure of the Word2Vec model with a single-word context window. The input layer takes a word in one-hot encoded form. The weights between the input layer and the hidden layer can be represented by a V x N matrix, where V = vocabulary size and N = number of hidden units.

Each row of this matrix is the N-dimensional vector representation of the associated word of the input layer. Hence, each word in the vocabulary would have an N-dimensional vector representation.


The numbers in a word vector are nothing but the weights learned by the model mentioned above.
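To make that concrete, here is a small NumPy sketch (with random made-up weights, not a trained model) showing that multiplying a one-hot input vector by the V x N weight matrix simply selects the corresponding row of the matrix, i.e. that word's N-dimensional vector:

```python
import numpy as np

V, N = 5, 3  # toy vocabulary size and number of hidden units
rng = np.random.default_rng(0)
W = rng.standard_normal((V, N))  # input->hidden weights (learned during training)

word_index = 2          # index of some word in the vocabulary
one_hot = np.zeros(V)
one_hot[word_index] = 1.0

# The hidden-layer activation is the one-hot vector times W,
# which just picks out row `word_index` of W -- the word's embedding.
hidden = one_hot @ W
print(hidden)           # identical to W[word_index]
```

So looking up a word's vector in a trained model is effectively reading off one row of this learned weight matrix.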



@pjoshi15 Thank you so much for your very nice Explanation