How do I compute the cosine similarity between columns 1 and 2 and store it in a 3rd column in R?



I have a dataset with 200k rows. The 1st and 2nd columns each contain a sentence, and I have to compute the cosine similarity between the two sentences in each row and put it in a 3rd column.
Example: 1st column: "Why do I love movies so much? Is this strange", 2nd column: "Why do you love movies", and 3rd column: 0.70 (which is their cosine value).
Any reference links related to this problem would be helpful.
A screenshot of the dataset is in the following image.


@deva123 First you should create a fixed-length vector for every sentence in both columns, using the same vocabulary (or the same embedding model) for both so that the vectors are comparable. You can create such vectors using a bag-of-words approach, TF-IDF, or word embeddings (word2vec and GloVe). Once you have these vectors, you can compute the cosine similarity between the sentences of the two columns row by row.
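As a rough illustration of the bag-of-words route, here is a minimal sketch in base R. The helper name `cosine_bow` and the toy data frame are made up for this example; only the column names `Question.1` and `Question.2` come from the question. It builds count vectors over the shared vocabulary of each pair and applies cos(a, b) = sum(a * b) / (||a|| * ||b||).

```r
# Cosine similarity between two sentences using bag-of-words counts (base R only).
# cosine_bow is a hypothetical helper name, not a function from any package.
cosine_bow <- function(a, b) {
  # lower-case, replace non-alphanumeric runs with a space, split on whitespace
  tokenize <- function(s) {
    s <- gsub("[^[:alnum:]]+", " ", tolower(s))
    tokens <- strsplit(trimws(s), "\\s+")[[1]]
    tokens[nzchar(tokens)]
  }
  ta <- tokenize(a)
  tb <- tokenize(b)
  vocab <- union(ta, tb)
  # fixed-length count vectors over the shared vocabulary
  va <- as.numeric(table(factor(ta, levels = vocab)))
  vb <- as.numeric(table(factor(tb, levels = vocab)))
  denom <- sqrt(sum(va^2)) * sqrt(sum(vb^2))
  # an empty sentence has a zero vector, so the cosine is undefined
  if (denom == 0) return(NA_real_)
  sum(va * vb) / denom
}

# Toy data frame with the column names from the question (contents are made up)
df <- data.frame(
  Question.1 = c("Why do I love movies so much? Is this strange", "How do I learn R"),
  Question.2 = c("Why do you love movies", "What is the capital of France"),
  stringsAsFactors = FALSE
)

# Row-wise similarity written into a third column
df$cosine <- mapply(cosine_bow, df$Question.1, df$Question.2, USE.NAMES = FALSE)
print(df$cosine)
```

For 200k rows a plain `mapply` loop like this should work but may be slow; a sparse document-term matrix (for example from the text2vec package) is probably the better fit at that size. Raw counts will also not necessarily reproduce a given target value such as 0.70, since the result depends on the preprocessing and weighting that were used.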


Thanks for the reply
Do you have any reference links for this? I am new to this topic, and a lot of the blogs I saw were in Python, while I'm more familiar with R. Most of the sources also compare cosine similarity between documents rather than between two columns of sentences.
So far I have removed only punctuation and stop words. This is what I tried:

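library(stringr)   # str_to_lower, str_replace_all and the %>% pipe
library(text2vec)  # itoken, create_dtm, psim2

data2 <- data

# select 500 rows for faster running times
data_q1 = head(data2, 500)
prep_fun = function(x) {
  x %>% 
    # make text lower case
    str_to_lower %>% 
    # remove non-alphanumeric symbols
    str_replace_all("[^[:alnum:]]", " ") %>% 
    # collapse multiple spaces
    str_replace_all("\\s+", " ")
}

data_q1[,1:2] <- apply(data_q1[,1:2],2,prep_fun)

# sim2() needs numeric matrices, not raw strings, and norm must be "l2" (not 12).
# Build one shared vocabulary, vectorize both columns, then use psim2() to
# compare row i of the first matrix with row i of the second.
it_all <- itoken(c(data_q1$Question.1, data_q1$Question.2), progressbar = FALSE)
vectorizer <- vocab_vectorizer(create_vocabulary(it_all))
dtm1 <- create_dtm(itoken(data_q1$Question.1, progressbar = FALSE), vectorizer)
dtm2 <- create_dtm(itoken(data_q1$Question.2, progressbar = FALSE), vectorizer)
data_q1$cosine <- psim2(dtm1, dtm2, method = "cosine", norm = "l2")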