How do I use the output from t-SNE dimensionality reduction for a random forest

r-tsne
dimensionreduction

#1

Hello,

I recently came across an R package called Rtsne, which does dimensionality reduction, and found it quite powerful compared to PCA.
I applied it to the handwritten digit recognition problem and this is what I get:


However, I wanted to know a few things about this output:

Learning embedding...
Iteration 50: error is 97.538078 (50 iterations in 7.58 seconds)
Iteration 100: error is 91.161142 (50 iterations in 11.40 seconds)
Iteration 150: error is 87.047541 (50 iterations in 6.49 seconds)
Iteration 200: error is 86.717745 (50 iterations in 6.39 seconds)
Iteration 250: error is 4.739728 (50 iterations in 6.64 seconds)
Iteration 300: error is 3.154252 (50 iterations in 5.80 seconds)
Iteration 350: error is 2.746375 (50 iterations in 5.83 seconds)
Iteration 400: error is 2.526410 (50 iterations in 7.91 seconds)
Iteration 450: error is 2.383394 (50 iterations in 6.40 seconds)
Iteration 499: error is 2.282333 (50 iterations in 6.55 seconds)
Fitting performed in 70.98 seconds.

What is the error part here? Is it the percentage of data points correctly predicted after applying Rtsne?
Also, how do I use this output in a prediction algorithm such as a random forest?


#2

I found your question, maybe a little late, when I was about to try the same thing.

While I have not looked into the “error part” of a t-SNE visualisation, I found out how to use the t-SNE results as input for other models in this thread:

Namely, the “Y” element of the object returned by Rtsne() (from the R package Rtsne) is a matrix with one column per embedding dimension, two by default, which you can use as features for further modeling. Here the author appends Y[,1] and Y[,2] to his data x:
x = cbind(x, tsne$Y[,1])
x = cbind(x, tsne$Y[,2])
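
To make the whole path to a random forest concrete, here is a minimal sketch, not the method from the thread above. It assumes the Rtsne and randomForest packages are installed, and it uses the built-in iris data as a stand-in for the digit images; the names features, labels, train_idx and rf are only illustrative. One caveat, as far as I know: Rtsne has no predict() method for new rows, so the embedding here is computed on all rows before splitting, which means the held-out rows influence the t-SNE features.

```r
library(Rtsne)
library(randomForest)

set.seed(42)

# Stand-in data (assumption): iris instead of the digit images
features <- as.matrix(iris[, 1:4])
labels   <- iris$Species

# Embed all rows at once; iris contains duplicate rows,
# so the duplicate check has to be switched off
tsne <- Rtsne(features, dims = 2, perplexity = 30, check_duplicates = FALSE)

# Append the two embedding coordinates as extra predictor columns
x <- cbind(features, tsne1 = tsne$Y[, 1], tsne2 = tsne$Y[, 2])

# Hold out 30% of the rows for evaluation
train_idx <- sample(nrow(x), size = round(0.7 * nrow(x)))

# Fit the forest on the training rows only
rf <- randomForest(x = x[train_idx, ], y = labels[train_idx], ntree = 200)

# Predict the held-out rows and report the share classified correctly
pred <- predict(rf, newdata = x[-train_idx, ])
accuracy <- mean(pred == labels[-train_idx])
print(accuracy)
```

If you have a separate test set, one option is to rbind the training and test features, run Rtsne once on the combined matrix, and split the rows of Y back afterwards.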

Hope that helps!
Best regards,
Samuel