# How do I use the output from t-SNE dimensionality reduction for a random forest?

#1

Hello,

I recently came across a package called Rtsne, which performs dimensionality reduction, and found it quite powerful compared to PCA.
I applied it to the handwritten digit recognition problem, and this is what I get:

``````
Learning embedding...
Iteration 50: error is 97.538078 (50 iterations in 7.58 seconds)
Iteration 100: error is 91.161142 (50 iterations in 11.40 seconds)
Iteration 150: error is 87.047541 (50 iterations in 6.49 seconds)
Iteration 200: error is 86.717745 (50 iterations in 6.39 seconds)
Iteration 250: error is 4.739728 (50 iterations in 6.64 seconds)
Iteration 300: error is 3.154252 (50 iterations in 5.80 seconds)
Iteration 350: error is 2.746375 (50 iterations in 5.83 seconds)
Iteration 400: error is 2.526410 (50 iterations in 7.91 seconds)
Iteration 450: error is 2.383394 (50 iterations in 6.40 seconds)
Iteration 499: error is 2.282333 (50 iterations in 6.55 seconds)
Fitting performed in 70.98 seconds.
``````

What is the "error" value here? Is it the percentage of data points correctly predicted after applying Rtsne?
Also, how do I use this output in a prediction algorithm such as a random forest?

#2

This may be a little late, but I found your question while trying to do the same thing.

I have not looked into the "error" part of the t-SNE output, but I did find out how to use the t-SNE results as input for other models in this thread:

Namely, using the `Y` element of the t-SNE result (from the R package Rtsne), you get two columns holding the embedded coordinates, which you can use for further modeling. Here the author appends Y[,1] and Y[,2] to his data x:
x = cbind(x, tsne$Y[,1])
x = cbind(x, tsne$Y[,2])
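To show how the pieces might fit together end to end, here is a minimal sketch. It is not the code from the thread: the iris data, the randomForest package, the column names `tsne1`/`tsne2`, the perplexity and the 70/30 split are all my own assumptions for illustration. It embeds the predictors with Rtsne, appends the two columns of `Y`, and fits a random forest on the augmented matrix.

```r
# Sketch only: iris, randomForest, and all parameter values are assumptions.
library(Rtsne)
library(randomForest)

set.seed(42)

# Rtsne rejects duplicate rows by default, so drop them first.
dat <- unique(iris)
x <- as.matrix(dat[, 1:4])
y <- dat$Species

# Embed the predictors only (no labels) into two dimensions.
tsne <- Rtsne(x, dims = 2, perplexity = 30)

# tsne$Y is an n x 2 matrix; append both columns as extra features.
x_aug <- cbind(x, tsne1 = tsne$Y[, 1], tsne2 = tsne$Y[, 2])

# Hold out 30% of the rows for a quick check.
train_idx <- sample(nrow(x_aug), size = floor(0.7 * nrow(x_aug)))
rf <- randomForest(x = x_aug[train_idx, ], y = y[train_idx], ntree = 200)

# Predict on the held-out rows and compute the hold-out accuracy.
pred <- predict(rf, x_aug[-train_idx, ])
mean(pred == y[-train_idx])
```

One thing to keep in mind: as far as I know, Rtsne has no `predict` method for new observations, so in this sketch the embedding is computed on training and hold-out rows together (without the labels) before splitting. If you need to score truly unseen data later, that is a limitation of this approach.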

Hope that helps!
Best regards,
Samuel