PCA on correlated environmental variables for an SDM - presence and absence points

r
machine_learning

#1

I have a series of environmental data layers in R, and I am trying to run a PCA on them to reduce dimensionality for my SDM (species distribution model). The goal is to then run Maxent with the dismo package.

None of the papers I have read that use this technique give much detail on how to accomplish it. My specific questions are as follows:

  1. When you look for correlations between variables, do you use presence and absence/background points together, or do you compute the correlations for each set separately?

  2. Regarding absence points, I am using background points for Maxent, so they are not true absences. That means I can select as many points as I want, and I'm finding conflicting information about how many to use. Maxent uses 10,000 by default - should I be using 10,000 in my correlation and PCA?
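
To make the two questions concrete, here is a minimal sketch of the workflow I have in mind, using base R only. Everything in it is an assumption for illustration: the data are simulated rather than extracted from my layers, the variable names (`temp`, `precip`, `elev`) and the helper `sim_env()` are hypothetical, and fitting the PCA on the background points is just one of the options I am asking about, not something I know to be correct.

```r
set.seed(42)

# Simulated stand-ins for values extracted from the environmental layers
# at each point (hypothetical; real values would come from the rasters).
n_bg <- 10000  # Maxent's default number of background points
n_pr <- 200    # hypothetical number of presence records

sim_env <- function(n, shift = 0) {
  temp <- rnorm(n, mean = 15 + shift, sd = 5)
  data.frame(
    temp   = temp,
    precip = 800 - 20 * temp + rnorm(n, sd = 50),  # built to correlate with temp
    elev   = rnorm(n, mean = 500, sd = 200)
  )
}

bg   <- sim_env(n_bg)             # background points
pres <- sim_env(n_pr, shift = 3)  # presence points

# Question 1, option A: correlations from the background points only
cor_bg <- cor(bg)

# Question 1, option B: correlations from presence and background pooled
cor_all <- cor(rbind(pres, bg))

# PCA fitted on the background sample, with variables centred and scaled
# so that differing units do not dominate the components
pca <- prcomp(bg, center = TRUE, scale. = TRUE)

# Project the presence points onto the same principal axes
pres_pc <- predict(pca, newdata = pres)

# Cumulative variance explained, to decide how many components to keep
summary(pca)$importance["Cumulative Proportion", ]
```

Changing `n_bg` and comparing `cor_bg` or the loadings in `pca$rotation` across runs would be one way to see whether the number of background points matters, which is the heart of question 2.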