How does the `subsample` parameter affect XGBRegressor on time series data?

xgboost

#1

Hello,

I was going through this article about parameter tuning in XGBoost.

It describes a parameter called `subsample`, which is the fraction of observations to be randomly sampled for each tree.

If the data has a temporal nature (time series data), how does this parameter affect the model?
If we allow the algorithm to sample rows randomly, we lose the temporal ordering of the data. So what should the value of `subsample` be?
Should we set it to 1?
Any suggestions are most welcome.


#2

Hi @sachinkalsi,

Preferably set `subsample` to 1 for time series data.
On a side note, does your time series data have any features other than the time column? If not, using XGBoost instead of ARIMA or SARIMAX is unlikely to give you good results.
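
To make the point about features concrete, here is a minimal sketch (not from the article, just an illustration). It assumes the temporal information is encoded as lag features, so each row carries its own recent history. The helper names `make_lag_rows` and `fit_xgb`, the toy values, and the choice of 3 lags are all made up for the example. `subsample` picks rows, not positions in time, so a sampled row still contains its own lags; the chronological train/test split is what keeps future data out of training.

```python
import random


def make_lag_rows(series, n_lags):
    """Turn a 1-D series into (features, target) rows.

    Each row holds the n_lags values that precede its target,
    so the row stays meaningful even when rows are shuffled or sampled.
    """
    X, y = [], []
    for t in range(n_lags, len(series)):
        X.append(list(series[t - n_lags:t]))  # values at t-n_lags .. t-1
        y.append(series[t])                   # value at t
    return X, y


def fit_xgb(X, y, subsample=1.0):
    """Hypothetical helper: fit an XGBRegressor with the given subsample."""
    # Imported lazily so the feature-building part runs without xgboost.
    import numpy as np
    from xgboost import XGBRegressor

    model = XGBRegressor(n_estimators=100, subsample=subsample, random_state=0)
    model.fit(np.asarray(X, dtype=float), np.asarray(y, dtype=float))
    return model


series = [10, 12, 13, 15, 18, 21, 25, 30]  # made-up values, oldest first
X, y = make_lag_rows(series, n_lags=3)

# Chronological split: train on the earlier rows, hold out the latest ones.
split = len(X) - 2
X_train, y_train = X[:split], y[:split]
X_test, y_test = X[split:], y[split:]

# Mimic what row subsampling does: pick a random subset of training rows.
rng = random.Random(0)
picked = rng.sample(range(len(X_train)), k=2)
for i in picked:
    print(X_train[i], "->", y_train[i])  # each sampled row still has its lags
```

With real data you could call `fit_xgb(X_train, y_train, subsample=1.0)` and compare it against a lower value such as 0.8 on the held-out rows; which one works better is something to check on your own data rather than assume.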