I would answer by applying ceteris paribus: with all other things kept constant, how would ntree = 250 differ from ntree = 100?

Assume there are two data sets, A and B. Both have the same number of predictors (columns), but A has 1,000 records (rows) while B has 100,000. In other words, A is a small data set while B is quite large. Now let us apply random forest to both, keeping all other function parameters the same except ntree.

Case 1. On A, ntree = 250; on B, ntree = 250.

Case 2. On A, ntree = 100; on B, ntree = 100.

The random forest algorithm trains each tree on a bootstrap sample. If we choose to build 250 trees, then for each tree, every record that was NOT selected into that tree's training sample (and hence went to the out-of-bag sample) will be scored by that tree. Since this selection is purely random, we will not get 250 predictions for every record. On average a record is out of bag for roughly 37% of the trees, so record #24 might be predicted 90 times, record #305 might be predicted 96 times... it is random. Eventually majority voting is employed to decide on the final prediction for each record.
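To see how many out-of-bag predictions a record actually gets, here is a rough simulation sketch. It is in Python rather than R, uses only the standard library, and the function name oob_counts is my own invention for illustration. It only mimics the bootstrap sampling step (drawing as many rows as the data set has, with replacement, once per tree); it does not fit any trees.

```python
import random
from statistics import mean


def oob_counts(n_records, n_trees, seed=0):
    """Count, for each record, how many trees leave it out of bag.

    Hypothetical helper: it simulates only the bootstrap draw,
    not the tree fitting or the voting.
    """
    rng = random.Random(seed)
    counts = [0] * n_records
    for _ in range(n_trees):
        # Bootstrap sample: n_records draws with replacement.
        in_bag = set(rng.choices(range(n_records), k=n_records))
        # Every record missing from the sample is out of bag for this tree.
        for i in range(n_records):
            if i not in in_bag:
                counts[i] += 1
    return counts


# Compare the two ntree settings on a data set the size of A.
for n_trees in (100, 250):
    counts = oob_counts(n_records=1000, n_trees=n_trees)
    print(n_trees, "trees: min", min(counts),
          "mean", round(mean(counts), 1), "max", max(counts))
```

The printed minimum, mean, and maximum show the spread of out-of-bag counts across records for each ntree value. Changing n_records lets you repeat the comparison for a larger data set.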

So to answer the question at hand: as long as we ensure that each record has a good enough chance to appear in the out-of-bag sample a sufficient number of times, we are good! As you can see, the choice of the number of trees then depends on how large our data set is. In the two cases above, I would suspect even 250 trees may not be enough for data set B, which has 100K records.

This leads us to another question: is there a mathematical formula or rule of thumb relating the number of records to the number of trees to build, to make our lives easier? So far I haven't come across such a thing and continue to search :-) Trial and error it is.