I have a numerical dataset with more than 30 features for binary classification. The dataset is quite imbalanced (20% positive), and I am thinking of using a shallow, densely connected network for training and 10-fold stratified cross-validation for evaluation. The dataset is medical (cancer classification).
How good is this approach?
I am concerned that, even though the cross-validation is stratified, the mean AUC over the 10 folds will not be a reliable score because there are so few positive-class samples at test time (25 samples in each test fold, and 230 for training).
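For concreteness, here is a minimal sketch of the kind of setup described above. It assumes scikit-learn, with `MLPClassifier` standing in for the shallow dense network and synthetic data from `make_classification` standing in for the real dataset; the sample count, hidden layer size, and random seeds are illustrative assumptions, not values from my data. It reports the per-fold AUC together with the number of positives in each test fold, so the fold-to-fold spread is visible next to the mean.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the real data (assumption): 30 features, roughly 20% positives.
X, y = make_classification(
    n_samples=1250, n_features=30, n_informative=10,
    weights=[0.8, 0.2], random_state=0,
)

# Stratification keeps the class ratio roughly equal in every fold.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

fold_aucs = []
fold_positives = []
for train_idx, test_idx in cv.split(X, y):
    # Scaling is fitted inside the loop so no test-fold statistics leak into training.
    model = make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=0),
    )
    model.fit(X[train_idx], y[train_idx])
    scores = model.predict_proba(X[test_idx])[:, 1]  # probability of the positive class
    fold_aucs.append(roc_auc_score(y[test_idx], scores))
    fold_positives.append(int(y[test_idx].sum()))

mean_auc = float(np.mean(fold_aucs))
std_auc = float(np.std(fold_aucs, ddof=1))  # sample standard deviation across folds

for i, (auc, n_pos) in enumerate(zip(fold_aucs, fold_positives), start=1):
    print(f"fold {i:2d}: AUC = {auc:.3f}, positives in test fold = {n_pos}")
print(f"mean AUC = {mean_auc:.3f}, std across folds = {std_auc:.3f}")
```

The standard deviation across folds is the quantity I am unsure how to interpret when each test fold contains so few positives.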
Your help is much appreciated. Thanks.