How to reduce False Positive and False Negative in binary classification

Hi All,
This is the result of my Logistic Regression model. I am not worried about accuracy for now, but the False Positive count is very high (marked in red). I want to bring down the False Positives — what should I do?



Adjust the class weights and try ensemble methods (random forest, LightGBM).
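As a sketch of the class-weight suggestion above (using a synthetic imbalanced dataset as a stand-in for the original poster's data, which we don't have):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix

# Synthetic imbalanced data (assumption: your real data is loaded elsewhere)
X, y = make_classification(n_samples=2000, weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# class_weight="balanced" penalises mistakes on the minority class more heavily,
# which typically trades some accuracy for fewer errors on the rare class
lr = LogisticRegression(class_weight="balanced", max_iter=1000).fit(X_tr, y_tr)
rf = RandomForestClassifier(class_weight="balanced", random_state=0).fit(X_tr, y_tr)

for name, model in [("logreg", lr), ("rforest", rf)]:
    tn, fp, fn, tp = confusion_matrix(y_te, model.predict(X_te)).ravel()
    print(name, "FP:", fp, "FN:", fn)
```

The same `class_weight` keyword also accepts an explicit dict like `{0: 1, 1: 10}` if you want to tune the penalty yourself.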

Try balancing the response column classes using stratified sampling or oversampling; that should help.
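The stratified-sampling part of this advice can be sketched like so (synthetic data as an assumption; the key is `stratify=y`):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic 90/10 imbalanced data as a stand-in
X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=0)

# stratify=y keeps the class ratio identical in the train and test splits,
# so the test confusion matrix reflects the true class balance
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)
print("train ratio:", np.bincount(y_tr) / len(y_tr))
print("test ratio: ", np.bincount(y_te) / len(y_te))
```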


I did oversampling and SMOTE on the training data, but I can't do that on the test data. This is the result I am getting after SMOTE.

I am not sure how to do that. Can you share a link or tutorial for it?

First, if the data is imbalanced, do oversampling or undersampling (e.g. SMOTE) just before model building — and only on the training split, never the test split.
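To illustrate oversampling only the training split (SMOTE itself lives in the `imbalanced-learn` package; here plain random oversampling via `sklearn.utils.resample` stands in for it, on synthetic data):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.utils import resample

# Synthetic imbalanced data as a stand-in for the real dataset
X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Duplicate minority-class training rows until the classes are balanced.
# The test set is left untouched so evaluation reflects the real distribution.
minority = X_tr[y_tr == 1]
extra = resample(
    minority,
    replace=True,
    n_samples=int((y_tr == 0).sum() - (y_tr == 1).sum()),
    random_state=0,
)
X_bal = np.vstack([X_tr, extra])
y_bal = np.concatenate([y_tr, np.ones(len(extra), dtype=int)])
print("balanced train counts:", np.bincount(y_bal))
```

With `imbalanced-learn` installed, `SMOTE().fit_resample(X_tr, y_tr)` replaces the resample step and synthesises new minority points instead of duplicating existing ones.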

  1. Random forest overfits if the training and testing data are not drawn from the same distribution.

  2. Check the data for linearity, multicollinearity, outliers, etc.

  3. After that, try non-linear algorithms such as boosting or neural networks, using cross-validation.

  4. Do hyperparameter tuning.

  5. Do EDA to find out why you have so many FPs and FNs.

  6. Understand the mathematical concepts behind the ML algorithms; then you will get a clearer idea.
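Steps 3 and 4 (a boosting model, cross-validation, and hyperparameter tuning) can be sketched in one `GridSearchCV` call — the grid, scorer, and synthetic data here are illustrative assumptions:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic imbalanced data as a stand-in
X, y = make_classification(n_samples=500, weights=[0.8, 0.2], random_state=0)

# A deliberately small grid; a real search would cover more values.
# Scoring with F1 balances false positives and false negatives,
# unlike plain accuracy on imbalanced data.
grid = GridSearchCV(
    GradientBoostingClassifier(random_state=0),
    param_grid={"n_estimators": [50, 100], "max_depth": [2, 3]},
    scoring="f1",
    cv=3,
).fit(X, y)
print(grid.best_params_, round(grid.best_score_, 3))
```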


Alright. Based on your information, first try a classifier such as SVM with cross-validation; that gives you a good way to reduce these errors. Another idea is to use clustering (an unsupervised method) to filter your data.
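The SVM-with-cross-validation idea might look like this minimal sketch (synthetic data and the F1 scorer are assumptions on top of the suggestion):

```python
from sklearn.datasets import make_classification
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

# Synthetic imbalanced data as a stand-in
X, y = make_classification(n_samples=400, weights=[0.85, 0.15], random_state=0)

# class_weight="balanced" applies to SVMs too; 5-fold CV with an F1 scorer
# gives a fairer picture than accuracy on imbalanced classes
scores = cross_val_score(SVC(class_weight="balanced"), X, y, cv=5, scoring="f1")
print("mean F1 across folds:", scores.mean())
```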

Hi Vijay,

Which solution worked? Can you please share it here?

I think there is a case for working with your cutoff, i.e. finding the optimal cutoff value. You can generate it by plotting TPR vs TNR for each cutoff value between 0 and 1. Let me know if you need R or Python code for this — I have something ready. Best,
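One common way to pick that cutoff (not necessarily the poster's exact script, which wasn't shared) is to maximise TPR + TNR over the ROC curve; since TNR = 1 − FPR, that is the same as maximising Youden's J = TPR − FPR:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve
from sklearn.model_selection import train_test_split

# Synthetic imbalanced data as a stand-in
X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
probs = model.predict_proba(X_te)[:, 1]

# roc_curve sweeps every cutoff; TNR = 1 - FPR, so the index that maximises
# tpr - fpr is the cutoff that maximises TPR + TNR (Youden's J)
fpr, tpr, thresholds = roc_curve(y_te, probs)
best = np.argmax(tpr - fpr)
print("optimal cutoff:", thresholds[best])

# Classify with the tuned cutoff instead of the default 0.5
preds = (probs >= thresholds[best]).astype(int)
```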

© Copyright 2013-2019 Analytics Vidhya