What is the difference between predict and predict_proba?



I am new to the Python sklearn package. Can anyone tell me the difference between the predict and predict_proba functions? I have googled it, but could not understand it properly.




Consider a binary classification for labels 0 and 1.

predict will give either 0 or 1 as output.
predict_proba will give the probability of each label (0 and 1) instead of a single label.
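
To make the contrast concrete, here is a minimal sketch. It assumes a LogisticRegression classifier and a made-up one-feature dataset (neither comes from the question), so treat it as an illustration rather than the asker's actual setup:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy binary data, invented for illustration: one feature, labels 0 and 1.
X = np.array([[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]])
y = np.array([0, 0, 0, 1, 1, 1])

# LogisticRegression is an assumed choice; any classifier that
# implements predict_proba behaves the same way here.
model = LogisticRegression().fit(X, y)

labels = model.predict(X)        # one hard label (0 or 1) per row
proba = model.predict_proba(X)   # one probability per class, per row

print(labels.shape)  # (6,)
print(proba.shape)   # (6, 2)
```

predict returns a single label per row, while predict_proba returns one column per class.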


Hi @kanha14 ,

I applied predict_proba to a classification problem. It gives a list of 2 outputs for each observation (row), as below.

[0.23780654318010663, 0.7621934568198934]

What does this mean?

Does it mean the probability of occurrence of 0 is 0.237… and of 1 is 0.762…?


Hi @satyadeep9123,

predict_proba gives you the probabilities for each target class (0 and 1 in your case) in array form. The number of probabilities for each row is equal to the number of categories in the target variable (2 in your case).

Yes, here 0.237… is the probability that the output will be 0 and 0.762… is the probability of the output being 1.
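
As a rough check of this reading, the sketch below (again assuming a LogisticRegression model on invented toy data) looks at model.classes_, which gives the class that each predict_proba column refers to, and confirms that each row sums to 1:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy binary data, invented for illustration.
X = np.array([[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]])
y = np.array([0, 0, 0, 1, 1, 1])

model = LogisticRegression().fit(X, y)  # assumed classifier
proba = model.predict_proba(X)

# Columns of predict_proba follow the order of model.classes_.
print(model.classes_)

# The probabilities in each row sum to 1.
row_sums = proba.sum(axis=1)

# Picking the most probable class per row reproduces predict() here.
from_proba = model.classes_[np.argmax(proba, axis=1)]
matches = np.array_equal(from_proba, model.predict(X))
print(matches)
```

So for labels 0 and 1, the first value in each row belongs to class 0 and the second to class 1.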

Suppose you only want the probability of one class (either 0 or 1). You can change your code slightly so that you get only one output for each observation. You can use the following code:

model.predict_proba(test)[:,1]

model is the trained model (the name may vary in your case, so change it accordingly)
test is the dataset I made predictions for (change it according to your dataset)

Using [:,1] in the code will give you the probabilities of getting the output as 1. If you replace 1 with 0 in the above code, you will only get the probabilities of getting the output as 0.
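
A self-contained sketch of this slicing is shown here. The model, the training data, and the test array are all made up for illustration; only predict_proba and the [:,1] indexing come from the explanation above:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy training data, invented for illustration.
X = np.array([[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]])
y = np.array([0, 0, 0, 1, 1, 1])
model = LogisticRegression().fit(X, y)  # assumed classifier

# Hypothetical rows to score; replace with your own dataset.
test = np.array([[1.5], [2.5], [3.5]])
proba = model.predict_proba(test)

# Look up which column holds class 1 instead of assuming its position.
col_of_1 = list(model.classes_).index(1)

p1 = proba[:, col_of_1]  # probability of output 1, one value per row
p0 = proba[:, 0]         # probability of output 0, one value per row

print(p1.shape)  # (3,)
```

With only two classes, p0 and p1 add up to 1 for every row, so either column carries the full information.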


Thanks @PulkitS