How are conditional probabilities calculated for numeric variables in Naive Bayes?




In Naive Bayes classification, the conditional probabilities for categorical variables are calculated from frequency counts. For example, in the code below:

library(e1071)  # assumption: naiveBayes() here is the e1071 implementation
nb <- naiveBayes(churn ~ ., data = churnTrain)

Looking at the output:

So for P(intl_plan = "yes" | churn = "yes"), the conditional probability is computed from the counts in the dataset.

The result matches the part marked in red in image 1.
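
The count-based calculation for a categorical predictor can be sketched as follows. This is a minimal illustration on a made-up toy data frame (the values are hypothetical and are not taken from churnTrain); it only shows how class-conditional probabilities fall out of a cross-tabulation.

```r
# Toy data (hypothetical values, not the churnTrain data set)
toy <- data.frame(
  churn     = c("yes", "yes", "yes", "no", "no", "no", "no", "no"),
  intl_plan = c("yes", "yes", "no",  "no", "no", "yes", "no", "no")
)

# Cross-tabulate the class (rows) against the categorical predictor (columns)
counts <- table(churn = toy$churn, intl_plan = toy$intl_plan)

# Divide each row by its row total, so each row holds
# P(intl_plan = value | churn = that row's class)
cond_prob <- prop.table(counts, margin = 1)
print(cond_prob)
```

Here `cond_prob["yes", "yes"]` is the share of churners in the toy data who have an international plan, which is the kind of value a Naive Bayes model stores for a categorical predictor.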
But for a numeric/continuous variable like number_vmail_messages, how is this calculated?
Can someone please clarify this for me?



There are a few assumptions behind the Naive Bayes classifier that can help you understand this:

  • The predictor variables are independent of each other, given the class,
  • Each continuous variable is normally distributed within each class.

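For continuous variables, we assume the values within each class are normally distributed. We estimate the mean and standard deviation of the variable for each class, compute the z-value of the observation, and evaluate the normal density at that point (a density, rather than a tail probability read from a z-table). This gives a likelihood for each of your continuous variables, which the naive Bayes classifier uses in place of a count-based conditional probability.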
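
The steps above can be sketched in base R as follows. This is a minimal illustration under the normality assumption, again using a made-up toy data frame (hypothetical values, not churnTrain); the names `mu`, `sigma`, `lik`, and `lik_from_z` are chosen here for illustration only.

```r
# Toy data (hypothetical values, not the churnTrain data set)
toy <- data.frame(
  churn = c("yes", "yes", "yes", "no", "no", "no"),
  number_vmail_messages = c(0, 5, 10, 20, 25, 30)
)

# Per-class mean and standard deviation of the numeric predictor
by_class <- split(toy$number_vmail_messages, toy$churn)
mu    <- sapply(by_class, mean)
sigma <- sapply(by_class, sd)

# Likelihood of a new observation under each class:
# the normal density evaluated at x with that class's mean and sd
x <- 8
lik <- setNames(dnorm(x, mean = mu, sd = sigma), names(mu))

# The same density expressed through the z-value: phi(z) / sigma
z <- (x - mu) / sigma
lik_from_z <- setNames(dnorm(z) / sigma, names(mu))

print(rbind(mean = mu, sd = sigma))
print(lik)
```

The classifier multiplies these densities with the class prior and the terms for the other predictors, then picks the class with the largest product. If the model comes from e1071, the two columns printed for a numeric predictor should correspond to these per-class means and standard deviations rather than to probabilities.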



Hi @Imran,

Thanks a lot for the reply.

Could you also please explain how the z-scores are calculated? I mean, why are there ,1 and ,2 as shown in the image above? Is the data being divided into two parts under each class (yes/no)?
Also, the values shown are not probability values, so are these the z-scores?