Multiclass classification using logistic regression in python



I am trying to apply logistic regression on the human activity regression data. It consists of 6 levels of outcome describing different activities, After applying PCA on train and test combined data(continuous variables only) I ended up with 150 principal components from 562 variables. Now I am trying to apply logistic regression for classification using :

from sklearn import linear_model clf=linear_model.LogisticRegression(C=1e5),train.activity)

In the above code train_x consists of the 150 proncipal components and train.activity is the outcome. After predicting it on test data set using :


I am getting an array with minimum value of 1 and max value of 6. If there were two outcomes, I would have tried to find a suitable threshold to classify, But here there are 6 outcomes. Please suggest a way to apply logistic regression to classify on 6 possible outcomes.
Thanks in advance


@syed.danish - You can try to plot ROC curve to find the correct threshold value.

ROC curve - In statistics, a receiver operating characteristic (ROC), or ROC curve, is a graphical plot that illustrates the performance of a binary classifier system as its discrimination threshold is varied. The curve is created by plotting the true positive rate (TPR) against the false positive rate (FPR) at various threshold settings.

Hope this helps!



I think you didn’t understand my question, Let me explain this with an example. Suppose the outcome variable(label) is :

cow buffalo goat goat buffalo

Now we have to predict weather the label is cow, buffalo or goat. So what I am trying to ask is, can we classify it using logistic regression?
I came up with one solution, i.e, we can create dummy variables for outcome variable(label) and predict the probability of each dummy variable and one with the highest probability will be the final outcome.
Please check whether the above approach will work or not?
Syed Danish


@syed.danish - Sorry for not understanding your question and yes you can use this approach create a dummy variable for each class and using this you can predict the outcome.

Hope this helps!