Hello,

While using Support Vector Machines (SVM), how and when do you decide to transform data to a higher dimension?

What are the indicators? What are the criteria to decide that?

Regards,

Mohit


Hi, an SVM is a kernelized maximum margin classifier (MMC) with a penalty term on the slack variables.

When:

In fact, the MMC only works in the linearly separable case. When the data are not linearly separable, you have two options: 1) add new features via a nonlinear feature mapping, e.g. polynomial terms; 2) add a penalty term on the slack variables (a soft margin). The MMC works again if you apply 1), 2), or both.
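A minimal sketch of option 1) with scikit-learn (dataset and parameter choices are illustrative): concentric circles are not linearly separable, but adding degree-2 polynomial features makes a linear classifier work.

```python
from sklearn.datasets import make_circles
from sklearn.preprocessing import PolynomialFeatures
from sklearn.svm import LinearSVC

# Concentric circles: not linearly separable in the original 2-D space.
X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

linear_acc = LinearSVC(C=1.0, max_iter=10000).fit(X, y).score(X, y)

# Option 1): add polynomial terms (x1^2, x1*x2, x2^2, ...) as new features.
# The circles become separable because x1^2 + x2^2 encodes the radius.
X_poly = PolynomialFeatures(degree=2, include_bias=False).fit_transform(X)
poly_acc = LinearSVC(C=1.0, max_iter=10000).fit(X_poly, y).score(X_poly, y)

print(linear_acc, poly_acc)
```

The linear model stays near chance level, while the polynomial-feature model separates the classes almost perfectly.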

How:

Choosing an explicit feature mapping depends on visualization and domain knowledge, which is hard in practice. However, you can introduce a nonlinear feature mapping indirectly through a kernel function, which returns the inner product of pairs of observations in the augmented feature space.
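For instance, the RBF kernel (a small self-contained sketch, plain NumPy) returns the inner product of two observations in an infinite-dimensional feature space without ever computing the mapping explicitly:

```python
import numpy as np

def rbf_kernel(x, z, sigma=1.0):
    # k(x, z) = exp(-||x - z||^2 / (2 * sigma^2))
    # This equals <phi(x), phi(z)> in an infinite-dimensional feature
    # space, without ever constructing phi explicitly.
    return np.exp(-np.sum((x - z) ** 2) / (2.0 * sigma ** 2))

x = np.array([1.0, 2.0])
z = np.array([3.0, 4.0])
print(rbf_kernel(x, x))  # 1.0: a point has maximal similarity with itself
print(rbf_kernel(x, z))  # decays toward 0 as the points move apart
```

Any algorithm written only in terms of such inner products can swap in this function and implicitly operate in the augmented space.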

Embedding a kernel function into an existing algorithm is called the kernel trick, e.g. kernel PCA and kernel regression. One drawback of the kernel-methods family is that they are memory-based: prediction depends on the training data. The SVM is an exception; it is also called a sparse kernel method, since only the observations on (or inside) the margin contribute to the model. That is also why it is called a support vector machine.
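You can see this sparsity directly in scikit-learn (a sketch on a synthetic dataset; parameters are illustrative): a fitted `SVC` exposes its support vectors, and prediction depends on those alone rather than the full training set.

```python
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

X, y = make_blobs(n_samples=200, centers=2, random_state=0)
clf = SVC(kernel="rbf", C=1.0).fit(X, y)

# Only the observations on or inside the margin are retained as
# support vectors; the rest of the training set can be discarded.
print(len(clf.support_vectors_), "of", len(X), "points are support vectors")
```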

To summarize: the SVM builds its model in an augmented feature space obtained through a kernel function. The ‘shape’ of that space is controlled by the kernel parameters; for example, with an RBF kernel it is driven by the parameter sigma.

Remark 1: the SVM with an RBF kernel and a penalty term is the most popular variant. The kernel parameter sigma can be estimated from the data, so you can focus on tuning the cost parameter of the penalty first, then refine the kernel parameter.

How to decide: it depends on your problem and your metric (accuracy, kappa, and so on). You can tune the hyperparameters through cross-validation.
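A minimal sketch of that tuning loop with scikit-learn (the grid values and dataset are just illustrative): cross-validated grid search over the cost parameter `C` and the RBF kernel parameter `gamma`.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Jointly tune the penalty cost C and the RBF kernel parameter gamma
# by 5-fold cross-validation; accuracy is the default scorer for SVC.
grid = GridSearchCV(
    SVC(kernel="rbf"),
    param_grid={"C": [0.1, 1, 10, 100], "gamma": ["scale", 0.01, 0.1, 1]},
    cv=5,
)
grid.fit(X, y)
print(grid.best_params_, round(grid.best_score_, 3))
```

Note that `gamma="scale"` is scikit-learn's data-driven heuristic for the kernel parameter, which matches the remark above about estimating it from the data before refining it.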

Remark 2: the SVM can also be applied to regression problems (support vector regression, SVR).
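For completeness, a short sketch of SVR with scikit-learn (synthetic data; `C` and `epsilon` values are illustrative): it fits an epsilon-insensitive tube around the targets, and points outside the tube become the support vectors.

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 5, size=(80, 1)), axis=0)
y = np.sin(X).ravel()

# Support vector regression with an RBF kernel: errors smaller than
# epsilon incur no loss, so the fit is sparse in the training points.
reg = SVR(kernel="rbf", C=10.0, epsilon=0.1).fit(X, y)
print(round(reg.score(X, y), 3))  # R^2 on the training data
```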