SVM and Neural Networks




I understand that SVM training finds a global optimum, whereas neural network training can end up in one of multiple local optima. Can anyone please elaborate on this?

Also, how do these two kinds of optima turn into disadvantages for SVMs and neural networks?




Let me give a brief description of what local and global optima are, and then explain how SVMs and neural networks perform optimization.

The image above gives you a glimpse of what an optimization (error) surface looks like in real life. For simplicity it is drawn as a 3-D image, but in reality it is high-dimensional and can't be visualized through classical techniques. You can see that it has multiple "hills" and "valleys" of different sizes and depths. In this image, the lowermost valley (the one with the greatest depth) is the global optimum, while all the other valleys are local optima.
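To make the "different valleys" idea concrete, here is a minimal sketch (using a made-up 1-D function, not anything taken from the image) showing that plain gradient descent, the workhorse behind neural network training, settles in different valleys depending on where it starts:

```python
import numpy as np

# A toy non-convex "error surface" with several valleys (hypothetical function).
def f(x):
    return np.sin(3 * x) + 0.1 * x ** 2

def grad_f(x):
    return 3 * np.cos(3 * x) + 0.2 * x

def gradient_descent(x0, lr=0.01, steps=500):
    """Plain gradient descent; which valley it ends up in depends on x0."""
    x = x0
    for _ in range(steps):
        x -= lr * grad_f(x)
    return x

# Different starting points converge to different (mostly local) minima.
for start in (-3.0, -1.0, 0.5, 2.5):
    x_min = gradient_descent(start)
    print(f"start={start:+.1f} -> ends at x={x_min:+.3f}, f(x)={f(x_min):+.3f}")
```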

Now, what SVMs and neural networks do is transform the data in such a way that it becomes easier for them to do the job they are supposed to do. For example,

In the given image, if you convert the coordinates from Cartesian to polar, you can easily separate the classes into two regions. For this transformation, an SVM uses kernels, whereas a neural network uses its individual neurons.
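As a rough illustration of that Cartesian-to-polar idea (the dataset below is a stand-in generated with scikit-learn's make_circles, not the data from the image): two concentric classes cannot be split by a line in Cartesian coordinates, but become trivially separable once each point is expressed by its radius, and an RBF kernel achieves much the same effect implicitly:

```python
import numpy as np
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Two concentric classes in Cartesian coordinates: not linearly separable.
X, y = make_circles(n_samples=300, factor=0.4, noise=0.05, random_state=0)

# Hand-crafted "polar" transformation: radius and angle. In polar coordinates
# the classes differ only in radius, so a linear boundary is enough.
X_polar = np.column_stack([np.hypot(X[:, 0], X[:, 1]),
                           np.arctan2(X[:, 1], X[:, 0])])

linear_cartesian = SVC(kernel="linear").fit(X, y)
linear_polar = SVC(kernel="linear").fit(X_polar, y)
rbf_cartesian = SVC(kernel="rbf").fit(X, y)   # kernel does the transform implicitly

print("linear SVM, Cartesian:", linear_cartesian.score(X, y))    # poor
print("linear SVM, polar:    ", linear_polar.score(X_polar, y))  # ~1.0
print("RBF-kernel SVM:       ", rbf_cartesian.score(X, y))       # ~1.0
```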

Now, you have to understand that the number of optima remaining after transforming the data depends solely on how you have set up your algorithm, i.e. the type of kernel you use in an SVM, or the number of neurons in the hidden layer of a neural network. But it should be clear that both of them can still leave you with local optima.
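One way to see how the setup matters in practice (a small sketch on a synthetic two-moons dataset, chosen here only for illustration): once the kernel is fixed, retraining an SVM gives the same solution every time, whereas a small neural network's result depends on its random initialisation, i.e. on which local optimum it settles in:

```python
from sklearn.datasets import make_moons
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier

X, y = make_moons(n_samples=400, noise=0.25, random_state=0)

# With a fixed kernel and C, SVM training is a convex problem:
# repeated runs reach the same solution.
for run in range(3):
    svm = SVC(kernel="rbf", C=1.0).fit(X, y)
    print("SVM run", run, "accuracy:", svm.score(X, y))

# A small neural network is non-convex: different random initialisations
# (random_state) can settle in different local optima.
for seed in (0, 1, 2):
    mlp = MLPClassifier(hidden_layer_sizes=(5,), max_iter=2000,
                        random_state=seed).fit(X, y)
    print("MLP seed", seed, "accuracy:", mlp.score(X, y))
```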

In a practical sense, you will find that kernels are more easily tuned to reach the global optimum, whereas neural networks have many more hyperparameters and are harder to tune.
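To give a rough sense of that difference (the parameter grid below is illustrative, not a recommendation): tuning an RBF-kernel SVM mostly means searching over two knobs, C and gamma, whereas a neural network adds architecture, learning rate, regularisation, and initialisation on top of that:

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = make_moons(n_samples=400, noise=0.25, random_state=0)

# For an RBF-kernel SVM the search space is essentially two knobs: C and gamma.
svm_search = GridSearchCV(
    SVC(kernel="rbf"),
    param_grid={"C": [0.1, 1, 10], "gamma": [0.01, 0.1, 1]},
    cv=5,
)
svm_search.fit(X, y)
print("best SVM params:", svm_search.best_params_,
      "CV accuracy:", round(svm_search.best_score_, 3))

# A comparable search for a neural network would also have to cover
# hidden_layer_sizes, learning_rate_init, alpha (regularisation),
# max_iter, and the random initialisation, which is why it is harder to tune.
```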
