History of SVM
For roughly 20 years, SVMs outperformed the class of models that is now the cutting edge: deep learning methods. The history of deep learning is not one of progressively stronger models, but one of progressively more sophisticated training techniques for a model that remained almost unchanged for nearly 40 years. Deep learning has since surpassed SVMs in power, but it is a major testament to SVMs that, nearly thirty years ago, they were approximately as powerful as today’s cutting edge.
The SVM is an extremely efficient model that uses a clever mathematical trick to significantly improve upon the perceptron. SVMs take full advantage of the assumption that there are only two classes: the SVM training problem was designed to exploit useful mathematical facts that hold only for binary problems. In this sense, the SVM can be seen as essentially ‘solving’ the problem of maximizing accuracy for two-class problems.
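As a concrete illustration of the improvement over the perceptron, the sketch below fits both models to a toy two-class dataset. This is a minimal sketch assuming scikit-learn is available; the dataset and all parameter choices are illustrative, not from the text.

```python
# Illustrative sketch (assumes scikit-learn): a perceptron and a linear SVM
# on a synthetic two-class problem. Both learn a separating hyperplane; the
# SVM additionally chooses the hyperplane with the largest margin to the
# nearest points of each class.
from sklearn.datasets import make_blobs
from sklearn.linear_model import Perceptron
from sklearn.svm import LinearSVC

# Hypothetical toy data: two Gaussian blobs, one per class.
X, y = make_blobs(n_samples=200, centers=2, random_state=0)

perc = Perceptron(random_state=0).fit(X, y)
svm = LinearSVC(C=1.0, random_state=0).fit(X, y)

print("perceptron accuracy:", perc.score(X, y))
print("linear SVM accuracy:", svm.score(X, y))
```

On data like this, both models separate the classes, but the SVM's max-margin hyperplane tends to generalize better to new points near the class boundary.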
An SVM can be defined in two basic forms, called the primal and dual forms, each with its own advantages and disadvantages. The primal form is older and easier to use, but less powerful. The dual form requires more computational effort to solve, and forms the basis for the kernel SVM. The kernel SVM is an extension that is the most commonly used variation of the SVM, and is one of the biggest and most important successes in machine learning.
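The practical payoff of the dual form can be sketched briefly: because training is expressed in terms of dot products between points, those dot products can be swapped for a kernel function, letting the SVM fit non-linear boundaries. The sketch below assumes scikit-learn; the concentric-circles dataset and parameters are illustrative choices, not from the text.

```python
# Hedged sketch (assumes scikit-learn): the dual formulation lets a kernel
# replace the dot product, so the SVM can fit non-linear boundaries without
# an explicit feature mapping.
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Hypothetical toy data: two concentric rings, which no straight line separates.
X, y = make_circles(n_samples=200, noise=0.1, factor=0.3, random_state=0)

linear = SVC(kernel="linear").fit(X, y)  # plain linear boundary
rbf = SVC(kernel="rbf").fit(X, y)        # kernel SVM via the dual form

print("linear kernel accuracy:", linear.score(X, y))
print("RBF kernel accuracy:", rbf.score(X, y))
```

Because the rings are not linearly separable, the RBF-kernel SVM fits the training data far better than the linear one, which is exactly the power the dual form buys.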