Abstract
The kernel method transforms a non-linearly separable problem into a linearly separable problem by mapping low dimensional feature space to a high dimensional feature space. It is used in many algorithms, but the best known member is the support vector machines (SVM). The advantage of kernel method is the cheaper computational expense comparing to computing coordinates of the data explicitly.
It's a multi-part series in which I am planning to cover the following:
- What is kernel SVM?
- Feature Space Mapping
- How kernel tricks work
- Dual Language formulation for kernels
- Common kernels
- Mercer’s Theorem
- Compare different kernels
- Reference