
[Theory] [Detection and Pattern Recognition] ch4

Supervised Learning

Template Matching

Nearest Mean

Note:

  • with the Mahalanobis distance, the classifier is invariant to invertible linear feature transforms (e.g. rescaling of individual features)
  • multi-class classifier
  • fails on datasets that are not linearly separable.
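
A minimal sketch of a nearest-mean classifier with the Mahalanobis distance (NumPy only; using a single pooled covariance estimated from all training data is an assumption made here, not something fixed by the notes):

```python
import numpy as np

def fit_nearest_mean(X, y):
    """Estimate one mean per class and a single pooled covariance for the Mahalanobis distance."""
    classes = np.unique(y)
    means = {c: X[y == c].mean(axis=0) for c in classes}
    cov_inv = np.linalg.inv(np.cov(X, rowvar=False))  # pooled covariance (an assumption, see above)
    return classes, means, cov_inv

def predict_nearest_mean(x, classes, means, cov_inv):
    """Assign x to the class whose mean is closest in Mahalanobis distance."""
    dist = lambda mu: (x - mu) @ cov_inv @ (x - mu)
    return min(classes, key=lambda c: dist(means[c]))
```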

K-Nearest Neighbor

Note:

  • simple and very robust
  • k = 1 fits the training set perfectly and therefore overfits
  • multi-class classifier
  • nearest-neighbor search is very expensive; special data structures such as k-d trees or octrees are needed to speed it up (see the sketch below).
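
A minimal k-NN sketch using a k-d tree for the neighbor search (SciPy's cKDTree is my choice of data structure; labels are assumed to be integers 0..C-1):

```python
import numpy as np
from scipy.spatial import cKDTree   # k-d tree: avoids brute-force neighbor search

def knn_predict(X_train, y_train, X_test, k=5):
    """Classify each test point by majority vote among its k nearest training points."""
    tree = cKDTree(X_train)
    _, idx = tree.query(X_test, k=k)             # indices of the k nearest neighbors
    idx = idx.reshape(len(X_test), -1)           # keep shape (n_test, k) even when k == 1
    votes = y_train[idx]                         # neighbor labels, shape (n_test, k)
    return np.array([np.bincount(row).argmax() for row in votes])  # majority vote per row
```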

Bayes plug-in

Gaussian Classifier

Note:

  • reduces to the nearest-mean classifier if all classes share the same covariance C (and equal priors); the decision then depends only on the Mahalanobis distance to each class mean $\mu_j$
  • needs a lot of training data, as a rule of thumb around 10 times the feature-vector length d
  • fails on datasets that are not linearly separable.
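
A sketch of the plug-in rule: estimate a mean, a full covariance and a prior per class, then pick the class with the highest log posterior (SciPy's multivariate_normal is used here for the Gaussian density; that library choice is mine):

```python
import numpy as np
from scipy.stats import multivariate_normal

def fit_gaussian_classifier(X, y):
    """Plug-in Bayes classifier: one Gaussian (mean, full covariance) plus a prior per class."""
    params = {}
    for c in np.unique(y):
        Xc = X[y == c]
        params[c] = (Xc.mean(axis=0), np.cov(Xc, rowvar=False), len(Xc) / len(X))
    return params

def predict_gaussian_classifier(x, params):
    """Pick the class maximizing log likelihood + log prior."""
    def log_posterior(c):
        mu, cov, prior = params[c]
        return multivariate_normal.logpdf(x, mean=mu, cov=cov) + np.log(prior)
    return max(params, key=log_posterior)
```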

Naive Gaussian

Assume the covariance matrix C is diagonal; this reduces the training time and the amount of training data required.
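
The only change relative to the plug-in fit above is the covariance estimate: per-feature variances instead of the full matrix, so d parameters per class instead of d(d+1)/2. A minimal sketch:

```python
import numpy as np

def fit_naive_gaussian(X, y):
    """Like fit_gaussian_classifier above, but with a diagonal covariance per class."""
    params = {}
    for c in np.unique(y):
        Xc = X[y == c]
        params[c] = (Xc.mean(axis=0), np.diag(Xc.var(axis=0)), len(Xc) / len(X))
    return params
# prediction works unchanged with predict_gaussian_classifier from the previous sketch
```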

Gaussian Mixture Model (GMM)

Note:

  • versatile, can approximate many real-life distributions
  • sensitive to the choice or estimation of the model orders $M_j$
  • nonconvex optimization: EM may converge to a local optimum and is sensitive to its initialization
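
A sketch of a Bayes plug-in classifier with one GMM per class, fitted with EM via scikit-learn's GaussianMixture (the library and the restart strategy are my choices; n_components plays the role of the model order $M_j$):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def fit_gmm_classifier(X, y, n_components=3):
    """Fit one GMM per class with EM; several restarts (n_init) guard against bad initializations."""
    models, priors = {}, {}
    for c in np.unique(y):
        Xc = X[y == c]
        models[c] = GaussianMixture(n_components=n_components, n_init=5, random_state=0).fit(Xc)
        priors[c] = len(Xc) / len(X)
    return models, priors

def predict_gmm_classifier(X_test, models, priors):
    """Choose the class with the highest GMM log-likelihood plus log prior."""
    classes = sorted(models)
    scores = np.column_stack([models[c].score_samples(X_test) + np.log(priors[c]) for c in classes])
    return np.array(classes)[scores.argmax(axis=1)]
```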

Discriminant Function

This approach can be seen as modeling the posterior directly, without estimating the likelihood.

Neural Network

see the Deep Learning course

Support Vector Machine

basic ideas:

  • binary classifier
  • use a linear discriminant function

new ideas:

  • non-linear feature mapping (kernel functions)
  • maximum margin (instead of least squares)
  • convex optimization

Hard margin SVM:

Note:

  • the dataset must be linearly separable
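
For reference, the standard hard-margin primal problem (with the usual notation: training pairs $(x_i, y_i)$, $y_i \in \{-1, +1\}$): maximizing the margin $2/\lVert w \rVert$ is equivalent to

$$
\min_{w,\,b}\ \tfrac{1}{2}\lVert w\rVert^2 \quad \text{s.t.} \quad y_i\,(w^\top x_i + b) \ge 1,\quad i = 1,\dots,N.
$$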

Soft margin SVM:

Note:

  • can solve a significantly larger set of problems
  • sensitive to the choice of the hyperparameters $\gamma$ and C, which have to be optimized (e.g. by a grid search, as sketched below)
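
The soft margin relaxes the constraints with slack variables $\xi_i \ge 0$, $y_i(w^\top x_i + b) \ge 1 - \xi_i$, and penalizes the total slack with the weight C. A minimal hyperparameter search for C and $\gamma$ with an RBF kernel, using scikit-learn and a toy dataset (both are illustrative choices, not part of the notes):

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Toy non-linearly-separable data standing in for a real dataset.
X, y = make_moons(n_samples=200, noise=0.2, random_state=0)

# Soft-margin SVM with RBF kernel; C and gamma are optimized by cross-validated grid search.
param_grid = {"C": [0.1, 1, 10, 100], "gamma": [0.01, 0.1, 1, 10]}
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5).fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```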

Multi-class SVM:

  1. one against one

    • need to train $\tbinom{c}{2}$ SVMs, one for each pair of classes
    • the class with the most pairwise wins is chosen (see the sketch after this list)
  2. one against the rest

    • need to train c SVMs
    • choose the class with the highest f(x)
  3. hierarchical
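
Both the one-against-one and the one-against-the-rest strategies are available as off-the-shelf wrappers, e.g. in scikit-learn (the library and the Iris toy data are illustrative choices):

```python
from sklearn.datasets import load_iris
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)                  # c = 3 classes

# One against one: c*(c-1)/2 = 3 binary SVMs, prediction by voting.
ovo = OneVsOneClassifier(SVC(kernel="rbf")).fit(X, y)

# One against the rest: c = 3 binary SVMs, prediction by the largest decision value f(x).
ovr = OneVsRestClassifier(SVC(kernel="rbf")).fit(X, y)

print(ovo.predict(X[:5]), ovr.predict(X[:5]))
```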

Validation

k-fold cross validation: split the data into k folds and use (k − 2)/1/1 of them for training/validation/test
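
A minimal sketch of this split scheme (which fold serves as validation and which as test is an arbitrary choice here):

```python
import numpy as np

def kfold_splits(n_samples, k, seed=0):
    """Yield (train, val, test) index arrays: (k-2) folds for training, 1 for validation, 1 for test."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), k)
    for i in range(k):
        test = folds[i]
        val = folds[(i + 1) % k]          # the next fold plays the validation role
        train = np.concatenate([f for j, f in enumerate(folds) if j not in (i, (i + 1) % k)])
        yield train, val, test

# Example: 100 samples, k = 5 -> 60/20/20 train/validation/test per rotation.
for train, val, test in kfold_splits(100, 5):
    print(len(train), len(val), len(test))
```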