Least squares problem

$$ \min_{\beta} \sum_{i=1}^{n}\left(y_{i}-x_{i}^{\top} \beta\right)^{2} $$
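
As a rough illustration of the objective above, the following sketch solves a small least squares problem with NumPy. The helper name `least_squares` and the toy data are assumptions made for this example; they do not come from the notes.

```python
import numpy as np

def least_squares(X, y):
    """Return beta minimizing sum_i (y_i - x_i^T beta)^2."""
    # lstsq returns (solution, residuals, rank, singular values); keep the solution.
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

# Hypothetical data lying exactly on the line y = 1 + 2*x.
x = np.array([0.0, 1.0, 2.0, 3.0])
# Each row x_i is (1, x): the column of ones models the intercept.
X = np.column_stack([np.ones_like(x), x])
y = 1.0 + 2.0 * x

beta = least_squares(X, y)  # approximately [1, 2] up to floating-point error
```

Because the toy data are noise-free, the fitted coefficients should match the generating intercept and slope; with noisy data the same call returns the best fit in the squared-error sense.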

Least absolute deviation

$$ \min_{\beta} \sum_{i=1}^{n}\left|y_{i}-x_{i}^{\top} \beta\right| $$
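
To see how the two objectives differ, consider the special case where every $x_i = 1$, so $\beta$ is a single number. The squared loss is then minimized by the mean of the $y_i$ and the absolute loss by a median. The sketch below checks this by brute force on a grid; the data, the grid, and the function names are assumptions chosen for illustration.

```python
from statistics import mean, median

def squared_loss(y, b):
    """Least squares objective with x_i = 1 for all i."""
    return sum((yi - b) ** 2 for yi in y)

def absolute_loss(y, b):
    """Least absolute deviation objective with x_i = 1 for all i."""
    return sum(abs(yi - b) for yi in y)

# Hypothetical data with one large outlier.
y = [1.0, 2.0, 3.0, 4.0, 100.0]

# Candidate values for beta on a grid from 0.0 to 100.0 in steps of 0.1.
candidates = [k / 10 for k in range(0, 1001)]

best_ls = min(candidates, key=lambda b: squared_loss(y, b))
best_lad = min(candidates, key=lambda b: absolute_loss(y, b))

print(best_ls, mean(y))     # least squares minimizer vs. mean
print(best_lad, median(y))  # LAD minimizer vs. median
```

The outlier pulls the least squares solution far from the bulk of the data, while the least absolute deviation solution stays at the median. This is one reason the absolute loss is often described as more robust to outliers.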

Regularized least squares
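
One common way to write this problem, assuming a generic penalty $P(\beta)$ and a tuning parameter $\lambda \geq 0$ (both symbols are assumptions, not notation from these notes), adds the penalty to the least squares objective:

$$ \min_{\beta} \sum_{i=1}^{n}\left(y_{i}-x_{i}^{\top} \beta\right)^{2}+\lambda P(\beta) $$

For example, $P(\beta)=\|\beta\|_{2}^{2}$ gives ridge regression and $P(\beta)=\|\beta\|_{1}$ gives the lasso.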

Norm

We want the solution $\beta$ to be sparse, i.e. to have a small $\|\beta\|_{0}$, meaning few non-zero coefficients.
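
The quantity ∥β∥0 simply counts the non-zero entries of β. Despite the notation, it is not a true norm, because scaling β by a non-zero constant leaves the count unchanged. A minimal sketch follows; the function name `l0_norm` and the optional tolerance `tol` are assumptions for illustration.

```python
def l0_norm(beta, tol=0.0):
    """Count the entries of beta whose magnitude exceeds tol."""
    return sum(1 for b in beta if abs(b) > tol)

# Hypothetical coefficient vector with two non-zero entries.
beta = [0.0, 1.5, 0.0, -2.0]
nonzeros = l0_norm(beta)

# Scaling beta does not change the count, unlike a true norm.
scaled_nonzeros = l0_norm([10 * b for b in beta])
```

In numerical work a small positive `tol` is often used so that coefficients that are zero only up to rounding error are not counted.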

Why? Often only a few features are relevant, and a sparse model requires correspondingly few data points, ...