- Introduction to Machine Learning
- Linear Models (Regression Analysis)
- Regularization
- Support Vector Machine
- Classification Trees
- Bagging and Random Forest
- Boosting and Gradient Boosted Decision Trees
- Neural Networks
Wush Wu
National Taiwan University
...
| y | x | z |
|---|---|---|
| -3.449232 | -0.6264538 | 0.3981059 |
| 1.620552 | 0.1836433 | -0.6120264 |
| -3.757597 | -0.8356286 | 0.3411197 |
| 3.224672 | 1.5952808 | -1.1293631 |
| 4.443377 | 0.3295078 | 1.4330237 |
| -1.837960 | -0.8204684 | 1.9803999 |
| 3.248735 | 0.4874291 | -0.3672215 |
| 2.965666 | 0.7383247 | -1.0441346 |
| 4.317580 | 0.5757814 | 0.5697196 |
| -2.939782 | -0.3053884 | -0.1350546 |
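A table like the one above can be simulated with a hypothetical data-generating process in which y depends linearly on x while z is independent noise. The coefficients and seed below are illustrative assumptions, not the values used to generate the slide's data:

```python
import numpy as np

# Hypothetical simulation: y is a linear function of x plus noise,
# while z is drawn independently and carries no information about y.
rng = np.random.default_rng(0)
n = 10
x = rng.normal(size=n)
z = rng.normal(size=n)  # independent of y by construction
y = 1.0 + 4.0 * x + rng.normal(scale=0.5, size=n)

# The correlation with y should be strong for x and weak for z.
corr_xy = np.corrcoef(x, y)[0, 1]
corr_zy = np.corrcoef(z, y)[0, 1]
print(corr_xy, corr_zy)
```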
The relationship between x and y (scatter plots of x vs. y):

- y alone => predict y with a constant
- x and y => predict y with f(x)
- A more complex f does not necessarily give better results:
    - If f(x) is not complex enough: lack of fit
    - If f(x) is too complex: overfitting

The relationship between z and y (plots of z vs. y):

- z does not contain much information about y
- The plots of z vs. y (and of x, z, and y together) make this visible
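The lack-of-fit vs. overfitting trade-off above can be sketched by fitting polynomials of increasing degree to noisy data from a quadratic truth and comparing training error with error on fresh test data. The data, seed, and degrees are illustrative assumptions, not the slide's actual example:

```python
import numpy as np

# Underfitting vs. overfitting: training error always shrinks as the
# model gets more complex, but test error does not.
rng = np.random.default_rng(42)

def make_data(n):
    x = rng.uniform(-2, 2, size=n)
    y = 1.0 + 2.0 * x - 1.5 * x**2 + rng.normal(scale=0.5, size=n)
    return x, y

x_train, y_train = make_data(30)
x_test, y_test = make_data(1000)

def mse(deg):
    """Train/test mean squared error of a degree-`deg` polynomial fit."""
    coefs = np.polyfit(x_train, y_train, deg)
    train = np.mean((np.polyval(coefs, x_train) - y_train) ** 2)
    test = np.mean((np.polyval(coefs, x_test) - y_test) ** 2)
    return train, test

for deg in (1, 2, 9):
    print(deg, mse(deg))  # degree 1 underfits; degree 9 typically overfits
```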
|                     | Dependent variable: dist    |
|---------------------|-----------------------------|
| speed               | 3.932*** (0.416)            |
| Constant            | -17.579** (6.758)           |
| Observations        | 50                          |
| R2                  | 0.651                       |
| Adjusted R2         | 0.644                       |
| Residual Std. Error | 15.380 (df = 48)            |
| F Statistic         | 89.567*** (df = 1; 48)      |
| Note:               | *p<0.1; **p<0.05; ***p<0.01 |
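The table above summarizes an R fit of `lm(dist ~ speed)`. The same ordinary-least-squares estimate can be computed directly from the normal equations; the (speed, dist) values below are a small hypothetical sample, not the real dataset behind the table:

```python
import numpy as np

# Minimal OLS sketch: regress stopping distance on speed.
speed = np.array([4, 7, 8, 9, 10, 12, 13, 14, 15, 17, 18, 20, 22, 24], float)
dist = np.array([2, 4, 16, 10, 18, 24, 34, 26, 54, 40, 76, 52, 66, 93], float)

# Design matrix with an intercept column.
X = np.column_stack([np.ones_like(speed), speed])
beta, *_ = np.linalg.lstsq(X, dist, rcond=None)
intercept, slope = beta

# R^2 = 1 - SSE / SST
resid = dist - X @ beta
r2 = 1 - resid @ resid / np.sum((dist - dist.mean()) ** 2)
print(intercept, slope, r2)
```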
lm and glm
Optional-RMachineLearning-01-Linear-Model
Optional-RMachineLearning-02-Generalized-Linear-Model
[[1]]
[1] 25 32 46 67 22 66 74 51 49 7 22 20 52 33 57 41 54 87 33 57
[[2]]
[1] 72 22 50 17 26 33 1 33 63 30 39 47 40 21 60 51 58 15 54 35
[[3]]
[1] 60 50 57 44 43 58 2 39 55 52 39 62 37 24 9 14 29 42 51 34
[1] 25 32 46 67 22 66 74 51 49 7 22 20 52 33 57 41 54 87 33 57
=> [1] 44.75
[1] 72 22 50 17 26 33 1 33 63 30 39 47 40 21 60 51 58 15 54 35
=> [1] 38.35
[1] 60 50 57 44 43 58 2 39 55 52 39 62 37 24 9 14 29 42 51 34
=> [1] 40.05
[1] 68 27 38 30 50 25 39 56 11 63 30 61 31 30 39 65 63 33 57 77
=> [1] 44.65
[1] 36 54 34 29 56 22 53 17 24 18 24 7 50 63 57 58 38 35 59 48
=> [1] 39.1
[1] 50 31 26 88 49 22 18 39 70 47 81 55 31 36 19 1 54 14 37 50
=> [1] 40.9
[1] 87 40 40 20 56 38 42 22 23 47 46 10 3 50 71 47 45 43 84 41
=> [1] 42.75
[1] 52 47 24 25 55 37 20 55 15 63 48 45 30 37 41 21 43 10 26 22
=> [1] 35.8
- Sample mean - population mean (the bias)
- The variance of the sample means (the variance)
- Bias^2 + Variance = MSE
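The decomposition above can be checked numerically: draw many samples, use the sample mean as the estimator, and verify that MSE = Bias^2 + Variance holds for the collection of estimates. The population (uniform on [0, 80]) and the sample sizes are assumptions for illustration:

```python
import numpy as np

# Repeatedly estimate a known population mean with the sample mean,
# then decompose the mean squared error of the estimates.
rng = np.random.default_rng(1)
population_mean = 40.0  # mean of Uniform(0, 80)
estimates = np.array([
    rng.uniform(0, 80, size=20).mean() for _ in range(10_000)
])

bias = estimates.mean() - population_mean
variance = estimates.var()  # variance of the sample means
mse = np.mean((estimates - population_mean) ** 2)
print(bias, variance, mse)
```

For the simulated estimates the identity `bias**2 + variance == mse` holds exactly, since it is just the usual variance decomposition around the population mean.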
\[dist = \beta_0 + \beta_1 speed + \beta_2 speed^2 + \beta_3 speed^3\]
\[\min_\beta \sum_i \left(dist_i - \beta_0 - \beta_1 speed_i - \beta_2 speed_i^2 - \beta_3 speed_i^3\right)^2 + 10 \left(\beta_1^2 + \beta_2^2 + \beta_3^2\right)\]
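A penalized least-squares fit of this kind (sum of squared errors plus 10 times the squared slope coefficients, intercept unpenalized) has the closed form beta = (X'X + P)^(-1) X'y with a diagonal penalty matrix P. The sketch below uses hypothetical data with a standardized predictor for numerical stability:

```python
import numpy as np

# Ridge-style regularization of a cubic polynomial fit.
rng = np.random.default_rng(7)
x = rng.uniform(-1, 1, size=50)  # standardized stand-in for "speed"
y = 1.0 + 3.0 * x - 2.0 * x**2 + 0.5 * x**3 + rng.normal(scale=0.3, size=50)

X = np.column_stack([np.ones_like(x), x, x**2, x**3])
P = 10.0 * np.diag([0.0, 1.0, 1.0, 1.0])  # penalize slopes, not intercept

beta_ols = np.linalg.solve(X.T @ X, X.T @ y)
beta_ridge = np.linalg.solve(X.T @ X + P, X.T @ y)
print(beta_ols)
print(beta_ridge)
```

Because the penalized solution trades a slightly larger squared error for a smaller penalty term, its beta_1^2 + beta_2^2 + beta_3^2 can never exceed that of the unpenalized fit.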
Optional-RMachineLearning-03-Regularization
\[l(y, f(x)) = (y - f(x))^2\]
\[l(y, f(x)) = \left\{\begin{array}{lc} 0 & \text{ if } \left\lVert y - f(x) \right\rVert < \varepsilon \\ \left\lVert y - f(x) \right\rVert - \varepsilon & \text{ otherwise } \end{array}\right.\]
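The two loss functions above can be implemented directly: squared error penalizes every deviation, while the epsilon-insensitive loss (used by support vector regression) costs nothing inside the epsilon tube and grows linearly outside it. The epsilon value below is an illustrative choice:

```python
import numpy as np

def squared_loss(y, fx):
    # l(y, f(x)) = (y - f(x))^2
    return (y - fx) ** 2

def eps_insensitive_loss(y, fx, eps=0.5):
    # Zero inside the epsilon tube, |y - f(x)| - eps outside it.
    return np.maximum(np.abs(y - fx) - eps, 0.0)

y = np.array([1.0, 1.0, 1.0])
fx = np.array([1.2, 1.0, 3.0])
print(squared_loss(y, fx))          # small errors are still penalized
print(eps_insensitive_loss(y, fx))  # errors within eps cost nothing
```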
Optional-RMachineLearning-04-Support-Vector-Machine
Optional-RMachineLearning-05-Decision-Tree
https://citizennet.com/blog/2012/11/10/random-forests-ensembles-and-performance-metrics/
Optional-RMachineLearning-06-Gradient-Boosted-Decision-Tree