In each iteration, feature weights are updated on the basis of the [https://en.wikipedia.org/wiki/Partial_derivative partial derivatives] of the objective function.
[[FILE:mmtoPartialDifferentation1.jpg|none|text-bottom]]
The J<sub>R</sub> derivative is treated in an intuitive manner: [https://en.wikipedia.org/wiki/Sign_function sgn](ω<sub>i</sub>)λ<sub>1</sub> for ω<sub>i</sub> ∈ ω′, and 0 otherwise.
The partial derivative of the [https://en.wikipedia.org/wiki/Constraint_(mathematics) constraint] term J<sub>C</sub> is 0 for ω<sub>i</sub> ∉ ω′.
Otherwise, the [https://en.wikipedia.org/wiki/Lagrange_multiplier Lagrange multiplier] λ<sub>0</sub> is set to the [https://en.wikipedia.org/wiki/M-estimator#Median median] of the partial derivatives
in order to maintain the constraint g(ω) = 0 in each iteration. As a result, '''∆ω′<sub>i</sub>''' is '''h''' for '''n''' feature weights, '''−h''' for '''n''' feature weights, and 0 for one feature weight, where the number of feature weights in ω′ is 2n + 1.
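A minimal sketch of this balanced update step, assuming NumPy; the function name <code>mmto_update</code> and the array-based interface are illustrative, not from the original source. Subtracting the median partial derivative λ<sub>0</sub> before taking the sign yields n positive, n negative, and one zero adjusted derivative, so the signed updates of size h cancel and g(ω) = 0 is preserved:

```python
import numpy as np

def mmto_update(grads, h):
    """Sign-based update for the 2n+1 feature weights in ω′.

    grads: partial derivatives of the objective J for weights in ω′
           (odd length, assumed distinct for an exact n/n/1 split)
    h:     fixed step size
    """
    lam0 = np.median(grads)       # Lagrange multiplier λ0 = median of the partials
    adjusted = grads - lam0       # n positive, n negative, one exactly zero
    return -h * np.sign(adjusted) # ∆ω′_i ∈ {−h, 0, +h}; updates sum to zero
```

With distinct partial derivatives, the returned vector always sums to zero, which is what keeps the constraint g(ω) = 0 satisfied after every iteration.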
Since the objective function with the minimax values s(p, ω) is not always [https://en.wikipedia.org/wiki/Differentiable_function differentiable],