# Struct argmin::solver::quasinewton::LBFGS

`pub struct LBFGS<L, P, G, F> { /* private fields */ }`


## Limited-memory BFGS (L-BFGS) method

L-BFGS is an approximation to BFGS which requires only a limited amount of memory. Instead of storing the inverse Hessian approximation explicitly, only a few vectors which implicitly represent it are stored.

It requires a line search and the number of vectors to be stored (history size `m`) must be set. Additionally, an initial guess for the parameter vector is required, which is to be provided via the `configure` method of the `Executor` (see `IterState`, in particular `IterState::param`). In the same way, the initial gradient and cost corresponding to the initial parameter vector can be provided. If these are not provided, they will be computed during initialization of the algorithm.
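The core of the method is the classic two-loop recursion (Nocedal/Wright, Algorithm 7.4), which reconstructs the action of the inverse Hessian approximation from the stored history. The following is a self-contained sketch in plain Rust, not argmin's actual implementation; the name `two_loop` and the `Vec<f64>`-based representation are illustrative assumptions.

```rust
// Sketch of the L-BFGS two-loop recursion: instead of an explicit inverse
// Hessian, the last m pairs (s_k, y_k) of parameter and gradient differences
// implicitly represent it.

fn dot(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Computes H_k * grad from the stored history; -result is the search direction.
fn two_loop(grad: &[f64], s_hist: &[Vec<f64>], y_hist: &[Vec<f64>]) -> Vec<f64> {
    let m = s_hist.len();
    let mut q: Vec<f64> = grad.to_vec();
    let mut alpha = vec![0.0; m];
    // First loop: newest to oldest.
    for i in (0..m).rev() {
        let rho = 1.0 / dot(&y_hist[i], &s_hist[i]);
        alpha[i] = rho * dot(&s_hist[i], &q);
        for (qj, yj) in q.iter_mut().zip(&y_hist[i]) {
            *qj -= alpha[i] * yj;
        }
    }
    // Initial Hessian scaling gamma = s'y / y'y from the newest pair.
    if m > 0 {
        let gamma = dot(&s_hist[m - 1], &y_hist[m - 1]) / dot(&y_hist[m - 1], &y_hist[m - 1]);
        for qj in q.iter_mut() {
            *qj *= gamma;
        }
    }
    // Second loop: oldest to newest.
    for i in 0..m {
        let rho = 1.0 / dot(&y_hist[i], &s_hist[i]);
        let beta = rho * dot(&y_hist[i], &q);
        for (qj, sj) in q.iter_mut().zip(&s_hist[i]) {
            *qj += (alpha[i] - beta) * sj;
        }
    }
    q
}

fn main() {
    // With an empty history the recursion reduces to the identity: H_0 = I.
    println!("{:?}", two_loop(&[1.0, 2.0], &[], &[]));
}
```

With an empty history the result equals the gradient (steepest descent); each stored pair refines the direction toward the quasi-Newton step.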

Two tolerances can be configured, both of which are used in the stopping criteria. One is a tolerance on the gradient (set with `with_tolerance_grad`): if the norm of the gradient is below said tolerance, the algorithm stops. It defaults to `sqrt(EPSILON)`. The other is a tolerance on the change of the cost function from one iteration to the next. If the change is below this tolerance (default: `EPSILON`), the algorithm stops. This parameter can be set via `with_tolerance_cost`.
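The two criteria can be sketched as follows; `should_stop` and its parameter names are hypothetical, not part of argmin's API, and serve only to illustrate the logic and the default values.

```rust
// Illustrative sketch of the two stopping criteria described above.
// Names are made up for this example; they are not argmin internals.
fn should_stop(grad_norm: f64, prev_cost: f64, cost: f64, tol_grad: f64, tol_cost: f64) -> bool {
    // Stop if the gradient norm is small, or if the cost barely changed.
    grad_norm < tol_grad || (prev_cost - cost).abs() < tol_cost
}

fn main() {
    let tol_grad = f64::EPSILON.sqrt(); // default of with_tolerance_grad
    let tol_cost = f64::EPSILON;        // default of with_tolerance_cost
    println!("{}", should_stop(1e-9, 3.0, 2.5, tol_grad, tol_cost));
}
```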

### Orthant-Wise Limited-memory Quasi-Newton (OWL-QN) method

OWL-QN is a method that adapts L-BFGS to L1-regularization. The original L-BFGS requires the loss function to be differentiable and therefore does not support L1-regularization. For this reason, this library switches to OWL-QN when L1-regularization is specified. L1-regularization can be enabled via `with_l1_regularization`.
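The key adaptation in OWL-QN is replacing the gradient with a pseudo-gradient of the L1-regularized objective F(x) = f(x) + c·‖x‖₁ (Andrew & Gao, 2007). A self-contained sketch, with the hypothetical helper `pseudo_gradient` standing in for the library's internal logic:

```rust
// Sketch of the OWL-QN pseudo-gradient. At x_i = 0 the L1 term is not
// differentiable, so the one-sided derivative that points downhill is
// chosen, or 0 if neither direction decreases F.
fn pseudo_gradient(x: &[f64], grad: &[f64], c: f64) -> Vec<f64> {
    x.iter()
        .zip(grad)
        .map(|(&xi, &gi)| {
            if xi > 0.0 {
                gi + c
            } else if xi < 0.0 {
                gi - c
            } else if gi + c < 0.0 {
                gi + c // right derivative negative: moving right decreases F
            } else if gi - c > 0.0 {
                gi - c // left derivative positive: moving left decreases F
            } else {
                0.0 // 0 is in the subdifferential: x_i = 0 is locally optimal
            }
        })
        .collect()
}

fn main() {
    println!("{:?}", pseudo_gradient(&[1.0, 0.0, -2.0], &[0.5, 3.0, -0.5], 1.0));
}
```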

TODO: Implement compact representation of BFGS updating (Nocedal/Wright p.230)

### Requirements on the optimization problem

The optimization problem is required to implement `CostFunction` and `Gradient`.
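As an illustration of the required shape, the following sketch implements cost and gradient for the Rosenbrock function. The trait definitions here are simplified stand-ins so the example is self-contained; argmin's actual `CostFunction` and `Gradient` traits use associated types and return `Result`.

```rust
// Simplified stand-ins mirroring the shape of argmin's traits.
trait CostFunction {
    fn cost(&self, p: &[f64]) -> f64;
}
trait Gradient {
    fn gradient(&self, p: &[f64]) -> Vec<f64>;
}

// Rosenbrock: f(x, y) = (a - x)^2 + b * (y - x^2)^2, minimum at (a, a^2).
struct Rosenbrock {
    a: f64,
    b: f64,
}

impl CostFunction for Rosenbrock {
    fn cost(&self, p: &[f64]) -> f64 {
        (self.a - p[0]).powi(2) + self.b * (p[1] - p[0] * p[0]).powi(2)
    }
}

impl Gradient for Rosenbrock {
    fn gradient(&self, p: &[f64]) -> Vec<f64> {
        vec![
            -2.0 * (self.a - p[0]) - 4.0 * self.b * p[0] * (p[1] - p[0] * p[0]),
            2.0 * self.b * (p[1] - p[0] * p[0]),
        ]
    }
}

fn main() {
    let problem = Rosenbrock { a: 1.0, b: 100.0 };
    // At the minimum (1, 1): cost 0, gradient (0, 0).
    println!("{}", problem.cost(&[1.0, 1.0]));
}
```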

### Reference

Jorge Nocedal and Stephen J. Wright (2006). Numerical Optimization. Springer. ISBN 0-387-30303-0.

Galen Andrew and Jianfeng Gao (2007). Scalable Training of L1-Regularized Log-Linear Models, International Conference on Machine Learning.

## Implementations

### impl<L, P, G, F> LBFGS<L, P, G, F> where
F: ArgminFloat,

#### pub fn with_tolerance_grad(self, tol_grad: F) -> Result<Self, Error>

The algorithm stops if the norm of the gradient is below `tol_grad`.

The provided value must be non-negative. Defaults to `sqrt(EPSILON)`.

##### Example

`let lbfgs: LBFGS<_, Vec<f64>, Vec<f64>, f64> = LBFGS::new(linesearch, 3).with_tolerance_grad(1e-6)?;`

#### pub fn with_tolerance_cost(self, tol_cost: F) -> Result<Self, Error>

Sets the tolerance for the stopping criterion based on the change of the cost function.

The provided value must be non-negative. Defaults to `EPSILON`.

##### Example

`let lbfgs: LBFGS<_, Vec<f64>, Vec<f64>, f64> = LBFGS::new(linesearch, 3).with_tolerance_cost(1e-6)?;`

#### pub fn with_l1_regularization(self, l1_coeff: F) -> Result<Self, Error>

Activates L1-regularization with coefficient `l1_coeff`.

Parameter `l1_coeff` must be `> 0.0`.

##### Example

`let lbfgs: LBFGS<_, Vec<f64>, Vec<f64>, f64> = LBFGS::new(linesearch, 3).with_l1_regularization(1.0)?;`

## Trait Implementations

### impl<'de, L, P, G, F> Deserialize<'de> for LBFGS<L, P, G, F>

#### fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error> where
__D: Deserializer<'de>,

### impl<O, L, P, G, F> Solver<O, IterState<P, G, (), (), (), F>> for LBFGS<L, P, G, F> where
O: CostFunction<Param = P, Output = F> + Gradient<Param = P, Gradient = G>,
P: Clone + ArgminSub<P, P> + ArgminSub<F, P> + ArgminAdd<P, P> + ArgminAdd<F, P> + ArgminDot<G, F> + ArgminMul<F, P> + ArgminMul<P, P> + ArgminMul<G, P> + ArgminL1Norm<F> + ArgminSignum + ArgminZeroLike + ArgminMinMax,
G: Clone + ArgminL2Norm<F> + ArgminSub<G, G> + ArgminAdd<G, G> + ArgminAdd<P, G> + ArgminDot<G, F> + ArgminDot<P, F> + ArgminMul<F, G> + ArgminMul<F, P> + ArgminZeroLike + ArgminMinMax,
L: Clone + LineSearch<P, F> + Solver<LineSearchProblem<O, P, G, F>, IterState<P, G, (), (), (), F>>,
F: ArgminFloat,

#### fn init(
&mut self,
problem: &mut Problem<O>,
state: IterState<P, G, (), (), (), F>
) -> Result<(IterState<P, G, (), (), (), F>, Option<KV>), Error>

#### fn next_iter(
&mut self,
problem: &mut Problem<O>,
state: IterState<P, G, (), (), (), F>
) -> Result<(IterState<P, G, (), (), (), F>, Option<KV>), Error>

Performs a single iteration and returns an updated `state` and optionally a `KV` which holds key-value pairs used in observers.

#### fn terminate(
&mut self,
state: &IterState<P, G, (), (), (), F>
) -> TerminationStatus

Used to implement stopping criteria beyond those checked in `terminate_internal`.

#### fn terminate_internal(&mut self, state: &I) -> TerminationStatus

## Auto Trait Implementations

### impl<L, P, G, F> RefUnwindSafe for LBFGS<L, P, G, F>

### impl<L, P, G, F> Send for LBFGS<L, P, G, F>

### impl<L, P, G, F> Sync for LBFGS<L, P, G, F>

### impl<L, P, G, F> Unpin for LBFGS<L, P, G, F>

### impl<L, P, G, F> UnwindSafe for LBFGS<L, P, G, F>

## Blanket Implementations

### impl<T> BorrowMut<T> for T where
T: ?Sized,

#### fn borrow_mut(&mut self) -> &mut T

### impl<SS, SP> SupersetOf<SS> for SP where
SS: SubsetOf<SP>,

#### fn to_subset(&self) -> Option<SS>

The inverse inclusion map: attempts to construct `self` from the equivalent element of its superset.

#### fn is_in_subset(&self) -> bool

Checks if `self` is actually part of its subset `T` (and can be converted to it).

#### fn to_subset_unchecked(&self) -> SS

Same as `self.to_subset` but without any property checks. Always succeeds.

#### fn from_subset(element: &SS) -> SP

The inclusion map: converts `self` to the equivalent element of its superset.