# Struct argmin::solver::quasinewton::LBFGS

`pub struct LBFGS<L, P, G, F> { /* private fields */ }`


## Limited-memory BFGS (L-BFGS) method

L-BFGS is an approximation to BFGS which requires only a limited amount of memory. Instead of storing the inverse Hessian approximation explicitly, only a few vectors which implicitly represent it are stored.

It requires a line search and the number of vectors to be stored (history size `m`) must be set. Additionally, an initial guess for the parameter vector is required, which is to be provided via the `configure` method of the `Executor` (see `IterState`, in particular `IterState::param`). In the same way, the initial gradient and cost corresponding to the initial parameter vector can be provided. If these are not provided, they will be computed during initialization of the algorithm.
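The core of the method is the classic two-loop recursion (Nocedal/Wright, Algorithm 7.4), which reconstructs the action of the inverse Hessian approximation from the stored history. The following is a self-contained sketch in plain Rust, not argmin's actual implementation; the name `two_loop` and the `Vec<f64>`-based representation are illustrative assumptions.

```rust
// Sketch of the L-BFGS two-loop recursion: instead of an explicit inverse
// Hessian, the last m pairs (s_k, y_k) of parameter and gradient differences
// implicitly represent it.

fn dot(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Computes H_k * grad from the stored history; -result is the search direction.
fn two_loop(grad: &[f64], s_hist: &[Vec<f64>], y_hist: &[Vec<f64>]) -> Vec<f64> {
    let m = s_hist.len();
    let mut q: Vec<f64> = grad.to_vec();
    let mut alpha = vec![0.0; m];
    // First loop: newest to oldest.
    for i in (0..m).rev() {
        let rho = 1.0 / dot(&y_hist[i], &s_hist[i]);
        alpha[i] = rho * dot(&s_hist[i], &q);
        for (qj, yj) in q.iter_mut().zip(&y_hist[i]) {
            *qj -= alpha[i] * yj;
        }
    }
    // Initial Hessian scaling gamma = s'y / y'y from the newest pair.
    if m > 0 {
        let gamma = dot(&s_hist[m - 1], &y_hist[m - 1]) / dot(&y_hist[m - 1], &y_hist[m - 1]);
        for qj in q.iter_mut() {
            *qj *= gamma;
        }
    }
    // Second loop: oldest to newest.
    for i in 0..m {
        let rho = 1.0 / dot(&y_hist[i], &s_hist[i]);
        let beta = rho * dot(&y_hist[i], &q);
        for (qj, sj) in q.iter_mut().zip(&s_hist[i]) {
            *qj += (alpha[i] - beta) * sj;
        }
    }
    q
}

fn main() {
    // With an empty history the recursion reduces to the identity: H_0 = I.
    println!("{:?}", two_loop(&[1.0, 2.0], &[], &[]));
}
```

With an empty history the result equals the gradient (steepest descent); each stored pair refines the direction toward the quasi-Newton step.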

Two tolerances can be configured, both of which are used in the stopping criteria. One is a tolerance on the gradient (set with `with_tolerance_grad`): if the norm of the gradient is below said tolerance, the algorithm stops. It defaults to `sqrt(EPSILON)`. The other is a tolerance on the change of the cost function from one iteration to the next. If the change is below this tolerance (default: `EPSILON`), the algorithm stops. This parameter can be set via `with_tolerance_cost`.
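The two criteria can be sketched as follows; `should_stop` and its parameter names are hypothetical, not part of argmin's API, and serve only to illustrate the logic and the default values.

```rust
// Illustrative sketch of the two stopping criteria described above.
// Names are made up for this example; they are not argmin internals.
fn should_stop(grad_norm: f64, prev_cost: f64, cost: f64, tol_grad: f64, tol_cost: f64) -> bool {
    // Stop if the gradient norm is small, or if the cost barely changed.
    grad_norm < tol_grad || (prev_cost - cost).abs() < tol_cost
}

fn main() {
    let tol_grad = f64::EPSILON.sqrt(); // default of with_tolerance_grad
    let tol_cost = f64::EPSILON;        // default of with_tolerance_cost
    println!("{}", should_stop(1e-9, 3.0, 2.5, tol_grad, tol_cost));
}
```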

### Orthant-Wise Limited-memory Quasi-Newton (OWL-QN) method

OWL-QN is a method that adapts L-BFGS to L1-regularization. The original L-BFGS requires the loss function to be differentiable and therefore does not support L1-regularization. For this reason, this library switches to OWL-QN when L1-regularization is specified. L1-regularization can be enabled via `with_l1_regularization`.
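The key adaptation in OWL-QN is replacing the gradient with a pseudo-gradient of the L1-regularized objective F(x) = f(x) + c·‖x‖₁ (Andrew & Gao, 2007). A self-contained sketch, with the hypothetical helper `pseudo_gradient` standing in for the library's internal logic:

```rust
// Sketch of the OWL-QN pseudo-gradient. At x_i = 0 the L1 term is not
// differentiable, so the one-sided derivative that points downhill is
// chosen, or 0 if neither direction decreases F.
fn pseudo_gradient(x: &[f64], grad: &[f64], c: f64) -> Vec<f64> {
    x.iter()
        .zip(grad)
        .map(|(&xi, &gi)| {
            if xi > 0.0 {
                gi + c
            } else if xi < 0.0 {
                gi - c
            } else if gi + c < 0.0 {
                gi + c // right derivative negative: moving right decreases F
            } else if gi - c > 0.0 {
                gi - c // left derivative positive: moving left decreases F
            } else {
                0.0 // 0 is in the subdifferential: x_i = 0 is locally optimal
            }
        })
        .collect()
}

fn main() {
    println!("{:?}", pseudo_gradient(&[1.0, 0.0, -2.0], &[0.5, 3.0, -0.5], 1.0));
}
```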

TODO: Implement compact representation of BFGS updating (Nocedal/Wright p.230)

### Requirements on the optimization problem

The optimization problem is required to implement `CostFunction` and `Gradient`.
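As an illustration of the required shape, the following sketch implements cost and gradient for the Rosenbrock function. The trait definitions here are simplified stand-ins so the example is self-contained; argmin's actual `CostFunction` and `Gradient` traits use associated types and return `Result`.

```rust
// Simplified stand-ins mirroring the shape of argmin's traits.
trait CostFunction {
    fn cost(&self, p: &[f64]) -> f64;
}
trait Gradient {
    fn gradient(&self, p: &[f64]) -> Vec<f64>;
}

// Rosenbrock: f(x, y) = (a - x)^2 + b * (y - x^2)^2, minimum at (a, a^2).
struct Rosenbrock {
    a: f64,
    b: f64,
}

impl CostFunction for Rosenbrock {
    fn cost(&self, p: &[f64]) -> f64 {
        (self.a - p[0]).powi(2) + self.b * (p[1] - p[0] * p[0]).powi(2)
    }
}

impl Gradient for Rosenbrock {
    fn gradient(&self, p: &[f64]) -> Vec<f64> {
        vec![
            -2.0 * (self.a - p[0]) - 4.0 * self.b * p[0] * (p[1] - p[0] * p[0]),
            2.0 * self.b * (p[1] - p[0] * p[0]),
        ]
    }
}

fn main() {
    let problem = Rosenbrock { a: 1.0, b: 100.0 };
    // At the minimum (1, 1): cost 0, gradient (0, 0).
    println!("{}", problem.cost(&[1.0, 1.0]));
}
```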

### Reference

Jorge Nocedal and Stephen J. Wright (2006). Numerical Optimization. Springer. ISBN 0-387-30303-0.

Galen Andrew and Jianfeng Gao (2007). Scalable Training of L1-Regularized Log-Linear Models, International Conference on Machine Learning.

## Implementations

### impl<L, P, G, F> LBFGS<L, P, G, F> where
F: ArgminFloat,

#### pub fn with_tolerance_grad(self, tol_grad: F) -> Result<Self, Error>

The algorithm stops if the norm of the gradient is below `tol_grad`.

The provided value must be non-negative. Defaults to `sqrt(EPSILON)`.

##### Example

`let lbfgs: LBFGS<_, Vec<f64>, Vec<f64>, f64> = LBFGS::new(linesearch, 3).with_tolerance_grad(1e-6)?;`

#### pub fn with_tolerance_cost(self, tol_cost: F) -> Result<Self, Error>

Sets the tolerance for the stopping criterion based on the change of the cost function.

The provided value must be non-negative. Defaults to `EPSILON`.

##### Example

`let lbfgs: LBFGS<_, Vec<f64>, Vec<f64>, f64> = LBFGS::new(linesearch, 3).with_tolerance_cost(1e-6)?;`

#### pub fn with_l1_regularization(self, l1_coeff: F) -> Result<Self, Error>

Activates L1-regularization with coefficient `l1_coeff`.

Parameter `l1_coeff` must be `> 0.0`.

##### Example

`let lbfgs: LBFGS<_, Vec<f64>, Vec<f64>, f64> = LBFGS::new(linesearch, 3).with_l1_regularization(1.0)?;`

## Trait Implementations

### impl<'de, L, P, G, F> Deserialize<'de> for LBFGS<L, P, G, F>

#### fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error> where
__D: Deserializer<'de>,

### impl<O, L, P, G, F> Solver<O, IterState<P, G, (), (), (), F>> for LBFGS<L, P, G, F> where
O: CostFunction<Param = P, Output = F> + Gradient<Param = P, Gradient = G>,
P: Clone + ArgminSub<P, P> + ArgminSub<F, P> + ArgminAdd<P, P> + ArgminAdd<F, P> + ArgminDot<G, F> + ArgminMul<F, P> + ArgminMul<P, P> + ArgminMul<G, P> + ArgminL1Norm<F> + ArgminSignum + ArgminZeroLike + ArgminMinMax,
G: Clone + ArgminL2Norm<F> + ArgminSub<G, G> + ArgminAdd<G, G> + ArgminAdd<P, G> + ArgminDot<G, F> + ArgminDot<P, F> + ArgminMul<F, G> + ArgminMul<F, P> + ArgminZeroLike + ArgminMinMax,
L: Clone + LineSearch<P, F> + Solver<LineSearchProblem<O, P, G, F>, IterState<P, G, (), (), (), F>>,
F: ArgminFloat,

#### fn init(
&mut self,
problem: &mut Problem<O>,
state: IterState<P, G, (), (), (), F>
) -> Result<(IterState<P, G, (), (), (), F>, Option<KV>), Error>

#### fn next_iter(
&mut self,
problem: &mut Problem<O>,
state: IterState<P, G, (), (), (), F>
) -> Result<(IterState<P, G, (), (), (), F>, Option<KV>), Error>

Performs a single iteration and returns an updated `state` and optionally a `KV` which holds key-value pairs used in observers.

#### fn terminate(
&mut self,
state: &IterState<P, G, (), (), (), F>
) -> TerminationStatus

Used to implement stopping criteria beyond those checked in `terminate_internal`.

#### fn terminate_internal(&mut self, state: &I) -> TerminationStatus

## Auto Trait Implementations

### impl<L, P, G, F> RefUnwindSafe for LBFGS<L, P, G, F>

### impl<L, P, G, F> Send for LBFGS<L, P, G, F>

### impl<L, P, G, F> Sync for LBFGS<L, P, G, F>

### impl<L, P, G, F> Unpin for LBFGS<L, P, G, F>

### impl<L, P, G, F> UnwindSafe for LBFGS<L, P, G, F>

## Blanket Implementations

### impl<T> BorrowMut<T> for T where
T: ?Sized,

#### fn borrow_mut(&mut self) -> &mut T

### impl<SS, SP> SupersetOf<SS> for SP where
SS: SubsetOf<SP>,

#### fn to_subset(&self) -> Option<SS>

The inverse inclusion map: attempts to construct `self` from the equivalent element of its superset.

#### fn is_in_subset(&self) -> bool

Checks if `self` is actually part of its subset `T` (and can be converted to it).

#### fn to_subset_unchecked(&self) -> SS

Same as `self.to_subset` but without any property checks. Always succeeds.

#### fn from_subset(element: &SS) -> SP

The inclusion map: converts `self` to the equivalent element of its superset.