elsa solvers API ¶

Table of Contents

elsa solvers API

Solver ¶

template<typename data_t = real_t> class elsa::Solver : public elsa::Cloneable<Solver<real_t>>

Base class representing a solver for an optimization problem.

This class represents abstract (typically iterative) solvers acting on optimization problems.

Author

Matthias Wieczorek - initial code

Author

Maximilian Hornung - modularization

Author

Tobias Lasser - rewrite, modernization

Template Parameters

data_t: data type for the domain and range of the problem, defaulting to real_t

Subclassed by elsa::LinearizedADMM< data_t >

Public Types

using Scalar = data_t: Scalar alias.

Public Functions

~Solver() override = default: default destructor

DataContainer<data_t> solve(index_t iterations, std::optional<DataContainer<data_t>> x0 = std::nullopt) = 0

Solve the optimization problem (most likely iteratively)

Return

the current solution (after solving)

Parameters

[in] iterations: number of iterations to execute
[in] x0: optional initial solution, initial solution set to zero if not present

CGLS ¶

template<typename data_t = real_t> class elsa::CGLS : public elsa::Solver<real_t>

Conjugate Gradient for Least Squares Problems.

CGLS minimizes: $ \frac{1}{2} || Ax - b ||_2^2 + \eps^2 || x || $ where $A$ is an operator, it does not need to be square, symmetric, or positive definite. $b$ is the measured quantity, and $\eps$ a dampening factors.

If the dampening factor $\eps$ is non zero, the problem solves a Tikhonov problem.

CGLS is equivalent to apply CG to the normal equations $A^TAx = A^Tb$. However, it doesn not need to actually form the normal equation. Primarly, this improves the runtime performance, and it further improves the stability of the algorithm.

References:

https://web.stanford.edu/group/SOL/software/cgls/

Author: David Frank

Public Types

using Scalar = typename Solver<data_t>::Scalar: Scalar alias.

Public Functions

CGLS(const LinearOperator<data_t> &A, const DataContainer<data_t> &b, SelfType_t<data_t> eps = 0, SelfType_t<data_t> tol = 1e-4)

Construct the necessary form of CGLS using some linear operator and the measured data.

Parameters

A: linear operator for the problem
b: the measured data
eps: the dampening factor
tol: stopping tolerance

CGLS(const CGLS<data_t>&) = delete: make copy constructor deletion explicit

~CGLS() override = default: default destructor

DataContainer<data_t> solve(index_t iterations, std::optional<DataContainer<data_t>> x0 = std::nullopt) override

Solve the optimization problem, i.e. apply iterations number of iterations of CGLS.

Return

the approximated solution

Parameters

[in] iterations: number of iterations to execute
[in] x0: optional initial solution, initial solution set to zero if not present

Private Functions

CGLS<data_t> *cloneImpl() const override: implement the polymorphic clone operation

bool isEqual(const Solver<data_t> &other) const override: implement the polymorphic comparison operation

Proximal Gradient Descent ¶

template<typename data_t = real_t> class elsa::PGD : public elsa::Solver<real_t>

Proximal Gradient Descent (PGD)

PGD minimizes function of the form:

\[ \min_x g(x) + h(x) \]

where $g: \mathbb{R}^n \to \mathbb{R}$ is convex and differentiable, and $h: \mathbb{R}^n \to \mathbb{R} \cup \{-\infty, \infty\}$ is closed convex. Importantly $h$ needs not to be differentiable, but it needs an proximal operator. Usually, the proximal operator is assumed to be simple, and have an analytical solution.

This class currently implements the special case of $ g(x) = \frac{1}{2} ||A x - b||_2^2$. However, $h$ can be chosen freely.

Given $g$ defined as above and a convex set $\mathcal{C}$, one can define an constrained optimization problem:

\[ \min_{x \in \mathcal{C}} g(x) \]

Such constraints can take the form of, non-negativity or box constraints. This can be reformulated as an unconstrained problem:

\[ \min_{x} g(x) + \mathcal{I}_{\mathcal{C}}(x) \]

where $\mathcal{I}_{\mathcal{C}}(x)$ is the indicator function of the convex set $\mathcal{C}$, defined as:

\[ \mathcal{I}_{\mathcal{C}}(x) = \begin{cases} 0, & \text{if } x \in \mathcal{C} \\ \infty, & \text{if } x \notin \mathcal{C} \end{cases} \]

References:

Note

PGD has a worst-case complexity result of $ O(1/k) $.

Note

A special class of optimization is of the form:

\[ \min_{x} \frac{1}{2} || A x - b ||_2^2 + ||x||_1 \]

often referred to as $\ell_1$-Regularization. In this case, the proximal operator for the $\ell_1$-Regularization is the soft thresolding operator (ProximalL1). This can also be extended with constrains, such as non-negativity constraints.

See

An accerlerated version of proximal gradient descent is APGD.

Author

Andi Braimllari - initial code
David Frank - generalization

Template Parameters

data_t: data type for the domain and range of the problem, defaulting to real_t

Public Types

using Scalar = typename Solver<data_t>::Scalar: Scalar alias.

Public Functions

PGD(const LinearOperator<data_t> &A, const DataContainer<data_t> &b, const Functional<data_t> &h, std::optional<data_t> mu = std::nullopt, data_t epsilon = std::numeric_limits<data_t>::epsilon())

Construct PGD with a least squares data fidelity term

Note

The step length for least squares can be chosen to be dependend on the Lipschitz constant. Compute it using powerIterations(adjoint(A) * A). Depending on which literature, both $ \frac{2}{L} $ and $ \frac{1}{L}$. If mu is not given, the step length is chosen by default, the computation of the power method might be expensive.

Parameters

A: the operator for the least squares data term
b: the measured data for the least squares data term
h: prox-friendly function
mu: the step length

PGD(const LinearOperator<data_t> &A, const DataContainer<data_t> &b, const DataContainer<data_t> &W, const Functional<data_t> &h, std::optional<data_t> mu = std::nullopt, data_t epsilon = std::numeric_limits<data_t>::epsilon())

Construct PGD with a weighted least squares data fidelity term

Note

The step length for least squares can be chosen to be dependend on the Lipschitz constant. Compute it using powerIterations(adjoint(A) * A). Depending on which literature, both $ \frac{2}{L} $ and $ \frac{1}{L}$. If mu is not given, the step length is chosen by default, the computation of the power method might be expensive.

Parameters

A: the operator for the least squares data term
b: the measured data for the least squares data term
W: the weights (usually counts / I0)
prox: the proximal operator for g
mu: the step length

PGD(const Functional<data_t> &g, const Functional<data_t> &h, data_t mu, data_t epsilon = std::numeric_limits<data_t>::epsilon())

Construct PGD with a given data fidelity term

Parameters

g: differentiable function
h: prox-friendly function
mu: the step length

PGD(const PGD<data_t>&) = delete: make copy constructor deletion explicit

~PGD() override = default: default destructor

DataContainer<data_t> solve(index_t iterations, std::optional<DataContainer<data_t>> x0 = std::nullopt) override

Solve the optimization problem, i.e. apply iterations number of iterations of PGD.

Return

the approximated solution

Parameters

[in] iterations: number of iterations to execute
[in] x0: optional initial solution, initial solution set to zero if not present

Protected Functions

auto cloneImpl() const -> PGD<data_t>* override: implement the polymorphic clone operation

auto isEqual(const Solver<data_t> &other) const -> bool override: implement the polymorphic comparison operation

Private Members

std::unique_ptr<Functional<data_t>> g_: differentiable function of problem formulation

std::unique_ptr<Functional<data_t>> h_: prox-friendly function of problem formulation

data_t mu_: the step size

data_t epsilon_: variable affecting the stopping condition

Accelerated Proximal Gradient Descent ¶

template<typename data_t = real_t> class elsa::APGD : public elsa::Solver<real_t>

Accelerated Proximal Gradient Descent (APGD)

APGD minimizes function of the the same for as PGD. See the documentation there.

This class represents a APGD solver with the following steps:

$ x_{k} = prox_h(y_k - \mu * A^T (Ay_k - b)) $
$ t_{k+1} = \frac{1 + \sqrt{1 + 4 * t_{k}^2}}{2} $
$ y_{k+1} = x_{k} + (\frac{t_{k} - 1}{t_{k+1}}) * (x_{k} - x_{k - 1}) $

APGD has a worst-case complexity result of $ O(1/k^2) $.

References: http://www.cs.cmu.edu/afs/cs/Web/People/airg/readings/2012_02_21_a_fast_iterative_shrinkage-thresholding.pdf https://arxiv.org/pdf/2008.02683.pdf

See

For a more detailed discussion of the type of problem for this solver, see PGD.

Author

Andi Braimllari - initial code David Frank - generalization to APGD

Template Parameters

data_t: data type for the domain and range of the problem, defaulting to real_t

Public Types

using Scalar = typename Solver<data_t>::Scalar: Scalar alias.

Public Functions

APGD(const LinearOperator<data_t> &A, const DataContainer<data_t> &b, const Functional<data_t> &h, std::optional<data_t> mu = std::nullopt, data_t epsilon = std::numeric_limits<data_t>::epsilon())

Construct APGD with a least squares data fidelity term

Note

The step length for least squares can be chosen to be dependend on the Lipschitz constant. Compute it using powerIterations(adjoint(A) * A). Depending on which literature, both $ \frac{2}{L} $ and $ \frac{1}{L}$. If mu is not given, the step length is chosen by default, the computation of the power method might be expensive.

Parameters

A: the operator for the least squares data term
b: the measured data for the least squares data term
prox: the proximal operator for g
mu: the step length

APGD(const LinearOperator<data_t> &A, const DataContainer<data_t> &b, const DataContainer<data_t> &W, const Functional<data_t> &h, std::optional<data_t> mu = std::nullopt, data_t epsilon = std::numeric_limits<data_t>::epsilon())

Construct APGD with a weighted least squares data fidelity term

Note

The step length for least squares can be chosen to be dependend on the Lipschitz constant. Compute it using powerIterations(adjoint(A) * A). Depending on which literature, both $ \frac{2}{L} $ and $ \frac{1}{L}$. If mu is not given, the step length is chosen by default, the computation of the power method might be expensive.

Parameters

A: the operator for the least squares data term
b: the measured data for the least squares data term
W: the weights (usually counts / I0)
prox: the proximal operator for g
mu: the step length

APGD(const Functional<data_t> &g, const Functional<data_t> &h, data_t mu, data_t epsilon = std::numeric_limits<data_t>::epsilon())

Construct APGD with a given data fidelity term

Parameters

g: differentiable function of the LASSO problem
h: prox friendly functional of the LASSO problem
mu: the step length

~APGD() override = default: default destructor

DataContainer<data_t> solve(index_t iterations, std::optional<DataContainer<data_t>> x0 = std::nullopt) override

Solve the optimization problem, i.e. apply iterations number of iterations of APGD.

Return

the approximated solution

Parameters

[in] iterations: number of iterations to execute

Protected Functions

auto cloneImpl() const -> APGD<data_t>* override: implement the polymorphic clone operation

auto isEqual(const Solver<data_t> &other) const -> bool override: implement the polymorphic comparison operation

Private Members

std::unique_ptr<Functional<data_t>> g_: Differentiable part of the problem formulation.

std::unique_ptr<Functional<data_t>> h_: Prox-friendly part of the problem formulation.

data_t mu_: the step size

data_t epsilon_: variable affecting the stopping condition

Proximal Optimized Gradient Method ¶

template<typename data_t = real_t> class elsa::POGM : public elsa::Solver<real_t>

Proximal Optimized Gradient Method (POGM)

POGM minimizes function of the the same for as PGD. See the documentation there.

The update steps for POGM are:

\[ \begin{split} \theta_k &= \begin{cases}% \frac{1}{2}(1 + \sqrt{4 \theta_{k-1}^2 + 1}) & \text{ if } 2 \leq k < N \\% \frac{1}{2}(1 + \sqrt{8 \theta_{k-1}^2 + 1}) & \text{ if } k = N % \end{cases} \\% \gamma_k &= \frac{1}{L} \frac{2\theta_{k-1} + \theta_k - 1}{\theta_k} \\% \omega_k &= x_{k+1} - \frac{1}{L} \nabla f(x_{k-1}) \\% z_k &= \omega_k +% \underset{Nesterov}{\underbrace{\frac{\theta_{k-1} - 1}{\theta_k}(\omega_k + *\omega_{k-1})}}+% \underset{OGM}{\underbrace{\frac{\theta_{k-1}}{\theta_k}(\omega_k - *x_{k-1})}}+% \underset{POGM}{\underbrace{\frac{\theta_{k-1} - 1}{L *\gamma_{k-1}\theta_k}(z_{k-1} - x_{k-1})}}\\% x_k &= \operatorname{prox}_{\gamma_k g}(z_k) \end{split} \]

References:

https://arxiv.org/abs/1512.07516

See

For a more detailed discussion of the type of problem for this solver, see PGD.

Author

David Frank - Initial implementation

Template Parameters

data_t: data type for the domain and range of the problem, defaulting to real_t

Public Types

using Scalar = typename Solver<data_t>::Scalar: Scalar alias.

Public Functions

POGM(const LinearOperator<data_t> &A, const DataContainer<data_t> &b, const Functional<data_t> &h, std::optional<data_t> mu = std::nullopt, data_t epsilon = std::numeric_limits<data_t>::epsilon())

Construct POGM with a least squares data fidelity term

Note

The step length for least squares can be chosen to be dependend on the Lipschitz constant. Compute it using powerIterations(adjoint(A) * A). Depending on which literature, both $ \frac{2}{L} $ and $ \frac{1}{L}$. If mu is not given, the step length is chosen by default, the computation of the power method might be expensive.

Parameters

A: the operator for the least squares data term
b: the measured data for the least squares data term
g: prox-friendly function
mu: the step length

POGM(const LinearOperator<data_t> &A, const DataContainer<data_t> &b, const DataContainer<data_t> &W, const Functional<data_t> &h, std::optional<data_t> mu = std::nullopt, data_t epsilon = std::numeric_limits<data_t>::epsilon())

Construct POGM with a weighted least squares data fidelity term

Note

The step length for least squares can be chosen to be dependend on the Lipschitz constant. Compute it using powerIterations(adjoint(A) * A). Depending on which literature, both $ \frac{2}{L} $ and $ \frac{1}{L}$. If mu is not given, the step length is chosen by default, the computation of the power method might be expensive.

Parameters

A: the operator for the least squares data term
b: the measured data for the least squares data term
W: the weights (usually counts / I0)
prox: the proximal operator for g
mu: the step length

POGM(const Functional<data_t> &g, const Functional<data_t> &h, data_t mu, data_t epsilon = std::numeric_limits<data_t>::epsilon()): Construct POGM with a given data fidelity term.

~POGM() override = default: default destructor

DataContainer<data_t> solve(index_t iterations, std::optional<DataContainer<data_t>> x0 = std::nullopt) override

Solve the optimization problem, i.e. apply iterations number of iterations of POGM.

Return

the approximated solution

Parameters

[in] iterations: number of iterations to execute

Protected Functions

auto cloneImpl() const -> POGM<data_t>* override: implement the polymorphic clone operation

auto isEqual(const Solver<data_t> &other) const -> bool override: implement the polymorphic comparison operation

Private Members

std::unique_ptr<Functional<data_t>> g_: The LASSO optimization problem.

data_t mu_: the step size

data_t epsilon_: variable affecting the stopping condition

Alternating Direction Method of Multipliers ¶

template<typename data_t = real_t> class elsa::ADMML2 : public elsa::Solver<real_t>

Class representing an Alternating Direction Method of Multipliers solver for a specific subset of constraints.

The general form of ADMM solves the following optimization problem:

\[ \min f(x) + g(z) \\ \text{s.t. } Ax + Bz = c \]

with $x \in \mathbb{R}^n$, $z \in \mathbb{R}^m$, $A \in \mathbb{R}^{p\times n}$, $B \in \mathbb{R}^{p\times m}$ and $c \in \mathbb{R}^p$

This specific version solves the problem of the form:

\[ \min \frac{1}{2} || Op x - b ||_2^2 + g(z) \\ \text{s.t. } Ax = z \]

with $B = Id$ and $c = 0$. Further: $f(x) = || Op x - b ||_2^2$.

This version of ADMM is useful, as the proximal operator is not known for the least squares functional, and this specifically implements and optimization of the first update step of ADMM. In this implementation, this is done via CGLS.

The update steps for ADMM are:

\[ x_{k+1} = \argmin_{x} \frac{1}{2}||Op x - b||_2^2 + \frac{1}{2\tau} ||Ax - z_k + u_k||_2^2 \\ z_{k+1} = \prox_{\tau g}(Ax_{k+1} + u_{k}) \\ u_{k+1} = u_k + Ax_{k+1} - z_{k+1} \]

This is further useful to solve problems such as TV, by setting the $A = \nabla$. And $ g = || \dot ||_1$

References:

Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers, by Boyd et al.
Chapter 5.3 of “An introduction to continuous optimization for imaging”, by Chambolle and Pock

Public Types

using Scalar = typename Solver<data_t>::Scalar: Scalar alias.

Public Functions

~ADMML2() override = default: default destructor

Protected Functions

ADMML2<data_t> *cloneImpl() const override: implement the polymorphic clone operation

bool isEqual(const Solver<data_t> &other) const override: implement the polymorphic equality operation

Private Members

data_t tau_ = {1}: $ \tau $ from the problem definition

Linearized ADMM ¶

template<class data_t> class elsa::LinearizedADMM : public elsa::Solver<data_t>

Class representing the linearized version of ADMM.

The general form of ADMM solves the following optimization problem:

\[ \min f(x) + g(z) \\ \text{s.t. } Ax + Bz = c \]

with $x \in \mathbb{R}^n$, $z \in \mathbb{R}^m$, $A \in \mathbb{R}^{p\times n}$, $B \in \mathbb{R}^{p\times m}$ and $c \in \mathbb{R}^p$

The linearized version assumes $A = K$ to be any linear operator, $B = - Id$ and $c = 0$. Hence, the optimization problem reduces to:

\[ \min f(x) + g(Ax) \]

The main benefit of this is, that in many applications both $f$ and $g$ can be simple functionals, for which proximal operators are easy to evaluate. I.e. The usual least square formulation without regularization is achieved with: $f(x) = 0$, and $g(z) = || z - b ||_2^2 $, with $ z = Kx$. For both the proximal operator is analytically known.

Many other problems can be converted to this form in a similar fashion by “stacking” the operator $K$. I.e. $L_1$ regularization can as: $f(x) = 0$, and $g(z) = || z_1 - b ||_2^2 + || z_2 ||_1$, with $ K = \begin{bmatrix} A \\ Id \end{bmatrix}$, and $z = \begin{bmatrix} z_1 \\ z_2 \end{bmatrix}$. Further, constraints can be added easily via the function $f$, by setting it to some indicator function (i.e. non-negativity or similar).

References:

Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers, by Boyd et al.
Chapter 5.3 of “An introduction to continuous optimization for imaging”, by Chambolle and Pock

TODO:

Once implemented, take functionals which know their respective proximal instead of handing proximal operators directly

Public Functions

LinearizedADMM(const LinearOperator<data_t> &K, ProximalOperator<data_t> proxf, ProximalOperator<data_t> proxg, SelfType_t<data_t> sigma, std::optional<data_t> tau = std::nullopt, bool computeKNorm = true)

Construct the linearized version of ADMM given two proximal operators for $f$ and $g$ respectively, the potentially stacked operator $K$ and the two step length parameters $\sigma$ and $\tau$.

To ensure convergence it is checked if $0 < \tau < \frac{\sigma}{||K||_2^2}$. Here $||K||_2^2$ is the operator norm, which is approximated using a couple if iterations of the power method, to apprxoximate the largest eigenvalue of the operator. If computeKNorm is false, the computation is not performed, and it is assumed the above inequality holds.

~LinearizedADMM() override = default: default destructor

DataContainer<data_t> solve(index_t iterations, std::optional<DataContainer<data_t>> x0 = std::nullopt) override

Solve the optimization problem (most likely iteratively)

Return

the current solution (after solving)

Parameters

[in] iterations: number of iterations to execute
[in] x0: optional initial solution, initial solution set to zero if not present

Private Members

std::unique_ptr<LinearOperator<data_t>> K_: The stacked linear operator $K$.

ProximalOperator<data_t> proxf_: The proximal operator for $f$.

ProximalOperator<data_t> proxg_: The proximal operator for $g$.

data_t sigma_: Step length $\sigma$.

data_t tau_: Step length $\tau$.

SQS Ordered Subsets ¶

template<typename data_t = real_t> class elsa::SQS : public elsa::Solver<real_t>

Class representing an SQS Solver.

This class implements an SQS solver with multiple options for momentum acceleration and ordered subsets.

No particular stopping rule is currently implemented (only a fixed number of iterations, default to 100).

Currently, this is limited to least square problems.

References: https://doi.org/10.1109/TMI.2014.2350962

Author

Michael Loipführer - initial code

Template Parameters

data_t: data type for the domain and range of the problem, defaulting to real_t

Public Types

using Scalar = typename Solver<data_t>::Scalar: Scalar alias.

Public Functions

SQS(const SQS<data_t>&) = delete: make copy constructor deletion explicit

~SQS() override = default: default destructor

DataContainer<data_t> solve(index_t iterations, std::optional<DataContainer<data_t>> x0 = std::nullopt) override

Solve the optimization problem, i.e. apply iterations number of iterations of gradient descent.

Return

the approximated solution

Parameters

[in] iterations: number of iterations to execute
[in] x0: optional initial solution, initial solution set to zero if not present

Private Functions

SQS<data_t> *cloneImpl() const override: implement the polymorphic clone operation

bool isEqual(const Solver<data_t> &other) const override: implement the polymorphic comparison operation

Private Members

std::unique_ptr<LeastSquares<data_t>> fullProblem_: The optimization to solve TODO: This might be a nice variant?

data_t epsilon_: variable affecting the stopping condition

std::unique_ptr<LinearOperator<data_t>> preconditioner_ = {}: the preconditioner (if supplied)

bool momentumAcceleration_: whether to enable momentum acceleration

bool subsetMode_: whether to operate in subset based mode

Orthogonal Matching Pursuit ¶

template<typename data_t = real_t> class elsa::OrthogonalMatchingPursuit : public elsa::Solver<real_t>

Class representing the Orthogonal Matching Pursuit.

Orthogonal Matching Pursuit is a greedy algorithm to find a sparse representation. It starts with the 0-vector and adds one non-zero entry per iteration. The algorithm works in the following manner:

Find the next atom that should be used in the representation. This done by finding the atom that is correlated the most with the current residual.
Construct a dictionary that only contains the atoms that are being used, defined as $ D_S $ (dictionary restricted to the support).
The representation is the solution to the least square problem $ min_x \|y-D_S*x\| $

Author

Jonas Buerger - initial code

Template Parameters

data_t: data type for the domain and range of the problem, defaulting to real_t

Public Types

using Scalar = typename Solver<data_t>::Scalar: Scalar alias.

Public Functions

OrthogonalMatchingPursuit(const Dictionary<data_t> &D, const DataContainer<data_t> &y, data_t epsilon)

Constructor for OrthogonalMatchingPursuit, accepting a dictionary representation problem and, optionally, a value for epsilon.

Parameters

[in] D: dictionary operator
[in] y: signal that should be sparsely represented
[in] epsilon: affects the stopping condition

OrthogonalMatchingPursuit(const OrthogonalMatchingPursuit<data_t>&) = delete: make copy constructor deletion explicit

~OrthogonalMatchingPursuit() override = default: default destructor

DataContainer<data_t> solve(index_t iterations, std::optional<DataContainer<data_t>> x0 = std::nullopt) override

Solve the representation problem, i.e. apply iterations number of iterations of matching pursuit.

Return

the approximated solution

Parameters

[in] iterations: number of iterations to execute. As OrthogonalMatchingPursuit is a greedy algorithm, this corresponds to the desired sparsity level
[in] x0: optional initial solution, initial solution set to zero if not present

Private Functions

index_t mostCorrelatedAtom(const Dictionary<data_t> &dict, const DataContainer<data_t> &evaluatedResidual): helper method to find the index of the atom that is most correlated with the residual

OrthogonalMatchingPursuit<data_t> *cloneImpl() const override: implement the polymorphic clone operation

bool isEqual(const Solver<data_t> &other) const override: implement the polymorphic comparison operation

Private Members

data_t epsilon_: variable affecting the stopping condition

Landweber iteration ¶

Warning

doxygenclass: Cannot find class “elsa::LandweberIteration” in doxygen xml output for project “elsa” from directory: /var/lib/gitlab-runner/builds/y-Xocdgn/0/IP/elsa/build/docs/xml

template<typename data_t = real_t> class elsa::Landweber : public LandweberIteration<real_t>

Implementation of the Landweber iterations with both $ T $ and $ M $ being the identity matrix. This reduces the update rule to:

$ x_{k+1} = x_{k} + \lambda A^T (A(x_{k}) - b)$

This is basically a special case of the gradient descent.

Author: David Frank
See: LandweberIteration

Public Types

using Scalar = typename LandweberIteration<data_t>::Scalar: Scalar alias.

Public Functions

Landweber(const LinearOperator<data_t> &A, const DataContainer<data_t> &b, SelfType_t<data_t> stepSize)

Constructor for classical Landweber, accepting an operator, a measurement vector and a step size#.

Parameters

[in] A: linear operator to solve the problem with
[in] b: measurment vector of the problem
[in] stepSize: the fixed step size to be used while solving

Landweber(const LinearOperator<data_t> &A, const DataContainer<data_t> &b)

Constructor for classical Landweber, accepting an operator and a measurement vector.

Parameters

[in] A: linear operator to solve the problem with
[in] b: measurment vector of the problem

Landweber(const Landweber<data_t>&) = delete: make copy constructor deletion explicit

~Landweber() override = default: default destructor

Private Functions

Landweber<data_t> *cloneImpl() const override: implement the polymorphic clone operation

bool isEqual(const Solver<data_t> &other) const override: implement the polymorphic comparison operation

template<typename data_t = real_t> class elsa::SIRT : public LandweberIteration<real_t>

Implementation of the Simultaneous Iterative Reconstruction Technique (SIRT). For SIRT $ T = \text{diag}(\frac{1}{\text{row sum}}) = \text{diag}(\frac{1}{\sum_i A_{ij}})$ and $ M = \text{diag}( \frac{i}{\text{col sum}}) = \text{diag}(\frac{1}{\sum_j A_{ij}})$.

Outside of the Computed Tomography community, this algorithm is also often know as Simultaneous Algebraic Reconstruction Technique (SART).

Author: David Frank
See: LandweberIteration

Public Types

using Scalar = typename LandweberIteration<data_t>::Scalar: Scalar alias.

Public Functions

SIRT(const LinearOperator<data_t> &A, const DataContainer<data_t> &b, SelfType_t<data_t> stepSize)

Constructor for SIRT, accepting an operator, a measurement vector and a step size#.

Parameters

[in] A: linear operator to solve the problem with
[in] b: measurment vector of the problem
[in] stepSize: the fixed step size to be used while solving

SIRT(const LinearOperator<data_t> &A, const DataContainer<data_t> &b)

Constructor for SIRT, accepting an operator and a measurement vector.

Parameters

[in] A: linear operator to solve the problem with
[in] b: measurment vector of the problem

SIRT(const SIRT<data_t>&) = delete: make copy constructor deletion explicit

~SIRT() override = default: default destructor

Private Functions

SIRT<data_t> *cloneImpl() const override: implement the polymorphic clone operation

bool isEqual(const Solver<data_t> &other) const override: implement the polymorphic comparison operation

First-order optimization algorithms ¶

template<typename data_t = real_t> class elsa::GradientDescent : public elsa::Solver<real_t>

Class representing a simple gradient descent solver with a fixed, given step size.

This class implements a simple gradient descent iterative solver with a fixed, given step size. No particular stopping rule is currently implemented (only a fixed number of iterations, default to 100).

See

For a basic introduction and problem statement of first-order methods, see here

Author

Tobias Lasser - initial code

Template Parameters

data_t: data type for the domain and range of the problem, defaulting to real_t

Public Types

using Scalar = typename Solver<data_t>::Scalar: Scalar alias.

Public Functions

GradientDescent(const Functional<data_t> &problem, data_t stepSize)

Constructor for gradient descent, accepting a problem and a fixed step size.

Parameters

[in] problem: the problem that is supposed to be solved
[in] stepSize: the fixed step size to be used while solving

GradientDescent(const Functional<data_t> &problem)

Constructor for gradient descent, accepting a problem. The step size will be computed as $ 1 \over L $ with $ L $ being the Lipschitz constant of the function.

Parameters

[in] problem: the problem that is supposed to be solved

GradientDescent(const GradientDescent<data_t>&) = delete: make copy constructor deletion explicit

~GradientDescent() override = default: default destructor

DataContainer<data_t> solve(index_t iterations, std::optional<DataContainer<data_t>> x0 = std::nullopt) override

Solve the optimization problem, i.e. apply iterations number of iterations of gradient descent.

Return

the approximated solution

Parameters

[in] iterations: number of iterations to execute

Private Functions

GradientDescent<data_t> *cloneImpl() const override: implement the polymorphic clone operation

bool isEqual(const Solver<data_t> &other) const override: implement the polymorphic comparison operation

Private Members

std::unique_ptr<Functional<data_t>> _problem: the differentiable optimizaion problem

MaybeUninitialized<data_t> _stepSize: the step size

template<typename data_t = real_t> class elsa::FGM : public elsa::Solver<real_t>

Class implementing Nesterov’s Fast Gradient Method.

This class implements Nesterov’s Fast Gradient Method. FGM is a first order method to efficiently optimize convex functions with Lipschitz-Continuous gradients.

Public Types

using Scalar = typename Solver<data_t>::Scalar: Scalar alias.

Public Functions

FGM(const Functional<data_t> &problem, data_t epsilon = std::numeric_limits<data_t>::epsilon())

Constructor for FGM, accepting an optimization problem and, optionally, a value for epsilon.

Parameters

[in] problem: the problem that is supposed to be solved
[in] epsilon: affects the stopping condition

FGM(const Functional<data_t> &problem, const LinearOperator<data_t> &preconditionerInverse, data_t epsilon = std::numeric_limits<data_t>::epsilon())

Constructor for FGM, accepting an optimization problem, the inverse of a preconditioner and, optionally, a value for epsilon.

Parameters

[in] problem: the problem that is supposed to be solved
[in] preconditionerInverse: the inverse of the preconditioner
[in] epsilon: affects the stopping condition

FGM(const FGM<data_t>&) = delete: make copy constructor deletion explicit

~FGM() override = default: default destructor

DataContainer<data_t> solve(index_t iterations, std::optional<DataContainer<data_t>> x0 = std::nullopt) override

Solve the optimization problem, i.e. apply iterations number of iterations of gradient descent.

Return

the approximated solution

Parameters

[in] iterations: number of iterations to execute
[in] x0: optional initial solution, initial solution set to zero if not present

Private Functions

FGM<data_t> *cloneImpl() const override: implement the polymorphic clone operation

bool isEqual(const Solver<data_t> &other) const override: implement the polymorphic comparison operation

Private Members

std::unique_ptr<Functional<data_t>> _problem: the differentiable optimizaion problem

data_t _epsilon: variable affecting the stopping condition

std::unique_ptr<LinearOperator<data_t>> _preconditionerInverse = {}: the inverse of the preconditioner (if supplied)

template<typename data_t = real_t> class elsa::OGM : public elsa::Solver<real_t>

Class representing the Optimized Gradient Method.

This class implements the Optimized Gradient Method as introduced by Kim and Fessler in 2016. OGM is a first order method to efficiently optimize convex functions with Lipschitz-Continuous gradients. It can be seen as an improvement on Nesterov’s Fast Gradient Method.

Public Types

using Scalar = typename Solver<data_t>::Scalar: Scalar alias.

Public Functions

OGM(const Functional<data_t> &problem, data_t epsilon = std::numeric_limits<data_t>::epsilon())

Constructor for OGM, accepting an optimization problem and, optionally, a value for epsilon.

Parameters

[in] problem: the problem that is supposed to be solved
[in] epsilon: affects the stopping condition

OGM(const Functional<data_t> &problem, const LinearOperator<data_t> &preconditionerInverse, data_t epsilon = std::numeric_limits<data_t>::epsilon())

Constructor for OGM, accepting an optimization problem the inverse of a preconditioner and, optionally, a value for epsilon.

Parameters

[in] problem: the problem that is supposed to be solved
[in] preconditionerInverse: the inverse of the preconditioner
[in] epsilon: affects the stopping condition

OGM(const OGM<data_t>&) = delete: make copy constructor deletion explicit

~OGM() override = default: default destructor

DataContainer<data_t> solve(index_t iterations, std::optional<DataContainer<data_t>> x0 = std::nullopt) override

Solve the optimization problem, i.e. apply iterations number of iterations of gradient descent.

Return

the approximated solution

Parameters

[in] iterations: number of iterations to execute
[in] x0: optional initial solution, initial solution set to zero if not present

Private Functions

OGM<data_t> *cloneImpl() const override: implement the polymorphic clone operation

bool isEqual(const Solver<data_t> &other) const override: implement the polymorphic comparison operation

Private Members

std::unique_ptr<Functional<data_t>> _problem: the differentiable optimizaion problem

data_t _epsilon: variable affecting the stopping condition

std::unique_ptr<LinearOperator<data_t>> _preconditionerInverse = {}: the inverse of the preconditioner (if supplied)