elsa solvers¶
Solver Interface¶
- template<typename data_t = real_t> class elsa::Solver : public elsa::Cloneable<Solver<real_t>>¶ Base class representing a solver for an optimization problem.
This class represents abstract (typically iterative) solvers acting on optimization problems.
- Author
Matthias Wieczorek - initial code
Maximilian Hornung - modularization
Tobias Lasser - rewrite, modernization
- Template Parameters
data_t: data type for the domain and range of the problem, defaulting to real_t
Subclassed by elsa::LinearizedADMM< data_t >, elsa::PDHG< data_t >
Public Functions
- ~Solver() override = default¶ default destructor
- DataContainer<data_t> solve(index_t iterations, std::optional<DataContainer<data_t>> x0 = std::nullopt)¶ Solve the optimization problem from a zero starting point. This will set up the iterative algorithm and then run it for the specified number of iterations (assuming the algorithm does not converge earlier).
- Return
the current solution (after solving)
- Parameters
[in] iterations: number of iterations to execute
[in] x0: optional initial solution; set to zero if not present
- DataContainer<data_t> setup(std::optional<DataContainer<data_t>> x)¶ Setup and reset the solver. This function should set all values, vectors and temporaries necessary to run the iterative reconstruction algorithm.
- Return
initial solution
- Parameters
[in] x: optional initial solution; if the optional is empty, a default (e.g. zero-filled) initial solution should be returned
- DataContainer<data_t> step(DataContainer<data_t> x)¶ Perform a single step of the iterative reconstruction algorithm.
- Return
the updated solution estimate
- Parameters
[in] x: the current solution
- DataContainer<data_t> run(index_t iterations, const DataContainer<data_t> &x0, bool show = false)¶ Run the iterative reconstruction algorithm for the given number of iterations, starting from the given initial point.
- Return
the current solution estimate
- Parameters
[in] iterations: the number of iterations to execute
[in] x0: the initial starting point of the run
[in] show: whether the algorithm should print per-iteration progress
- bool shouldStop() const¶ Function to determine when to stop. This function should implement algorithm-specific stopping criteria.
- Return
true if the algorithm should stop
- std::string formatHeader() const¶ Format the header string for the iterative reconstruction algorithm.
- void printHeader() const¶ Print the header of the iterative reconstruction algorithm.
- std::string formatStep(const DataContainer<data_t> &x) const¶ Format the step string for the iterative reconstruction algorithm.
- void printStep(const DataContainer<data_t> &x, index_t curiter, std::chrono::duration<double> steptime, std::chrono::duration<double> elapsed) const¶ Print a step of the iterative reconstruction algorithm.
- void setCallback(const std::function<void(const DataContainer<data_t>&)> &callback)¶ Set the callback function.
An example usage to later plot the loss (assuming some ground truth data):
    auto phantom = getSomePhantom();
    auto solver = SomeSolver(...);

    // compute mean square error for each iteration
    std::vector<float> msre;
    solver.setCallback([&msre](const DataContainer<data_t>& x) {
        msre.push_back(square(phantom - x).sum());
    });
Similarly from Python:
    phantom = getSomePhantom()
    solver = SomeSolver(...)

    # compute mean square error for each iteration
    msre = []
    solver.setCallback(lambda x: msre.append(square(phantom - x).sum()))
Protected Attributes
- bool configured_ = false¶ Is the solver already configured (no need to call setup)
- std::string name_ = "Solver"¶ Logging name of the iterative reconstruction algorithm.
Private Members
- std::function<void(const DataContainer<data_t>&)> callback_ = [](auto) {}¶ Callback function called each iteration; by default, it does nothing.
Iterative¶
Iterative solvers to solve $A x = b$.
CGLS¶
- template<typename data_t = real_t> class elsa::CGLS : public elsa::Solver<real_t>¶ Conjugate Gradient for Least Squares Problems.
CGLS minimizes $ \frac{1}{2} \| Ax - b \|_2^2 + \epsilon^2 \| x \|_2^2 $, where $A$ is an operator (it does not need to be square, symmetric, or positive definite), $b$ is the measured quantity, and $\epsilon$ is a damping factor.
If the damping factor $\epsilon$ is non-zero, the problem solved is a Tikhonov problem.
CGLS is equivalent to applying CG to the normal equations $A^TAx = A^Tb$. However, it does not need to actually form the normal equations. Primarily, this improves the runtime performance, and it further improves the stability of the algorithm.
References:
- Author
David Frank
Public Functions
- CGLS(const LinearOperator<data_t> &A, const DataContainer<data_t> &b, SelfType_t<data_t> eps = 0, SelfType_t<data_t> tol = 1e-4)¶ Construct the necessary form of CGLS using some linear operator and the measured data.
- Parameters
A: linear operator for the problem
b: the measured data
eps: the damping factor
tol: stopping tolerance
- ~CGLS() override = default¶ default destructor
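A minimal usage sketch: the operator A and the data b are assumed to be a LinearOperator and a DataContainer set up elsewhere (e.g. via elsa's projector and phantom utilities, which are not documented on this page).

    // Sketch: solve the (damped) least squares problem with CGLS.
    elsa::CGLS<elsa::real_t> cgls(A, b, /* eps = */ 0.0f, /* tol = */ 1e-4f);

    // run 20 iterations starting from zero
    auto x = cgls.solve(20);

    // optionally continue for 10 more iterations, warm-started from x
    auto x2 = cgls.solve(10, x);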
CGNE¶
- template<typename data_t = real_t> class elsa::CGNE : public elsa::Solver<real_t>¶ Conjugate Gradient via the Normal Equations.
CG solves the system of equations $ A x = b $, where $A$ is a symmetric positive definite operator and $b$ is the measured quantity.
In our implementation, we always assume $A$ is non-symmetric and not positive definite. Hence, we compute the solution of the normal equations $ A^T A x = A^T b $.
References:
An Introduction to the Conjugate Gradient Method Without the Agonizing Pain, by Shewchuk
- Author
David Frank
Public Functions
- CGNE(const LinearOperator<data_t> &A, const DataContainer<data_t> &b, SelfType_t<data_t> tol = 1e-4)¶ Construct the necessary form of CGNE using some linear operator and the measured data.
- Parameters
A: linear operator for the problem
b: the measured data
tol: stopping tolerance
- ~CGNE() override = default¶ default destructor
Landweber iteration¶
Landweber¶
- template<typename data_t = real_t> class elsa::Landweber : public LandweberIteration<real_t>¶ Implementation of the Landweber iteration with both $ T $ and $ M $ being the identity matrix. This reduces the update rule to:
$ x_{k+1} = x_{k} - \lambda A^T (A x_{k} - b)$
This is basically a special case of gradient descent.
- Author
David Frank
- See
LandweberIteration
Public Functions
- Landweber(const LinearOperator<data_t> &A, const DataContainer<data_t> &b, SelfType_t<data_t> stepSize)¶ Constructor for classical Landweber, accepting an operator, a measurement vector and a step size.
- Parameters
[in] A: linear operator to solve the problem with
[in] b: measurement vector of the problem
[in] stepSize: the fixed step size to be used while solving
- Landweber(const LinearOperator<data_t> &A, const DataContainer<data_t> &b)¶ Constructor for classical Landweber, accepting an operator and a measurement vector.
- Parameters
[in] A: linear operator to solve the problem with
[in] b: measurement vector of the problem
- ~Landweber() override = default¶ default destructor
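A usage sketch: A and b are again assumed to be a LinearOperator and a DataContainer constructed elsewhere, and the step size value is an arbitrary illustration.

    // Sketch: classical Landweber iteration with a fixed step size.
    elsa::Landweber<elsa::real_t> landweber(A, b, /* stepSize = */ 0.5f);
    auto x = landweber.solve(100);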
SIRT¶
- template<typename data_t = real_t> class elsa::SIRT : public LandweberIteration<real_t>¶ Implementation of the Simultaneous Iterative Reconstruction Technique (SIRT). For SIRT, $ T = \text{diag}(\frac{1}{\text{row sum}}) = \text{diag}(\frac{1}{\sum_j A_{ij}})$ and $ M = \text{diag}(\frac{1}{\text{col sum}}) = \text{diag}(\frac{1}{\sum_i A_{ij}})$.
Outside of the Computed Tomography community, this algorithm is also often known as the Simultaneous Algebraic Reconstruction Technique (SART).
- Author
David Frank
- See
LandweberIteration
Public Functions
- SIRT(const LinearOperator<data_t> &A, const DataContainer<data_t> &b, SelfType_t<data_t> stepSize)¶ Constructor for SIRT, accepting an operator, a measurement vector and a step size.
- Parameters
[in] A: linear operator to solve the problem with
[in] b: measurement vector of the problem
[in] stepSize: the fixed step size to be used while solving
- SIRT(const LinearOperator<data_t> &A, const DataContainer<data_t> &b)¶ Constructor for SIRT, accepting an operator and a measurement vector.
- Parameters
[in] A: linear operator to solve the problem with
[in] b: measurement vector of the problem
- ~SIRT() override = default¶ default destructor
GMRES¶
AB_GMRES¶
- template<typename data_t = real_t> class elsa::AB_GMRES : public elsa::Solver<real_t>¶ Class implementing the AB_GMRES method for solving with an unmatched back projector.
GMRES is an iterative method for solving an indefinite nonsymmetric linear system $ Ax = b $. It solves the system by minimizing the least squares problem $ J(y) = \left\| \beta e_{1} - \bar{H}y \right\| $, where $ \beta = \left\| b - Ax \right\| $, $ e_{1} $ is the first column of the $ (k + 1) \times (k + 1) $ identity matrix and $ \bar{H} $ is a $ (k + 1) \times k $ matrix whose only nonzero entries are generated by the algorithm.
We implement an expanded AB_GMRES method for solving large sparse least squares problems. Here we define $ B = A^T $ for our matrix $ A \in \mathbb{R}^{m \times n} $ with $ B \in \mathbb{R}^{n \times m} $. We solve the so-called unmatched normal equations $ BAx = Bb $ and $ ABy = b $ where $ x = By $:
AB_GMRES solves $ \min_{y \in \mathbb{R}^m} \left\| b - ABy \right\|$, $ x = By $.
$ A $ can be imagined as a discretization of the forward projector, while $ B $ represents a matched or unmatched back projector or preconditioner/backprojector. The algorithm can reach semi-convergence even when $ A^T \neq B $, as long as valid models for the projector/backprojector pair are chosen.
Convergence is considered reached when $ \|r\| \leq \epsilon $.
References: http://epubs.siam.org/doi/10.1137/0907058 http://epubs.siam.org/doi/10.1137/070696313 http://arxiv.org/abs/2110.01481
- Author
Daniel Klitzner - initial code
Public Functions
- AB_GMRES(const LinearOperator<data_t> &projector, const DataContainer<data_t> &sinogram, data_t epsilon = std::numeric_limits<data_t>::epsilon())¶ Constructor for AB_GMRES, accepting a forward projector A, a sinogram b and an optional value for epsilon; the backprojector B will be the adjoint of projector A.
- Parameters
[in] projector: discretization of a forward projector
[in] sinogram: sinogram of measured data obtained by applying the projector to the phantom
[in] epsilon: affects the stopping condition
- AB_GMRES(const LinearOperator<data_t> &projector, const LinearOperator<data_t> &backprojector, const DataContainer<data_t> &sinogram, data_t epsilon = std::numeric_limits<data_t>::epsilon())¶ Constructor for AB_GMRES, accepting a forward projector A, an unmatched backprojector B, a sinogram b and an optional value for epsilon.
- Parameters
[in] projector: discretization of a forward projector A
[in] backprojector: an optional unmatched/matched backprojector B
[in] sinogram: sinogram “b” of measured data obtained by applying the projector to the phantom
[in] epsilon: affects the stopping condition
- ~AB_GMRES() override = default¶ default destructor
- DataContainer<data_t> solveAndRestart(index_t iterations, index_t restarts, std::optional<DataContainer<data_t>> x0 = std::nullopt)¶ Solves the given linear system using AB-GMRES, i.e. applies i iterations with r restarts of AB-GMRES.
- Return
an approximated solution
- Parameters
[in] iterations: number of iterations to execute
[in] restarts: number of restarts to execute after iterations
[in] x0: optional approximate starting position (used for preconditioning)
- DataContainer<data_t> solve(index_t iterations, std::optional<DataContainer<data_t>> x0 = std::nullopt) override¶ Solves the given linear system using AB-GMRES, i.e. applies i iterations of AB-GMRES.
- Return
an approximated solution
- Parameters
[in] iterations: number of iterations to execute
[in] x0: optional approximate starting position (used for preconditioning)
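A usage sketch: projector, backprojector and sinogram are assumed to be set up elsewhere (e.g. as elsa projector operators and measured data).

    // Sketch: AB-GMRES with an explicitly chosen (possibly unmatched) backprojector.
    elsa::AB_GMRES<elsa::real_t> gmres(projector, backprojector, sinogram);

    // 5 restarts of 20 inner iterations each
    auto x = gmres.solveAndRestart(20, 5);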
Private Functions
Private Members
- std::unique_ptr<LinearOperator<data_t>> _A¶ Projector A.
- std::unique_ptr<LinearOperator<data_t>> _B¶ Unmatched backprojector B, used for preconditioning.
- DataContainer<data_t> _b¶ sinogram b
BA_GMRES¶
- template<typename data_t = real_t> class elsa::BA_GMRES : public elsa::Solver<real_t>¶ Class implementing the BA_GMRES method for solving with an unmatched back projector.
GMRES is an iterative method for solving an indefinite nonsymmetric linear system $ Ax = b $. It solves the system by minimizing the least squares problem $ J(y) = \left\| \beta e_{1} - \bar{H}y \right\| $, where $ \beta = \left\| b - Ax \right\| $, $ e_{1} $ is the first column of the $ (k + 1) \times (k + 1) $ identity matrix and $ \bar{H} $ is a $ (k + 1) \times k $ matrix whose only nonzero entries are generated by the algorithm.
We implement an expanded BA_GMRES method for solving large sparse least squares problems. Here we define $ B = A^T $ for our matrix $ A \in \mathbb{R}^{m \times n} $ with $ B \in \mathbb{R}^{n \times m} $. We solve the so-called unmatched normal equations $ BAx = Bb $ and $ ABy = b $ where $ x = By $:
BA_GMRES solves $ \min_{x \in \mathbb{R}^n} \left\| Bb - BAx \right\|$.
$ A $ can be imagined as a discretization of the forward projector, while $ B $ represents a matched or unmatched back projector or preconditioner/backprojector. The algorithm can reach semi-convergence even when $ A^T \neq B $, as long as valid models for the projector/backprojector pair are chosen.
Convergence is considered reached when $ \|r\| \leq \epsilon $.
References: http://epubs.siam.org/doi/10.1137/0907058 http://epubs.siam.org/doi/10.1137/070696313 http://arxiv.org/abs/2110.01481
- Author
Daniel Klitzner - initial code
Public Functions
- BA_GMRES(const LinearOperator<data_t> &projector, const DataContainer<data_t> &sinogram, data_t epsilon = std::numeric_limits<data_t>::epsilon())¶ Constructor for BA_GMRES, accepting a forward projector A, a sinogram b and an optional value for epsilon; the backprojector B will be the adjoint of projector A.
- Parameters
[in] projector: discretization of a forward projector
[in] sinogram: sinogram of measured data obtained by applying the projector to the phantom
[in] epsilon: affects the stopping condition
- BA_GMRES(const LinearOperator<data_t> &projector, const LinearOperator<data_t> &backprojector, const DataContainer<data_t> &sinogram, data_t epsilon = std::numeric_limits<data_t>::epsilon())¶ Constructor for BA_GMRES, accepting a forward projector A, an unmatched backprojector B, a sinogram b and an optional value for epsilon.
- Parameters
[in] projector: discretization of a forward projector A
[in] backprojector: an optional unmatched/matched backprojector B
[in] sinogram: sinogram “b” of measured data obtained by applying the projector to the phantom
[in] epsilon: affects the stopping condition
- ~BA_GMRES() override = default¶ default destructor
- DataContainer<data_t> solveAndRestart(index_t iterations, index_t restarts, std::optional<DataContainer<data_t>> x0 = std::nullopt)¶ Solves the given linear system using BA-GMRES, i.e. applies i iterations with r restarts of BA-GMRES.
- Return
an approximated solution
- Parameters
[in] iterations: number of iterations to execute
[in] restarts: number of restarts to execute after iterations
[in] x0: optional approximate starting position (used for preconditioning)
- DataContainer<data_t> solve(index_t iterations, std::optional<DataContainer<data_t>> x0 = std::nullopt) override¶ Solves the given linear system using BA-GMRES, i.e. applies i iterations of BA-GMRES.
- Return
an approximated solution
- Parameters
[in] iterations: number of iterations to execute
[in] x0: optional approximate starting position (used for preconditioning)
Private Functions
Private Members
- std::unique_ptr<LinearOperator<data_t>> _A¶ Projector A.
- std::unique_ptr<LinearOperator<data_t>> _B¶ Unmatched backprojector B, used for preconditioning.
- DataContainer<data_t> _b¶ sinogram b
Smooth¶
Optimization algorithms for smooth problem formulations.
First-order optimization algorithms¶
First-order algorithms solve problems of the form
\[ \min_{x \in \mathbb{R}^d} f(x) \]
with two assumptions:
$f: \mathbb{R}^d \to \mathbb{R}$ is a convex continuously differentiable function with Lipschitz continuous gradient, i.e. $f \in C_{L}^{1, 1}(\mathbb{R}^d)$ (with $L > 0$ being the Lipschitz constant)
The problem is solvable, i.e. there exists an optimal $x^{*}$
Intuition for Momentum¶
A nice analogy is a ball in hilly terrain. The ball starts at a random position with zero initial velocity. The algorithm determines the gradient of the potential energy, which is the force acting on the ball; in our case, this is exactly the (negative) gradient of $f$. The algorithm then updates the velocity, which in turn updates the position of the ball. Compare this to vanilla gradient descent, where the position is integrated directly instead of the velocity.
Phrased differently, the velocity yields a look-ahead position, to which the gradient at the current solution is applied. Nesterov’s algorithm improves on that by computing the gradient at the look-ahead position instead of at the current solution’s position.
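The following self-contained toy program (not part of the elsa API; all constants are arbitrary illustration values) contrasts the three update rules on a one-dimensional quadratic:

    #include <cstdio>

    int main() {
        const double a = 10.0;   // curvature of f(x) = 0.5 * a * x^2
        const double lr = 0.05;  // step size
        const double beta = 0.9; // momentum factor

        auto grad = [a](double x) { return a * x; }; // gradient of f

        double x_gd = 1.0;               // plain gradient descent iterate
        double x_mom = 1.0, v_mom = 0.0; // heavy-ball momentum iterate and velocity
        double x_nag = 1.0, v_nag = 0.0; // Nesterov iterate and velocity

        for (int k = 0; k < 50; ++k) {
            // gradient descent: integrate the position directly
            x_gd -= lr * grad(x_gd);

            // momentum: update a velocity from the gradient, then integrate the position
            v_mom = beta * v_mom - lr * grad(x_mom);
            x_mom += v_mom;

            // Nesterov: evaluate the gradient at the look-ahead position x + beta * v
            v_nag = beta * v_nag - lr * grad(x_nag + beta * v_nag);
            x_nag += v_nag;
        }

        std::printf("gd: %g, momentum: %g, nesterov: %g\n", x_gd, x_mom, x_nag);
        return 0;
    }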
GradientDescent¶
- template<typename data_t = real_t> class elsa::GradientDescent : public elsa::Solver<real_t>¶ Class representing a simple gradient descent solver with a fixed, given step size.
This class implements a simple gradient descent iterative solver with a fixed, given step size. No particular stopping rule is currently implemented (only a fixed number of iterations, defaulting to 100).
- Author
Tobias Lasser - initial code
- Template Parameters
data_t: data type for the domain and range of the problem, defaulting to real_t
Public Functions
- GradientDescent(const Functional<data_t> &problem, data_t stepSize)¶ Constructor for gradient descent, accepting a problem and a fixed step size.
- Parameters
[in] problem: the problem that is supposed to be solved
[in] stepSize: the fixed step size to be used while solving
- GradientDescent(const Functional<data_t> &problem)¶ Constructor for gradient descent, accepting a problem. The step size will be computed as $ \frac{1}{L} $ with $ L $ being the Lipschitz constant of the function.
- Parameters
[in] problem: the problem that is supposed to be solved
- GradientDescent(const Functional<data_t> &problem, const LineSearchMethod<data_t> &lineSearchMethod)¶ Constructor for gradient descent, accepting a problem and a line search method.
- Parameters
[in] problem: the problem that is supposed to be solved
[in] lineSearchMethod: the line search method to find the step size at each iteration
- GradientDescent(const GradientDescent<data_t>&) = delete¶ make copy constructor deletion explicit
- ~GradientDescent() override = default¶ default destructor
- DataContainer<data_t> solve(index_t iterations, std::optional<DataContainer<data_t>> x0 = std::nullopt) override¶ Solve the optimization problem, i.e. apply iterations number of iterations of gradient descent.
- Return
the approximated solution
- Parameters
[in] iterations: number of iterations to execute
Private Functions
- GradientDescent<data_t> *cloneImpl() const override¶ implement the polymorphic clone operation
Private Members
- std::unique_ptr<Functional<data_t>> _problem¶ the differentiable optimization problem
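A GradientDescent usage sketch: A and b are assumed to be set up elsewhere, and LeastSquares is assumed to be elsa's least squares Functional (its constructor signature here is an assumption for illustration).

    // Sketch: plain gradient descent on 0.5 * ||Ax - b||^2 with a fixed step size.
    elsa::LeastSquares<elsa::real_t> ls(A, b);
    elsa::GradientDescent<elsa::real_t> gd(ls, /* stepSize = */ 0.01f);
    auto x = gd.solve(100);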
Nesterov’s Fast Gradient Method¶
- template<typename data_t = real_t> class elsa::FGM : public elsa::Solver<real_t>¶ Class implementing Nesterov’s Fast Gradient Method.
This class implements Nesterov’s Fast Gradient Method. FGM is a first-order method to efficiently optimize convex functions with Lipschitz-continuous gradients.
Public Functions
- FGM(const Functional<data_t> &problem, data_t epsilon = std::numeric_limits<data_t>::epsilon())¶ Constructor for FGM, accepting an optimization problem and, optionally, a value for epsilon.
- Parameters
[in] problem: the problem that is supposed to be solved
[in] epsilon: affects the stopping condition
- FGM(const Functional<data_t> &problem, const LinearOperator<data_t> &preconditionerInverse, data_t epsilon = std::numeric_limits<data_t>::epsilon())¶ Constructor for FGM, accepting an optimization problem, the inverse of a preconditioner and, optionally, a value for epsilon.
- Parameters
[in] problem: the problem that is supposed to be solved
[in] preconditionerInverse: the inverse of the preconditioner
[in] epsilon: affects the stopping condition
- FGM(const Functional<data_t> &problem, const LineSearchMethod<data_t> &lineSearchMethod, data_t epsilon = std::numeric_limits<data_t>::epsilon())¶ Constructor for FGM, accepting an optimization problem, a line search method and, optionally, a value for epsilon.
- Parameters
[in] problem: the problem that is supposed to be solved
[in] lineSearchMethod: the line search method to find the step size at each iteration
[in] epsilon: affects the stopping condition
- FGM(const Functional<data_t> &problem, const LinearOperator<data_t> &preconditionerInverse, const LineSearchMethod<data_t> &lineSearchMethod, data_t epsilon = std::numeric_limits<data_t>::epsilon())¶ Constructor for FGM, accepting an optimization problem, the inverse of a preconditioner, a line search method and, optionally, a value for epsilon.
- Parameters
[in] problem: the problem that is supposed to be solved
[in] preconditionerInverse: the inverse of the preconditioner
[in] lineSearchMethod: the line search method to find the step size at each iteration
[in] epsilon: affects the stopping condition
- ~FGM() override = default¶ default destructor
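A usage sketch, under the same assumptions as the gradient descent example above (A, b and the LeastSquares functional are set up elsewhere):

    // Sketch: Nesterov's fast gradient method on a least squares functional.
    elsa::LeastSquares<elsa::real_t> ls(A, b);
    elsa::FGM<elsa::real_t> fgm(ls);
    auto x = fgm.solve(50);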
Private Functions
Private Members
- std::unique_ptr<Functional<data_t>> _problem¶ the differentiable optimization problem
- std::unique_ptr<LinearOperator<data_t>> _preconditionerInverse = {}¶ the inverse of the preconditioner (if supplied)
Optimized Gradient Method¶
- template<typename data_t = real_t> class elsa::OGM : public elsa::Solver<real_t>¶ Class representing the Optimized Gradient Method.
This class implements the Optimized Gradient Method as introduced by Kim and Fessler in 2016. OGM is a first-order method to efficiently optimize convex functions with Lipschitz-continuous gradients. It can be seen as an improvement on Nesterov’s Fast Gradient Method.
Public Functions
- OGM(const Functional<data_t> &problem, data_t epsilon = std::numeric_limits<data_t>::epsilon())¶ Constructor for OGM, accepting an optimization problem and, optionally, a value for epsilon.
- Parameters
[in] problem: the problem that is supposed to be solved
[in] epsilon: affects the stopping condition
- OGM(const Functional<data_t> &problem, const LinearOperator<data_t> &preconditionerInverse, data_t epsilon = std::numeric_limits<data_t>::epsilon())¶ Constructor for OGM, accepting an optimization problem, the inverse of a preconditioner and, optionally, a value for epsilon.
- Parameters
[in] problem: the problem that is supposed to be solved
[in] preconditionerInverse: the inverse of the preconditioner
[in] epsilon: affects the stopping condition
- OGM(const Functional<data_t> &problem, const LineSearchMethod<data_t> &lineSearchMethod, data_t epsilon = std::numeric_limits<data_t>::epsilon())¶ Constructor for OGM, accepting an optimization problem, a line search method and, optionally, a value for epsilon.
- Parameters
[in] problem: the problem that is supposed to be solved
[in] lineSearchMethod: the line search method to find the step size at each iteration
[in] epsilon: affects the stopping condition
- OGM(const Functional<data_t> &problem, const LinearOperator<data_t> &preconditionerInverse, const LineSearchMethod<data_t> &lineSearchMethod, data_t epsilon = std::numeric_limits<data_t>::epsilon())¶ Constructor for OGM, accepting an optimization problem, the inverse of a preconditioner, a line search method and, optionally, a value for epsilon.
- Parameters
[in] problem: the problem that is supposed to be solved
[in] preconditionerInverse: the inverse of the preconditioner
[in] lineSearchMethod: the line search method to find the step size at each iteration
[in] epsilon: affects the stopping condition
- ~OGM() override = default¶ default destructor
Private Functions
Private Members
- std::unique_ptr<Functional<data_t>> _problem¶ the differentiable optimization problem
- std::unique_ptr<LinearOperator<data_t>> _preconditionerInverse = {}¶ the inverse of the preconditioner (if supplied)
SQS Ordered Subsets¶
- template<typename data_t = real_t> class elsa::SQS : public elsa::Solver<real_t>¶ Class representing an SQS solver.
This class implements an SQS solver with multiple options for momentum acceleration and ordered subsets.
No particular stopping rule is currently implemented (only a fixed number of iterations, defaulting to 100).
Currently, this is limited to least squares problems.
References: https://doi.org/10.1109/TMI.2014.2350962
- Author
Michael Loipführer - initial code
- Template Parameters
data_t: data type for the domain and range of the problem, defaulting to real_t
Public Functions
- ~SQS() override = default¶ default destructor
- DataContainer<data_t> solve(index_t iterations, std::optional<DataContainer<data_t>> x0 = std::nullopt) override¶ Solve the optimization problem, i.e. apply iterations number of iterations of SQS.
- Return
the approximated solution
- Parameters
[in] iterations: number of iterations to execute
[in] x0: optional initial solution, set to zero if not present
Private Functions
Private Members
- std::unique_ptr<LeastSquares<data_t>> fullProblem_¶ The optimization problem to solve (TODO: this might be a nice variant?)
- std::unique_ptr<LinearOperator<data_t>> preconditioner_ = {}¶ the preconditioner (if supplied)
- bool momentumAcceleration_¶ whether to enable momentum acceleration
- bool subsetMode_¶ whether to operate in subset-based mode
Non-Smooth¶
Optimization algorithms for non-smooth problem formulations. These problem formulations usually contain at least one non-differentiable term, such as the L1-Norm.
Proximal Gradient Methods¶
Proximal gradient methods solve problems of the form
\[ \min_x g(x) + h(x) \]
where $g$ is a smooth function, and $h$ is simple (meaning its proximal operator is easy to compute).
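The basic iteration these methods build on is the standard proximal gradient update, shown here for a fixed step length $\mu$ (this restates the textbook scheme, not a specific elsa implementation detail):
\[ x_{k+1} = \operatorname{prox}_{\mu h}\left(x_k - \mu \nabla g(x_k)\right) \]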
Proximal Gradient Descent¶
- template<typename data_t = real_t> class elsa::PGD : public elsa::Solver<real_t>¶ Proximal Gradient Descent (PGD)
PGD minimizes functions of the form:
\[ \min_x g(x) + h(x) \]where $g: \mathbb{R}^n \to \mathbb{R}$ is convex and differentiable, and $h: \mathbb{R}^n \to \mathbb{R} \cup \{-\infty, \infty\}$ is closed convex. Importantly, $h$ need not be differentiable, but it needs a proximal operator. Usually, the proximal operator is assumed to be simple and to have an analytical solution.
This class currently implements the special case of $ g(x) = \frac{1}{2} ||A x - b||_2^2$. However, $h$ can be chosen freely.
Given $g$ defined as above and a convex set $\mathcal{C}$, one can define a constrained optimization problem:
\[ \min_{x \in \mathcal{C}} g(x) \]Such constraints can take the form of non-negativity or box constraints. This can be reformulated as an unconstrained problem:\[ \min_{x} g(x) + \mathcal{I}_{\mathcal{C}}(x) \]where $\mathcal{I}_{\mathcal{C}}(x)$ is the indicator function of the convex set $\mathcal{C}$, defined as:\[ \mathcal{I}_{\mathcal{C}}(x) = \begin{cases} 0, & \text{if } x \in \mathcal{C} \\ \infty, & \text{if } x \notin \mathcal{C} \end{cases} \]References:
- Note
PGD has a worst-case complexity result of $ O(1/k) $.
- Note
A special class of optimization problems is of the form:
\[ \min_{x} \frac{1}{2} || A x - b ||_2^2 + ||x||_1 \]often referred to as $\ell_1$-regularization. In this case, the proximal operator for the $\ell_1$-regularization is the soft-thresholding operator (ProximalL1). This can also be extended with constraints, such as non-negativity constraints.
- See
An accelerated version of proximal gradient descent is APGD.
- Author
Andi Braimllari - initial code
David Frank - generalization
- Template Parameters
data_t: data type for the domain and range of the problem, defaulting to real_t
Public Functions
- PGD(const LinearOperator<data_t> &A, const DataContainer<data_t> &b, const Functional<data_t> &h, std::optional<data_t> mu = std::nullopt, data_t epsilon = std::numeric_limits<data_t>::epsilon())¶ Construct PGD with a least squares data fidelity term
- Note
The step length for least squares can be chosen dependent on the Lipschitz constant $L$. Compute it using powerIterations(adjoint(A) * A). Depending on the literature, both $ \frac{2}{L} $ and $ \frac{1}{L}$ are used. If mu is not given, a default step length is chosen; the computation via the power method might be expensive.
- Parameters
A: the operator for the least squares data term
b: the measured data for the least squares data term
h: prox-friendly function
mu: the step length
- PGD(const LinearOperator<data_t> &A, const DataContainer<data_t> &b, const DataContainer<data_t> &W, const Functional<data_t> &h, std::optional<data_t> mu = std::nullopt, data_t epsilon = std::numeric_limits<data_t>::epsilon())¶ Construct PGD with a weighted least squares data fidelity term
- Note
The step length for least squares can be chosen dependent on the Lipschitz constant $L$. Compute it using powerIterations(adjoint(A) * A). Depending on the literature, both $ \frac{2}{L} $ and $ \frac{1}{L}$ are used. If mu is not given, a default step length is chosen; the computation via the power method might be expensive.
- Parameters
A: the operator for the least squares data term
b: the measured data for the least squares data term
W: the weights (usually counts / I0)
h: prox-friendly function
mu: the step length
- PGD(const Functional<data_t> &g, const Functional<data_t> &h, data_t mu, data_t epsilon = std::numeric_limits<data_t>::epsilon())¶ Construct PGD with a given data fidelity term
- Parameters
g: differentiable function
h: prox-friendly function
mu: the step length
- PGD(const LinearOperator<data_t> &A, const DataContainer<data_t> &b, const Functional<data_t> &h, const LineSearchMethod<data_t> &lineSearchMethod, data_t epsilon = std::numeric_limits<data_t>::epsilon())¶ Construct PGD with a least squares data fidelity term
- Note
The step length for least squares can be chosen dependent on the Lipschitz constant $L$. Compute it using powerIterations(adjoint(A) * A). Depending on the literature, both $ \frac{2}{L} $ and $ \frac{1}{L}$ are used.
- Parameters
A: the operator for the least squares data term
b: the measured data for the least squares data term
h: prox-friendly function
lineSearchMethod: the line search method to find the step size at each iteration
- PGD(const LinearOperator<data_t> &A, const DataContainer<data_t> &b, const DataContainer<data_t> &W, const Functional<data_t> &h, const LineSearchMethod<data_t> &lineSearchMethod, data_t epsilon = std::numeric_limits<data_t>::epsilon())¶ Construct PGD with a weighted least squares data fidelity term
- Note
The step length for least squares can be chosen dependent on the Lipschitz constant $L$. Compute it using powerIterations(adjoint(A) * A). Depending on the literature, both $ \frac{2}{L} $ and $ \frac{1}{L}$ are used.
- Parameters
A: the operator for the least squares data term
b: the measured data for the least squares data term
W: the weights (usually counts / I0)
h: prox-friendly function
lineSearchMethod: the line search method to find the step size at each iteration
- PGD(const Functional<data_t> &g, const Functional<data_t> &h, const LineSearchMethod<data_t> &lineSearchMethod, data_t epsilon = std::numeric_limits<data_t>::epsilon())¶ Construct PGD with a given data fidelity term
- Parameters
g: differentiable function
h: prox-friendly function
lineSearchMethod: the line search method to find the step size at each iteration
- ~PGD() override = default¶ default destructor
- DataContainer<data_t> solve(index_t iterations, std::optional<DataContainer<data_t>> x0 = std::nullopt) override¶ Solve the optimization problem, i.e. apply iterations number of iterations of PGD.
- Return
the approximated solution
- Parameters
[in] iterations: number of iterations to execute
[in] x0: optional initial solution, set to zero if not present
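A usage sketch for $\ell_1$-regularized least squares (A and b are assumed set up elsewhere; L1Norm and getDomainDescriptor() are assumed elsa names used here only for illustration):

    // Sketch: LASSO-style reconstruction, 0.5 * ||Ax - b||^2 + ||x||_1, via PGD.
    elsa::L1Norm<elsa::real_t> l1(A.getDomainDescriptor());
    elsa::PGD<elsa::real_t> pgd(A, b, l1, /* mu = */ 0.001f);
    auto x = pgd.solve(200);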
Protected Functions
Private Members
- std::unique_ptr<Functional<data_t>> g_¶ differentiable function of the problem formulation
- std::unique_ptr<Functional<data_t>> h_¶ prox-friendly function of the problem formulation
Accelerated Proximal Gradient Descent¶
- template<typename data_t = real_t> class elsa::APGD : public elsa::Solver<real_t>¶ Accelerated Proximal Gradient Descent (APGD)
APGD minimizes functions of the same form as PGD. See the documentation there.
This class represents an APGD solver with the following steps:
$ x_{k} = \operatorname{prox}_h(y_k - \mu A^T (Ay_k - b)) $
$ t_{k+1} = \frac{1 + \sqrt{1 + 4 t_{k}^2}}{2} $
$ y_{k+1} = x_{k} + \left(\frac{t_{k} - 1}{t_{k+1}}\right) (x_{k} - x_{k - 1}) $
APGD has a worst-case complexity result of $ O(1/k^2) $.
References: http://www.cs.cmu.edu/afs/cs/Web/People/airg/readings/2012_02_21_a_fast_iterative_shrinkage-thresholding.pdf https://arxiv.org/pdf/2008.02683.pdf
- See
For a more detailed discussion of the type of problem for this solver, see PGD.
- Author
Andi Braimllari - initial code
David Frank - generalization to APGD
- Template Parameters
data_t: data type for the domain and range of the problem, defaulting to real_t
Public Functions
- APGD(const LinearOperator<data_t> &A, const DataContainer<data_t> &b, const Functional<data_t> &h, std::optional<data_t> mu = std::nullopt, data_t epsilon = std::numeric_limits<data_t>::epsilon())¶ Construct APGD with a least squares data fidelity term
- Note
The step length for least squares can be chosen dependent on the Lipschitz constant $L$. Compute it using powerIterations(adjoint(A) * A). Depending on the source, both $ \frac{2}{L} $ and $ \frac{1}{L}$ seem common. If mu is not given, a default step length is chosen; the computation via the power method might be expensive.
- Parameters
A: the operator for the least squares data term
b: the measured data for the least squares data term
h: prox-friendly function
mu: the step length
- APGD(const LinearOperator<data_t> &A, const DataContainer<data_t> &b, const DataContainer<data_t> &W, const Functional<data_t> &h, std::optional<data_t> mu = std::nullopt, data_t epsilon = std::numeric_limits<data_t>::epsilon())¶ Construct APGD with a weighted least squares data fidelity term
- Note
The step length for least squares can be chosen dependent on the Lipschitz constant $L$. Compute it using powerIterations(adjoint(A) * A). Depending on the literature, both $ \frac{2}{L} $ and $ \frac{1}{L}$ are used. If mu is not given, a default step length is chosen; the computation via the power method might be expensive.
- Parameters
A: the operator for the least squares data term
b: the measured data for the least squares data term
W: the weights (usually counts / I0)
h: prox-friendly function
mu: the step length
- APGD(const Functional<data_t> &g, const Functional<data_t> &h, data_t mu, data_t epsilon = std::numeric_limits<data_t>::epsilon())¶ Construct APGD with a given data fidelity term
- Parameters
g: differentiable function of the LASSO problem
h: prox-friendly functional of the LASSO problem
mu: the step length
- APGD(const LinearOperator<data_t> &A, const DataContainer<data_t> &b, const Functional<data_t> &h, const LineSearchMethod<data_t> &lineSearchMethod, data_t epsilon = std::numeric_limits<data_t>::epsilon())¶ Construct APGD with a least squares data fidelity term
- Note
The step length for least squares can be chosen dependent on the Lipschitz constant $L$. Compute it using powerIterations(adjoint(A) * A). Depending on the literature, both $ \frac{2}{L} $ and $ \frac{1}{L}$ are used.
- Parameters
A: the operator for the least squares data term
b: the measured data for the least squares data term
h: prox-friendly function
lineSearchMethod: the line search method to find the step size at each iteration
- APGD(const LinearOperator<data_t> &A, const DataContainer<data_t> &b, const DataContainer<data_t> &W, const Functional<data_t> &h, const LineSearchMethod<data_t> &lineSearchMethod, data_t epsilon = std::numeric_limits<data_t>::epsilon())¶ Construct APGD with a weighted least squares data fidelity term
- Note
The step length for least squares can be chosen dependent on the Lipschitz constant $L$. Compute it using powerIterations(adjoint(A) * A). Depending on the literature, both $ \frac{2}{L} $ and $ \frac{1}{L}$ are used.
- Parameters
A: the operator for the least squares data term
b: the measured data for the least squares data term
W: the weights (usually counts / I0)
h: prox-friendly function
lineSearchMethod: the line search method to find the step size at each iteration
- APGD(const Functional<data_t> &g, const Functional<data_t> &h, const LineSearchMethod<data_t> &lineSearchMethod, data_t epsilon = std::numeric_limits<data_t>::epsilon())¶ Construct APGD with a given data fidelity term
- Parameters
g: differentiable function of the LASSO problem
h: prox-friendly functional of the LASSO problem
lineSearchMethod: the line search method to find the step size at each iteration
- ~APGD() override = default¶ default destructor
Protected Functions
Private Members
- std::unique_ptr<Functional<data_t>> g_¶ Differentiable part of the problem formulation.
- std::unique_ptr<Functional<data_t>> h_¶ Prox-friendly part of the problem formulation.
Proximal Optimized Gradient Method¶
- template<typename data_t = real_t> class elsa::POGM : public elsa::Solver<real_t>¶ Proximal Optimized Gradient Method (POGM)
POGM minimizes functions of the same form as PGD. See the documentation there.
The update steps for POGM are:
\[ \begin{split} \theta_k &= \begin{cases} \frac{1}{2}\left(1 + \sqrt{4 \theta_{k-1}^2 + 1}\right) & \text{ if } 2 \leq k < N \\ \frac{1}{2}\left(1 + \sqrt{8 \theta_{k-1}^2 + 1}\right) & \text{ if } k = N \end{cases} \\ \gamma_k &= \frac{1}{L} \frac{2\theta_{k-1} + \theta_k - 1}{\theta_k} \\ \omega_k &= x_{k-1} - \frac{1}{L} \nabla f(x_{k-1}) \\ z_k &= \omega_k + \underset{\text{Nesterov}}{\underbrace{\frac{\theta_{k-1} - 1}{\theta_k}(\omega_k - \omega_{k-1})}} + \underset{\text{OGM}}{\underbrace{\frac{\theta_{k-1}}{\theta_k}(\omega_k - x_{k-1})}} + \underset{\text{POGM}}{\underbrace{\frac{\theta_{k-1} - 1}{L \gamma_{k-1}\theta_k}(z_{k-1} - x_{k-1})}} \\ x_k &= \operatorname{prox}_{\gamma_k g}(z_k) \end{split} \]References:
- See
For a more detailed discussion of the type of problem for this solver, see PGD.
- Author
David Frank - Initial implementation
- Template Parameters
data_t: data type for the domain and range of the problem, defaulting to real_t
Public Functions
- POGM(const LinearOperator<data_t> &A, const DataContainer<data_t> &b, const Functional<data_t> &h, std::optional<data_t> mu = std::nullopt, data_t epsilon = std::numeric_limits<data_t>::epsilon())¶ Construct POGM with a least squares data fidelity term
- Note
The step length for least squares can be chosen dependent on the Lipschitz constant $L$. Compute it using powerIterations(adjoint(A) * A). Depending on the literature, both $ \frac{2}{L} $ and $ \frac{1}{L}$ are used. If mu is not given, a default step length is chosen; the computation via the power method might be expensive.
- Parameters
A: the operator for the least squares data term
b: the measured data for the least squares data term
h: prox-friendly function
mu: the step length
- POGM(const LinearOperator<data_t> &A, const DataContainer<data_t> &b, const DataContainer<data_t> &W, const Functional<data_t> &h, std::optional<data_t> mu = std::nullopt, data_t epsilon = std::numeric_limits<data_t>::epsilon())¶ Construct POGM with a weighted least squares data fidelity term
- Note
The step length for least squares can be chosen dependent on the Lipschitz constant $L$. Compute it using powerIterations(adjoint(A) * A). Depending on the literature, both $ \frac{2}{L} $ and $ \frac{1}{L}$ are used. If mu is not given, a default step length is chosen; the computation via the power method might be expensive.
- Parameters
A: the operator for the least squares data term
b: the measured data for the least squares data term
W: the weights (usually counts / I0)
h: prox-friendly function
mu: the step length
- POGM(const Functional<data_t> &g, const Functional<data_t> &h, data_t mu, data_t epsilon = std::numeric_limits<data_t>::epsilon())¶ Construct POGM with a given data fidelity term.
- POGM(const LinearOperator<data_t> &A, const DataContainer<data_t> &b, const Functional<data_t> &h, const LineSearchMethod<data_t> &lineSearchMethod, data_t epsilon = std::numeric_limits<data_t>::epsilon())¶ Construct POGM with a least squares data fidelity term
- Note
The step length for least squares can be chosen dependent on the Lipschitz constant $L$. Compute it using powerIterations(adjoint(A) * A). Depending on the literature, both $ \frac{2}{L} $ and $ \frac{1}{L}$ are used.
- Parameters
A: the operator for the least squares data term
b: the measured data for the least squares data term
h: prox-friendly function
lineSearchMethod: the line search method to find the step size at each iteration
- POGM(const LinearOperator<data_t> &A, const DataContainer<data_t> &b, const DataContainer<data_t> &W, const Functional<data_t> &h, const LineSearchMethod<data_t> &lineSearchMethod, data_t epsilon = std::numeric_limits<data_t>::epsilon())¶ Construct POGM with a weighted least squares data fidelity term
- Note
The step length for least squares can be chosen dependent on the Lipschitz constant $L$. Compute it using powerIterations(adjoint(A) * A). Depending on the literature, both $ \frac{2}{L} $ and $ \frac{1}{L}$ are used.
- Parameters
A: the operator for the least squares data term
b: the measured data for the least squares data term
W: the weights (usually counts / I0)
h: prox-friendly function
lineSearchMethod: the line search method to find the step size at each iteration
- POGM(const Functional<data_t> &g, const Functional<data_t> &h, const LineSearchMethod<data_t> &lineSearchMethod, data_t epsilon = std::numeric_limits<data_t>::epsilon())¶ Construct POGM with a given data fidelity term.
- ~POGM() override = default¶ default destructor
- DataContainer<data_t> solve(index_t iterations, std::optional<DataContainer<data_t>> x0 = std::nullopt) override¶ Solve the optimization problem, i.e. apply iterations number of iterations of POGM.
- Return
the approximated solution
- Parameters
[in] iterations: number of iterations to execute
Protected Functions
Linearized Bregman¶
- template<typename data_t = real_t> class elsa::LB : public elsa::Solver<real_t>¶ Linearized Bregman solves problems of the form:
\[ \min_{x} \frac{1}{2\mu} || A x - b ||_2^2 + |x| \]by iteratively solving:
\[ \begin{aligned} x^{k+1} & =\mu \cdot \operatorname{shrink}\left(v^k, 1\right) \\ v^{k+1} & =v^k+\beta A^{\top}\left(b-A x^{k+1}\right) \end{aligned} \]References:
Public Functions
- LB(const LinearOperator<data_t> &A, const DataContainer<data_t> &b, const ProximalOperator<data_t> &prox, data_t mu = 5, std::optional<data_t> beta = std::nullopt, data_t epsilon = 1e-5)¶ Construct Linearized Bregman.
- ~LB() override = default¶ default destructor
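A usage sketch (A, b and a suitable ProximalOperator prox, e.g. a soft-thresholding proximal, are assumed to be constructed elsewhere):

    // Sketch: linearized Bregman iteration with the default beta and epsilon.
    elsa::LB<elsa::real_t> lb(A, b, prox, /* mu = */ 5.0f);
    auto x = lb.solve(100);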
Protected Functions
-
- template<typename data_t = real_t> class elsa::ALB : public elsa::Solver<real_t>¶ Accelerated Linearized Bregman converges faster than Linearized Bregman and solves problems of the form:
\[ \min_{x} \frac{1}{2\mu} || A x - b ||_2^2 + |x| \]by iteratively solving:
\[ \begin{aligned} x^{k+1} & =\mu \cdot \operatorname{prox}\left(\tilde{v}^k, 1\right) \\ v^{k+1} & =\tilde{v}^k+\beta A^T\left(b-A x^{k+1}\right) \\ \tilde{v}^{k+1} & :=\alpha_k v^{k+1}+\left(1-\alpha_k\right) v^k \end{aligned} \]where $ \alpha_k = \frac{2k + 3}{k + 3} $
References:
Public Functions
- ALB(const LinearOperator<data_t> &A, const DataContainer<data_t> &b, const ProximalOperator<data_t> prox, data_t mu = 5, std::optional<data_t> beta = std::nullopt, data_t epsilon = 1e-5)¶ Construct Accelerated Linearized Bregman.
- ~ALB() override = default¶ default destructor
- DataContainer<data_t> setup(std::optional<DataContainer<data_t>> x) override¶ Setup variables to default values.
- DataContainer<data_t> step(DataContainer<data_t> x) override¶ Perform a single step of the algorithm.
- bool shouldStop() const override¶ Check if the L2 distance between the residual and b is below a certain threshold.
Protected Functions
-
Alternating Direction Method of Multipliers¶
The Alternating Direction Method of Multipliers (ADMM) solves problems of the form:
\[ \min f(x) + g(z) \quad \text{s.t. } Ax + Bz = c \]
with $x \in \mathbb{R}^{n}$, $z \in \mathbb{R}^{m}$, $c \in \mathbb{R}^{p}$, $A \in \mathbb{R}^{p \times n}$ and $B \in \mathbb{R}^{p \times m}$. Usually, one assumes $f$ and $g$ to be at least convex.
This problem in general is quite hard to solve for many interesting applications such as X-ray CT. However, certain special cases are quite interesting and documented below.
ADMM with L2 data fidelity term¶
- template<typename data_t = real_t> class elsa::ADMML2 : public elsa::Solver<real_t>¶ Class representing an Alternating Direction Method of Multipliers solver for a specific subset of constraints.
The general form of ADMM solves the following optimization problem:
\[ \min f(x) + g(z) \\ \text{s.t. } Ax + Bz = c \]with $x \in \mathbb{R}^n$, $z \in \mathbb{R}^m$, $A \in \mathbb{R}^{p\times n}$, $B \in \mathbb{R}^{p\times m}$ and $c \in \mathbb{R}^p$. This specific version solves the problem of the form:
\[ \min \frac{1}{2} || Op x - b ||_2^2 + g(z) \\ \text{s.t. } Ax = z \]with $B = Id$ and $c = 0$. Further, $f(x) = \frac{1}{2} || Op x - b ||_2^2$. This version of ADMM is useful, as the proximal operator is not known for the least squares functional; this class specifically implements an optimization of the first update step of ADMM. In this implementation, this is done via CGLS.
The update steps for ADMM are:
\[ x_{k+1} = \operatorname*{argmin}_{x} \frac{1}{2}||Op x - b||_2^2 + \frac{1}{2\tau} ||Ax - z_k + u_k||_2^2 \\ z_{k+1} = \operatorname{prox}_{\tau g}(Ax_{k+1} + u_{k}) \\ u_{k+1} = u_k + Ax_{k+1} - z_{k+1} \]This is further useful to solve problems such as TV regularization, by setting $A = \nabla$ and $ g = \| \cdot \|_1 $.
References:
Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers, by Boyd et al.
Chapter 5.3 of “An introduction to continuous optimization for imaging”, by Chambolle and Pock
Public Functions
- ~ADMML2() override = default¶ default destructor
Protected Functions
Linearized ADMM¶
- template<class data_t> class elsa::LinearizedADMM : public elsa::Solver<data_t>¶ Class representing the linearized version of ADMM.
The general form of ADMM solves the following optimization problem:
\[ \min f(x) + g(z) \\ \text{s.t. } Ax + Bz = c \]with $x \in \mathbb{R}^n$, $z \in \mathbb{R}^m$, $A \in \mathbb{R}^{p\times n}$, $B \in \mathbb{R}^{p\times m}$ and $c \in \mathbb{R}^p$. The linearized version assumes $A = K$ to be any linear operator, $B = - Id$ and $c = 0$. Hence, the optimization problem reduces to:
\[ \min f(x) + g(Ax) \]The main benefit of this is that in many applications both $f$ and $g$ can be simple functionals, for which proximal operators are easy to evaluate. E.g. the usual least squares formulation without regularization is achieved with $f(x) = 0$ and $g(z) = || z - b ||_2^2 $, with $ z = Kx$. For both, the proximal operator is analytically known.
Many other problems can be converted to this form in a similar fashion by “stacking” the operator $K$. E.g. $L_1$ regularization can be written as $f(x) = 0$ and $g(z) = || z_1 - b ||_2^2 + || z_2 ||_1$, with $ K = \begin{bmatrix} A \\ Id \end{bmatrix}$ and $z = \begin{bmatrix} z_1 \\ z_2 \end{bmatrix}$. Further, constraints can be added easily via the function $f$, by setting it to some indicator function (e.g. non-negativity or similar).
References:
Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers, by Boyd et al.
Chapter 5.3 of “An introduction to continuous optimization for imaging”, by Chambolle and Pock
TODO:
Once implemented, take functionals which know their respective proximal instead of handing proximal operators directly
Public Functions
- LinearizedADMM(const LinearOperator<data_t> &K, ProximalOperator<data_t> proxf, ProximalOperator<data_t> proxg, SelfType_t<data_t> sigma, std::optional<data_t> tau = std::nullopt, bool computeKNorm = true)¶ Construct the linearized version of ADMM given two proximal operators for $f$ and $g$ respectively, the potentially stacked operator $K$ and the two step length parameters $\sigma$ and $\tau$.
To ensure convergence, it is checked whether $0 < \tau < \frac{\sigma}{||K||_2^2}$. Here, $||K||_2^2$ is the operator norm, which is approximated using a couple of iterations of the power method to approximate the largest eigenvalue of the operator. If computeKNorm is false, this computation is not performed, and it is assumed that the above inequality holds.
- ~LinearizedADMM() override = default¶ default destructor
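A usage sketch (the operator K and the two proximal operators proxf and proxg are assumed to be constructed elsewhere, matching the formulation described above):

    // Sketch: linearized ADMM with a given operator and proximal operators;
    // tau is left at its default and the operator norm check stays enabled.
    elsa::LinearizedADMM<elsa::real_t> admm(K, proxf, proxg, /* sigma = */ 1.0f);
    auto x = admm.solve(100);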
- DataContainer<data_t> setup(std::optional<DataContainer<data_t>> x) override¶ Setup and reset the solver. This function should set all values, vectors and temporaries necessary to run the iterative reconstruction algorithm.
- Return
initial solution
- Parameters
[in] x: optional initial solution; if the optional is empty, a default (e.g. zero-filled) initial solution should be returned
- DataContainer<data_t> step(DataContainer<data_t> x) override¶ Perform a single step of the iterative reconstruction algorithm.
- Return
the updated solution estimate
- Parameters
[in] x: the current solution
- bool shouldStop() const override¶ Function to determine when to stop. This function should implement algorithm-specific stopping criteria.
- Return
true if the algorithm should stop
- std::string formatHeader() const override¶ Format the header string for the iterative reconstruction algorithm.
- std::string formatStep(const DataContainer<data_t> &x) const override¶ Format the step string for the iterative reconstruction algorithm.
Private Members
- std::unique_ptr<LinearOperator<data_t>> K_¶ The stacked linear operator $K$.
- ProximalOperator<data_t> proxf_¶ The proximal operator for $f$.
- ProximalOperator<data_t> proxg_¶ The proximal operator for $g$.
Orthogonal Matching Pursuit¶
- template<typename data_t = real_t> class elsa::OrthogonalMatchingPursuit : public elsa::Solver<real_t>¶ Class representing Orthogonal Matching Pursuit.
Orthogonal Matching Pursuit is a greedy algorithm to find a sparse representation. It starts with the zero vector and adds one non-zero entry per iteration. The algorithm works in the following manner:
Find the next atom that should be used in the representation. This is done by finding the atom that is most correlated with the current residual.
Construct a dictionary that only contains the atoms that are being used, defined as $ D_S $ (the dictionary restricted to the support).
The representation is the solution to the least squares problem $ \min_x \|y - D_S x\| $.
- Author
Jonas Buerger - initial code
- Template Parameters
data_t: data type for the domain and range of the problem, defaulting to real_t
Public Functions
- OrthogonalMatchingPursuit(const Dictionary<data_t> &D, const DataContainer<data_t> &y, data_t epsilon)¶ Constructor for OrthogonalMatchingPursuit, accepting a dictionary, a signal to be sparsely represented, and a value for epsilon.
- Parameters
[in] D: dictionary operator
[in] y: signal that should be sparsely represented
[in] epsilon: affects the stopping condition
- OrthogonalMatchingPursuit(const OrthogonalMatchingPursuit<data_t>&) = delete¶ make copy constructor deletion explicit
- ~OrthogonalMatchingPursuit() override = default¶ default destructor
- DataContainer<data_t> solve(index_t iterations, std::optional<DataContainer<data_t>> x0 = std::nullopt) override¶ Solve the representation problem, i.e. apply iterations number of iterations of matching pursuit.
- Return
the approximated solution
- Parameters
[in] iterations: number of iterations to execute. As OrthogonalMatchingPursuit is a greedy algorithm, this corresponds to the desired sparsity level
[in] x0: optional initial solution, set to zero if not present
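A usage sketch (the Dictionary D and the signal y are assumed to be set up elsewhere):

    // Sketch: compute a sparse representation of y with at most 10 atoms.
    elsa::OrthogonalMatchingPursuit<elsa::real_t> omp(D, y, /* epsilon = */ 1e-4f);
    auto x = omp.solve(10);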
Private Functions
helper method to find the index of the atom that is most correlated with the residual
- OrthogonalMatchingPursuit<data_t> *cloneImpl() const override¶ implement the polymorphic clone operation