argmin 0.6.0 and argmin-math 0.1.0 released
Posted August 1, 2022 by Stefan Kroboth ‐ 12 min read
argmin offers a range of numerical optimization methods in Rust.
Due to other commitments argmin didn't get as much attention as it should have in previous months and years, until the end of 2021. Having been unhappy with certain design choices for quite some time, I decided to redesign several aspects of the library. These changes have accumulated in argmin
version 0.6.0 and a new crate called argmin-math
. This blog post is an attempt to summarize what has happened; however this is by no means exhaustive.
These changes had quite a substantial impact on the API and as such are very likely to affect you as a user. For users of the optimization algorithms, adapting your code to the new version should be a fairly easy task with an occasional look at the documentation or the current examples in the Github repository. If you are using argmin to implement your own algorithms, the changes you will have to make are more severe. I suggest having a look at the source code of argmin to get a sense of how you may have to adapt your own code.
In either case, feel free to get in touch if you run into trouble, either on Gitter or by opening an issue.
argmin-math
One major change is that all math/algebra related code was split off into a dedicated crate called argmin-math
. This includes all mathematics traits as well as their implementations for various backends such as ndarray, nalgebra and basic Vec
s. The desired backend is chosen via features (multiple versions of each backend are supported). One just needs to add argmin-math
with the appropriate feature to Cargo.toml
to enable the desired backend:
[dependencies]
argmin = "0.6.0"
argmin-math = { version = "0.1.0", features = ["ndarray_v0_15-serde"] }
The argmin-math documentation lists all available features.
ArgminOp replacement
In prior versions, the ArgminOp
trait had to be implemented for optimization problems in order to work with argmin. It covered all (potentially) necessary operations, from computing the cost function, gradients, Jacobians and Hessians to annealing of parameter vectors in Simulated Annealing. Users would only implement the methods which were needed by the desired solver and leave the others unimplemented, which defaulted to panicking when called. This had a couple of disadvantages: Firstly, it made it impossible for the solvers to communicate to the user via the trait system which methods were actually needed. Secondly, it was limiting in the sense that the trait was not extendable by users which use argmin to implement their own solvers.
Therefore it was replaced by the dedicated traits CostFunction
, Gradient
, Jacobian
, Hessian
, Operator
, and Anneal
. The solvers can now precisely define which traits must be implemented for the user-defined optimization problem. More traits can easily be added if future solvers require them.
Bulk computation
Cost functions, gradients and the like can now be computed in bulk for multiple input parameter vectors. Each of the traits mentioned in the previous section provides an automatically implemented bulk_*
method for this purpose (which can be overwritten if needed). For instance, in case of the CostFunction
trait, the cost
method computes the cost function of a single parameter vector while bulk_cost
does the same for multiple parameter vectors. Bulk computation however must be supported by the solver, which so far is only Particle Swarm Optimization (but will also be supported by the upcoming CMA-ES solver and other future population based solvers).
By default the individual parameter vectors are processed sequentially. Enabling the rayon
feature enables parallel computation using rayon, which can lead to a substantial performance boost depending on your problem and your hardware. Even if rayon
is enabled, parallel processing can selectively (for particular traits) be turned off if needed (for cases where for instance computing the CostFunction
in parallel is desired, but not for Gradient
). The implementation is the result of a lot of discussion on Github which was very helpful in fleshing out the requirements of this feature. Thanks to everyone who participated in this discussion!
serde is now optional
Serialization and deserialization is needed for some observers and for checkpointing. In previous versions this affected users even if no observers and no checkpointing were used. Mainly because it required the user-defined optimization problem to implement Serialize
and Deserialize
, which constitutes additional effort and may not be feasible in particular cases. Additionally, serde
increases the compilation time. Therefore serde
has been made optional via the serde1
feature which is by default enabled. This comes with the caveat that some observers and checkpointing are not available with a disabled serde1
feature.
Internal state
Solvers have access to an internal state which is carried from one iteration to the next. This state keeps track of current and best-so-far parameter vectors, cost function values, gradients, and so on as well as function evaluation counts and the number of iterations. In previous versions only IterState
was available which tried to accommodate the needs of all solvers. However, the demands of the solvers on this internal state can vary quite a lot, which lead to a somewhat awkward and potentially confusing API.
By being generic over the states, the particular kind of state can now be chosen for each solver individually. IterState
is still general enough to serve most solvers, but for population based methods the more appropriate PopulationState
was introduced. Note that the latter is so far only in use in Particle Swarm Optimization and may change substantially when other population based solvers are implemented. More states may follow as more optimization methods are implemented. All states must offer an interface defined by the State
trait which covers basic functionality. This change will typically only affect you if you use argmin to implement your own solvers.
Initialization of solvers
The latter point also affects how solvers are initialized by a user. While each solver is configured via its constructor and the associated methods, setting problem-related parameters such as initial parameter vectors, gradients and the like is done by initializing the internal state. In previous versions, the Executor
required one to provide an initial guess for the parameter vector in its constructor, which was awkward for solvers which do not require (or even accept) an initial guess.
Therefore the Executor
s constructor was changed to only accept the solver and the optimization problem. Configuration of the internal state is done via the configure
method of Executor
, where one gets access to the internal state and its methods via a closure. The type of state offered by configure
depends on the type of state used by the chosen solver.
The following small example illustrates the changes. Setting up an executor in a version prior to 0.6.0 looked like this:
let result = Executor::new(problem, solver, initial_param)
.max_iters(1000)
.run()?;
As you can see, the initial parameter vector initial_param
was provided to the constructor and the maximum number of iterations was configured via a method of the Executor
. In the new version 0.6.0 this changed to:
let result = Executor::new(problem, solver)
.configure(|state| state.param(initial_param).max_iters(1000))
.run()?;
Both initial parameter vector and the maximum number of iterations are directly provided to the state
using Executor
s configure
method.
Note that depending on the solver, more aspects can be initialized. For instance, one can also provide an initial gradient which skips the first computation of the gradient.
Checkpointing
Checkpointing was redesigned from the ground up and is now substiantially easier to configure:
let result = Executor::new(problem, solver)
.configure(|state| state.param(initial_param).max_iters(1000))
.checkpointing(checkpoint)
.run()?;
The variable checkpoint
represents an instance of something that implements the Checkpoint
trait. argmin itself offers FileCheckpoint
which saves the checkpoint to disk. Thanks to the Checkpoint
trait you can also implement your own checkpointing mechanisms.
Problem replaces OpWrapper
OpWrapper
was used internally by the Executor
to wrap the user-defined optimization problem and count the number of function evaluations. It was renamed to Problem
and comes with a more consistent interface. Problem
itself implements CostFunction
, Gradient
, Jacobian
, Hessian
, Operator
, and Anneal
and can thus be used like the user-defined problem itself. Problem
still counts the function evaluations (also for bulk evaluation).
This change is unlikely to affect you, unless you use argmin to implement your own algorithms.
Brent's optimization method
@Armavica added Brent's optimization method (Thanks!). In the course of this the previously implemented Brent's method (the root finding method) was renamed to BrentRoot
.
Documentation
Adding missing documentation after years of development really teaches you a lesson. I've spent countless hours adding (hopefully useful) documentation and doctests to every struct, method, function, module and trait. The documentation on docs.rs now covers the API and the newly created argmin book is a set of loosely coupled tutorials on how to use different aspects of the library. I plan to continuously improve and add to the book such that it eventually becomes a valuable resource for anyone using argmin.
While the situation has improved a lot in terms of documentation, there are still corners which could use a bit more love. Feedback by users is essential for improving the quality of both the book and the documentation. Therefore please get in touch if you have suggestions or if you even want to contribute to the documentation. This is highly appreciated.
I've also updated the website, which still isn't great design-wise, but time is limited and there are more enjoyable things to do ;-)
Other changes
- Removed the
prelude
module. Alternativelyargmin::core::*
can be used, although it is not recommended. - Some interfaces were redesigned to avoid cloning.
ArgminResult
was renamed toOptimizationResult
and contains the solver, the problem and the final state of the optimization run.- The
Solver
trait was redesigned to accommodate for the new handling of (generic) internal state. KV
does not storeString
s anymore, but instead elements which aredyn Display
. Conversion into strings is done by the observer, which has the advantage that there are no computational costs for conversion when no observer is used. This will likely be further improved in the future, where theKV
will be able to store any type.- Improved the API surface to be more consistent in terms of naming.
- Particle Swarm Optimization was redesigned and uses
PopulationState
internally. Also, it is now able to compute the cost function for all individuals in the population in parallel using the above mentioned bulk processing methods. - Some of the solvers had unnecessary trait bounds which were removed.
- Removed many
unwrap()
s and replaced them with meaningful errors. - The optional
stdweb
dependency was removed. - Many more tests and doc tests were added.
ArgminError
is nownon_exhaustive
.- nalgebra 0.31 is supported.
- Many now useless macros were removed.
- Improved CI.
- Added a Code of Conduct (finally).
- Created a new logo which only really works in dark mode. Who still uses light mode anyways? ;-)
- Fixed a bug in Nelder-Mead reported by @bivhitscar (Thanks!).
- Unnecessary
Default
trait bound onTrustRegion
andSteihaug
was removed by @cfunky (Thanks!). - All argmin crates now follow SemVer strictly (I have to admit that it needed complaints to fully commit to SemVer).
Conclusion
Version 0.6.0 is a huge release for me personally because I got to make some substantial improvements to the library which hopefully made the interface more consistent and the entire code base easier to maintain. My hope is that the library is now better approachable by both users and potential contributors. Some design aspects which need to be addressed remain, and new problems will certainly arise, but for now I hope to be able to spend my time on more fun aspects of this project.
What to expect next
In the future you can expect to see more regular releases, with fewer severe changes.
One of the next steps will be a redesign of how stopping criteria work. The goal is to enable users to supply their own stopping criteria.
I look forward to implementing more solvers, in particular for constrained optimization which is not yet possible (apart from PSO and SA).
An implementation of CMA-ES by @VolodymyrOrlov is almost ready to be merged.
I hope that the changes and the improved documentation make it easier for users to give feedback and for potential contributors to join the project. Please don't hesitate to get in touch if you are looking for help, have feedback or want to contribute.
I also plan to share potential projects on the issue tracker and label them as good-first-issue
if appropriate. I will offer mentoring for newcomers if my schedule allows it.
Thanks again to everyone who contributed by issuing PRs, giving feedback and discussing issues! I really appreciate it.