
# Flux.jl

## Installation: OptimizationFlux.jl

To use this package, install the OptimizationFlux package:

```julia
import Pkg
Pkg.add("OptimizationFlux")
```

!!! warn

    Flux's optimizers will soon be deprecated in favor of [Optimisers.jl](https://github.com/FluxML/Optimisers.jl).
    Because of this, we recommend using the OptimizationOptimisers.jl setup instead of OptimizationFlux.jl; a sketch of that setup follows.
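If you want to follow that recommendation, a minimal sketch of the OptimizationOptimisers.jl setup looks like the following. The Rosenbrock objective, the `AutoZygote` backend, and the `maxiters` value are illustrative choices only, and it is assumed that the `Adam` rule is re-exported by OptimizationOptimisers.jl (otherwise load Optimisers.jl explicitly):

```julia
using Optimization, OptimizationOptimisers, Zygote

# Illustrative two-parameter Rosenbrock objective
rosenbrock(u, p) = (p[1] - u[1])^2 + p[2] * (u[2] - u[1]^2)^2
u0 = zeros(2)
p = [1.0, 100.0]

# These rules are gradient-based, so attach an AD backend to the objective
optf = OptimizationFunction(rosenbrock, Optimization.AutoZygote())
prob = OptimizationProblem(optf, u0, p)

# Adam here plays the role of Flux.Optimise.ADAM below; iterative
# optimizers need an explicit iteration budget via maxiters
sol = solve(prob, Adam(0.01), maxiters = 1000)
```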

## Local Unconstrained Optimizers

The following optimizers are passed directly as the second argument to `solve`; see the usage sketch after this list.

- `Flux.Optimise.Descent`: Classic gradient descent optimizer with learning rate
  - `solve(problem, Descent(η))`
  - `η` is the learning rate
  - Defaults:
    - `η = 0.1`
- `Flux.Optimise.Momentum`: Classic gradient descent optimizer with learning rate and momentum
  - `solve(problem, Momentum(η, ρ))`
  - `η` is the learning rate
  - `ρ` is the momentum
  - Defaults:
    - `η = 0.01`
    - `ρ = 0.9`
- `Flux.Optimise.Nesterov`: Gradient descent optimizer with learning rate and Nesterov momentum
  - `solve(problem, Nesterov(η, ρ))`
  - `η` is the learning rate
  - `ρ` is the Nesterov momentum
  - Defaults:
    - `η = 0.001`
    - `ρ = 0.9`
- `Flux.Optimise.RMSProp`: RMSProp optimizer
  - `solve(problem, RMSProp(η, ρ))`
  - `η` is the learning rate
  - `ρ` is the momentum
  - Defaults:
    - `η = 0.001`
    - `ρ = 0.9`
- `Flux.Optimise.ADAM`: ADAM optimizer
  - `solve(problem, ADAM(η, β::Tuple))`
  - `η` is the learning rate
  - `β::Tuple` is the decay of momentums
  - Defaults:
    - `η = 0.001`
    - `β::Tuple = (0.9, 0.999)`
- `Flux.Optimise.RADAM`: Rectified ADAM optimizer
  - `solve(problem, RADAM(η, β::Tuple))`
  - `η` is the learning rate
  - `β::Tuple` is the decay of momentums
  - Defaults:
    - `η = 0.001`
    - `β::Tuple = (0.9, 0.999)`
- `Flux.Optimise.AdaMax`: AdaMax optimizer
  - `solve(problem, AdaMax(η, β::Tuple))`
  - `η` is the learning rate
  - `β::Tuple` is the decay of momentums
  - Defaults:
    - `η = 0.001`
    - `β::Tuple = (0.9, 0.999)`
- `Flux.Optimise.ADAGrad`: ADAGrad optimizer
  - `solve(problem, ADAGrad(η))`
  - `η` is the learning rate
  - Defaults:
    - `η = 0.1`
- `Flux.Optimise.ADADelta`: ADADelta optimizer
  - `solve(problem, ADADelta(ρ))`
  - `ρ` is the gradient decay factor
  - Defaults:
    - `ρ = 0.9`
- `Flux.Optimise.AMSGrad`: AMSGrad optimizer
  - `solve(problem, AMSGrad(η, β::Tuple))`
  - `η` is the learning rate
  - `β::Tuple` is the decay of momentums
  - Defaults:
    - `η = 0.001`
    - `β::Tuple = (0.9, 0.999)`
- `Flux.Optimise.NADAM`: Nesterov variant of the ADAM optimizer
  - `solve(problem, NADAM(η, β::Tuple))`
  - `η` is the learning rate
  - `β::Tuple` is the decay of momentums
  - Defaults:
    - `η = 0.001`
    - `β::Tuple = (0.9, 0.999)`
- `Flux.Optimise.ADAMW`: ADAMW optimizer
  - `solve(problem, ADAMW(η, β::Tuple, decay))`
  - `η` is the learning rate
  - `β::Tuple` is the decay of momentums
  - `decay` is the weight decay
  - Defaults:
    - `η = 0.001`
    - `β::Tuple = (0.9, 0.999)`
    - `decay = 0`
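As a usage sketch for the list above: the Rosenbrock objective, the `AutoZygote` backend, and the `maxiters` value are illustrative choices, and Flux is loaded explicitly here in case OptimizationFlux does not re-export it:

```julia
using Optimization, OptimizationFlux, Flux, Zygote

# Illustrative two-parameter Rosenbrock objective
rosenbrock(u, p) = (p[1] - u[1])^2 + p[2] * (u[2] - u[1]^2)^2
u0 = zeros(2)
p = [1.0, 100.0]

# All of the optimizers above are gradient-based, so supply an AD backend
optf = OptimizationFunction(rosenbrock, Optimization.AutoZygote())
prob = OptimizationProblem(optf, u0, p)

# ADAM with the default hyperparameters listed above; swapping in any other
# optimizer from the list only changes this second argument to solve
sol = solve(prob, Flux.Optimise.ADAM(0.001, (0.9, 0.999)), maxiters = 1000)
```

For example, `Flux.Optimise.Descent(0.1)` or `Flux.Optimise.AMSGrad()` can be dropped into the same `solve` call without any other changes.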