MuonConfig

Struct MuonConfig 

pub struct MuonConfig { /* private fields */ }

Muon configuration.

Muon is an optimizer specifically designed for 2D parameters of neural network hidden layers (weight matrices). Other parameters such as biases and embeddings should be optimized using a standard method such as AdamW.

§Learning Rate Adjustment

Muon scales the learning rate according to each parameter’s shape so that the update RMS stays consistent across rectangular matrices. Two methods are available, where A and B are the first two dimensions of the parameter:

  • Original: Multiplies the learning rate by sqrt(max(1, A/B)). This is Keller Jordan’s method and is the default.

  • MatchRmsAdamW: Multiplies the learning rate by 0.2 * sqrt(max(A, B)). This is Moonshot’s method, designed to match AdamW’s update RMS so that AdamW hyperparameters can be reused directly.
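
The two scaling rules above can be sketched as a small standalone function. This is only an illustration of the formulas as stated here, not the crate's implementation: the names LrAdjust and lr_scale are hypothetical, and the sketch assumes the factor is multiplied into the base learning rate.

```rust
/// Hypothetical stand-in for the two adjustment methods described above.
#[derive(Clone, Copy, Debug)]
enum LrAdjust {
    Original,
    MatchRmsAdamW,
}

/// Returns the shape-dependent factor for a parameter whose first two
/// dimensions are `a` and `b` (assumed to multiply the base learning rate).
fn lr_scale(method: LrAdjust, a: usize, b: usize) -> f64 {
    let (a, b) = (a as f64, b as f64);
    match method {
        // sqrt(max(1, A/B))
        LrAdjust::Original => (a / b).max(1.0).sqrt(),
        // 0.2 * sqrt(max(A, B))
        LrAdjust::MatchRmsAdamW => 0.2 * a.max(b).sqrt(),
    }
}
```

For a 1024 x 256 matrix, the Original rule gives a factor of 2, while a 256 x 1024 matrix is left unscaled because max(1, A/B) clamps the ratio at 1.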

§Example

use burn_optim::{MuonConfig, AdjustLrFn};

// Using default (Original) method
let optimizer = MuonConfig::new().init();

// Using MatchRmsAdamW for AdamW-compatible hyperparameters
let optimizer = MuonConfig::new()
    .with_adjust_lr_fn(AdjustLrFn::MatchRmsAdamW)
    .init();

§References

Implementations§

impl MuonConfig

pub fn new() -> MuonConfig

Create a new instance of the config.

impl MuonConfig

pub fn with_momentum(self, momentum: MomentumConfig) -> MuonConfig

Momentum config.

pub fn with_ns_coefficients(self, ns_coefficients: (f32, f32, f32)) -> MuonConfig

Newton-Schulz iteration coefficients (a, b, c).

pub fn with_epsilon(self, epsilon: f32) -> MuonConfig

Epsilon for numerical stability.

pub fn with_ns_steps(self, ns_steps: usize) -> MuonConfig

Number of Newton-Schulz iteration steps.
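
To show how the coefficients (a, b, c), the step count, and epsilon could fit together, here is a minimal sketch of the quintic Newton-Schulz iteration used in Keller Jordan's reference Muon description: X is first normalized by its Frobenius norm, then each step computes A = X Xᵀ and X ← aX + (bA + cA²)X. The Matrix alias and the helper names are hypothetical, plain nested vectors stand in for tensors, and the sketch omits details such as transposing tall matrices; it is not the crate's implementation.

```rust
// Hypothetical dense-matrix helpers; a real implementation would use tensors.
type Matrix = Vec<Vec<f32>>;

fn matmul(a: &Matrix, b: &Matrix) -> Matrix {
    let (n, k, m) = (a.len(), b.len(), b[0].len());
    let mut out = vec![vec![0.0f32; m]; n];
    for i in 0..n {
        for j in 0..m {
            for p in 0..k {
                out[i][j] += a[i][p] * b[p][j];
            }
        }
    }
    out
}

fn transpose(a: &Matrix) -> Matrix {
    let (n, m) = (a.len(), a[0].len());
    let mut out = vec![vec![0.0f32; n]; m];
    for i in 0..n {
        for j in 0..m {
            out[j][i] = a[i][j];
        }
    }
    out
}

/// Approximately orthogonalizes `g` with `steps` Newton-Schulz iterations.
fn newton_schulz(g: &Matrix, coeffs: (f32, f32, f32), steps: usize, eps: f32) -> Matrix {
    let (a, b, c) = coeffs;
    // Normalize by the Frobenius norm; eps guards against division by zero.
    let norm = g.iter().flatten().map(|v| v * v).sum::<f32>().sqrt();
    let mut x: Matrix = g
        .iter()
        .map(|row| row.iter().map(|v| v / (norm + eps)).collect())
        .collect();
    for _ in 0..steps {
        let gram = matmul(&x, &transpose(&x)); // A = X Xᵀ
        let gram2 = matmul(&gram, &gram); // A²
        // poly = b*A + c*A²
        let poly: Matrix = gram
            .iter()
            .zip(&gram2)
            .map(|(r1, r2)| r1.iter().zip(r2).map(|(p, q)| b * p + c * q).collect())
            .collect();
        let px = matmul(&poly, &x);
        // X ← a*X + poly*X
        x = x
            .iter()
            .zip(&px)
            .map(|(r1, r2)| r1.iter().zip(r2).map(|(p, q)| a * p + q).collect())
            .collect();
    }
    x
}
```

With the commonly cited coefficients (3.4445, -4.7750, 2.0315) and 5 steps (assumed here; check the crate's actual defaults), the singular values of the result are pushed toward a band around 1 rather than converging exactly to 1.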

pub fn with_adjust_lr_fn(self, adjust_lr_fn: AdjustLrFn) -> MuonConfig

Learning rate adjustment method.

pub fn with_weight_decay(self, weight_decay: Option<WeightDecayConfig>) -> MuonConfig

Sets the optional weight decay config.

impl MuonConfig

pub fn init<B, M>(&self) -> OptimizerAdaptor<Muon<<B as AutodiffBackend>::InnerBackend>, M, B>

Initialize Muon optimizer.

§Returns

Returns an optimizer adaptor that can be used to optimize a module.

§Example

use burn_optim::{MuonConfig, AdjustLrFn, decay::WeightDecayConfig, momentum::MomentumConfig};

// Basic configuration with default (Original) LR adjustment
let optimizer = MuonConfig::new()
    .with_weight_decay(Some(WeightDecayConfig::new(0.01)))
    .init();

// With AdamW-compatible settings using MatchRmsAdamW
let optimizer = MuonConfig::new()
    .with_adjust_lr_fn(AdjustLrFn::MatchRmsAdamW)
    .with_weight_decay(Some(WeightDecayConfig::new(0.1)))
    .init();

// Custom momentum and NS settings
let optimizer = MuonConfig::new()
    .with_momentum(MomentumConfig {
        momentum: 0.9,
        dampening: 0.1,
        nesterov: false,
    })
    .with_ns_steps(7)
    .init();
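
As a rough guide to what the momentum, dampening, and nesterov fields in the last example control, the sketch below follows the conventional SGD-style momentum update on a single scalar. The MomentumSketch type is hypothetical, and whether Muon applies exactly this formulation internally is an assumption.

```rust
/// Hypothetical scalar model of an SGD-style momentum buffer.
struct MomentumSketch {
    momentum: f32,
    dampening: f32,
    nesterov: bool,
    velocity: Option<f32>,
}

impl MomentumSketch {
    /// Consumes one gradient value and returns the value used for the update.
    fn step(&mut self, grad: f32) -> f32 {
        let v = match self.velocity {
            // v ← momentum * v + (1 - dampening) * grad
            Some(prev) => self.momentum * prev + (1.0 - self.dampening) * grad,
            // The first step seeds the buffer with the raw gradient.
            None => grad,
        };
        self.velocity = Some(v);
        if self.nesterov {
            // Nesterov looks ahead along the velocity.
            grad + self.momentum * v
        } else {
            v
        }
    }
}
```

With momentum 0.9 and dampening 0.1, two consecutive gradients of 1.0 yield 1.0 and then 0.9 * 1.0 + 0.9 * 1.0 = 1.8 under this formulation.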

Trait Implementations§

impl Clone for MuonConfig

fn clone(&self) -> MuonConfig

Returns a duplicate of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl Config for MuonConfig

fn save<P>(&self, file: P) -> Result<(), Error>
where P: AsRef<Path>,

Saves the configuration to a file.

fn load<P>(file: P) -> Result<Self, ConfigError>
where P: AsRef<Path>,

Loads the configuration from a file.

fn load_binary(data: &[u8]) -> Result<Self, ConfigError>

Loads the configuration from a binary buffer.

impl Debug for MuonConfig

fn fmt(&self, f: &mut Formatter<'_>) -> Result<(), Error>

Formats the value using the given formatter.

impl<'de> Deserialize<'de> for MuonConfig

fn deserialize<D>(deserializer: D) -> Result<MuonConfig, <D as Deserializer<'de>>::Error>
where D: Deserializer<'de>,

Deserialize this value from the given Serde deserializer.

impl Display for MuonConfig

fn fmt(&self, f: &mut Formatter<'_>) -> Result<(), Error>

Formats the value using the given formatter.

impl Serialize for MuonConfig

fn serialize<S>(&self, serializer: S) -> Result<<S as Serializer>::Ok, <S as Serializer>::Error>
where S: Serializer,

Serialize this value into the given Serde serializer.

Auto Trait Implementations§

Blanket Implementations§

impl<T> Adaptor<()> for T

fn adapt(&self)

Adapt the type to be passed to a metric.

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬 This is a nightly-only experimental API. (clone_to_uninit)

Performs copy-assignment from self to dest.

impl<T> Downcast<T> for T

fn downcast(&self) -> &T

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T> Instrument for T

fn instrument(self, span: Span) -> Instrumented<Self>

Instruments this type with the provided Span, returning an Instrumented wrapper.

fn in_current_span(self) -> Instrumented<Self>

Instruments this type with the current Span, returning an Instrumented wrapper.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> IntoComptime for T

fn comptime(self) -> Self

impl<T> IntoEither for T

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.

impl<T> Pointable for T

const ALIGN: usize

The alignment of pointer.

type Init = T

The type for initializers.

unsafe fn init(init: <T as Pointable>::Init) -> usize

Initializes an object with the given initializer.

unsafe fn deref<'a>(ptr: usize) -> &'a T

Dereferences the given pointer.

unsafe fn deref_mut<'a>(ptr: usize) -> &'a mut T

Mutably dereferences the given pointer.

unsafe fn drop(ptr: usize)

Drops the object pointed to by the given pointer.

impl<T> ToCompactString for T
where T: Display,

fn try_to_compact_string(&self) -> Result<CompactString, ToCompactStringError>

Fallible version of ToCompactString::to_compact_string().

fn to_compact_string(&self) -> CompactString

Converts the given value to a CompactString.

impl<T> ToLine for T
where T: Display,

fn to_line(&self) -> Line<'_>

Converts the value to a Line.

impl<T> ToOwned for T
where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T> ToSpan for T
where T: Display,

fn to_span(&self) -> Span<'_>

Converts the value to a Span.

impl<T> ToString for T
where T: Display + ?Sized,

fn to_string(&self) -> String

Converts the given value to a String.

impl<T> ToText for T
where T: Display,

fn to_text(&self) -> Text<'_>

Converts the value to a Text.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<T> Upcast<T> for T

fn upcast(&self) -> Option<&T>

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

fn vzip(self) -> V

impl<T> WithSubscriber for T

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self>
where S: Into<Dispatch>,

Attaches the provided Subscriber to this type, returning a WithDispatch wrapper.

fn with_current_subscriber(self) -> WithDispatch<Self>

Attaches the current default Subscriber to this type, returning a WithDispatch wrapper.

impl<T> DeserializeOwned for T
where T: for<'de> Deserialize<'de>,

impl<T> WasmNotSend for T
where T: Send,

impl<T> WasmNotSendSync for T
where T: WasmNotSend + WasmNotSync,
impl<T> WasmNotSync for T
where T: Sync,