OffPolicyConfig

Struct OffPolicyConfig

pub struct OffPolicyConfig {
    pub num_envs: usize,
    pub autobatch_size: usize,
    pub replay_buffer_size: usize,
    pub train_interval: usize,
    pub train_steps: usize,
    pub eval_interval: usize,
    pub eval_episodes: usize,
    pub train_batch_size: usize,
    pub warmup_steps: usize,
}

Parameters for off-policy training with multiple environments and double-batching.

Fields§

§num_envs: usize

The number of environments to run simultaneously for experience collection.

§autobatch_size: usize

Number of environment states to accumulate before running one inference step with the policy. Must be less than or equal to the number of simultaneous environments.

§replay_buffer_size: usize

Max number of transitions stored in the replay buffer.

§train_interval: usize

The number of steps to collect between each step of training.

§train_steps: usize

Number of optimization steps performed at each train_interval.

§eval_interval: usize

The number of steps to collect between each evaluation.

§eval_episodes: usize

The number of episodes to run for each evaluation.

§train_batch_size: usize

The number of transitions to train on.

§warmup_steps: usize

Number of steps to collect before starting to train.
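
The scheduling fields interact with one another. As a rough illustration, the sketch below counts optimization steps under one plausible reading of the descriptions above: nothing is trained during the first `warmup_steps` collected steps, and afterwards every `train_interval` collected steps triggers `train_steps` optimization steps. The function `optimization_steps` is hypothetical and not part of the crate, and the actual trainer may count differently.

```rust
/// Hypothetical helper (not part of the crate): estimates how many
/// optimization steps have run after `collected` environment steps.
///
/// Assumption: training starts once `warmup_steps` steps are collected,
/// then `train_steps` optimization steps run every `train_interval` steps.
fn optimization_steps(
    collected: usize,
    warmup_steps: usize,
    train_interval: usize,
    train_steps: usize,
) -> usize {
    // Guard against a zero interval, which would otherwise divide by zero.
    if train_interval == 0 {
        return 0;
    }
    // Steps collected after the warmup phase (0 while still warming up).
    let after_warmup = collected.saturating_sub(warmup_steps);
    // Completed training intervals, times optimization steps per interval.
    (after_warmup / train_interval) * train_steps
}
```

Under this reading, raising `train_interval` lowers the update-to-data ratio, while raising `train_steps` increases it.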

Implementations§

impl OffPolicyConfig

pub fn new() -> OffPolicyConfig

Create a new instance of the config.

§Arguments

§Default Arguments

§`num_envs`

The number of environments to run simultaneously for experience collection.

Defaults to 1

§`autobatch_size`

Number of environment states to accumulate before running one inference step with the policy. Must be less than or equal to the number of simultaneous environments.

Defaults to 1

§`replay_buffer_size`

Max number of transitions stored in the replay buffer.

Defaults to 1024

§`train_interval`

The number of steps to collect between each step of training.

Defaults to 1

§`train_steps`

Number of optimization steps performed at each train_interval.

Defaults to 1

§`eval_interval`

The number of steps to collect between each evaluation.

Defaults to 10_000

§`eval_episodes`

The number of episodes to run for each evaluation.

Defaults to 1

§`train_batch_size`

The number of transitions to train on.

Defaults to 32

§`warmup_steps`

Number of steps to collect before starting to train.

Defaults to 0
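
The defaults listed above satisfy the documented constraint that `autobatch_size` must not exceed `num_envs` (both default to 1), but a custom configuration can violate it. The following is a minimal sketch of a pre-flight check. `check_autobatch` is a hypothetical helper, not an API of the crate, and whether the library performs a similar check itself is not stated here.

```rust
/// Hypothetical check (not part of the crate) for the documented constraint
/// that `autobatch_size` must be less than or equal to `num_envs`.
fn check_autobatch(num_envs: usize, autobatch_size: usize) -> Result<(), String> {
    if autobatch_size <= num_envs {
        Ok(())
    } else {
        // Report both values so the misconfiguration is easy to spot.
        Err(format!(
            "autobatch_size ({autobatch_size}) exceeds num_envs ({num_envs})"
        ))
    }
}
```

Running such a check before training starts turns a potentially confusing stall or panic into an explicit configuration error.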

impl OffPolicyConfig

pub fn with_num_envs(self, num_envs: usize) -> OffPolicyConfig

Sets the value for the field num_envs.

The number of environments to run simultaneously for experience collection.

Defaults to 1

pub fn with_autobatch_size(self, autobatch_size: usize) -> OffPolicyConfig

Sets the value for the field autobatch_size.

Number of environment states to accumulate before running one inference step with the policy. Must be less than or equal to the number of simultaneous environments.

Defaults to 1

pub fn with_replay_buffer_size( self, replay_buffer_size: usize, ) -> OffPolicyConfig

Sets the value for the field replay_buffer_size.

Max number of transitions stored in the replay buffer.

Defaults to 1024

pub fn with_train_interval(self, train_interval: usize) -> OffPolicyConfig

Sets the value for the field train_interval.

The number of steps to collect between each step of training.

Defaults to 1

pub fn with_train_steps(self, train_steps: usize) -> OffPolicyConfig

Sets the value for the field train_steps.

Number of optimization steps performed at each train_interval.

Defaults to 1

pub fn with_eval_interval(self, eval_interval: usize) -> OffPolicyConfig

Sets the value for the field eval_interval.

The number of steps to collect between each evaluation.

Defaults to 10_000

pub fn with_eval_episodes(self, eval_episodes: usize) -> OffPolicyConfig

Sets the value for the field eval_episodes.

The number of episodes to run for each evaluation.

Defaults to 1

pub fn with_train_batch_size(self, train_batch_size: usize) -> OffPolicyConfig

Sets the value for the field train_batch_size.

The number of transitions to train on.

Defaults to 32

pub fn with_warmup_steps(self, warmup_steps: usize) -> OffPolicyConfig

Sets the value for the field warmup_steps.

Number of steps to collect before starting to train.

Defaults to 0
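
Each `with_*` method takes `self` by value and returns the updated config, so calls can be chained. The sketch below illustrates that pattern with a local stand-in type, `OffPolicyConfigSketch`, which mirrors only a subset of the documented fields and defaults. It is an assumption made so the example compiles without the crate, not the crate's own definition.

```rust
/// Local stand-in mirroring a subset of the documented fields.
/// This is NOT the crate's type; it only illustrates the builder pattern.
#[derive(Debug, Clone, PartialEq)]
struct OffPolicyConfigSketch {
    num_envs: usize,
    autobatch_size: usize,
    replay_buffer_size: usize,
    train_batch_size: usize,
    warmup_steps: usize,
}

impl OffPolicyConfigSketch {
    /// Starts from the defaults documented above.
    fn new() -> Self {
        Self {
            num_envs: 1,
            autobatch_size: 1,
            replay_buffer_size: 1024,
            train_batch_size: 32,
            warmup_steps: 0,
        }
    }

    /// Consumes `self` and returns it with `num_envs` replaced.
    fn with_num_envs(mut self, num_envs: usize) -> Self {
        self.num_envs = num_envs;
        self
    }

    /// Consumes `self` and returns it with `autobatch_size` replaced.
    fn with_autobatch_size(mut self, autobatch_size: usize) -> Self {
        self.autobatch_size = autobatch_size;
        self
    }

    /// Consumes `self` and returns it with `warmup_steps` replaced.
    fn with_warmup_steps(mut self, warmup_steps: usize) -> Self {
        self.warmup_steps = warmup_steps;
        self
    }
}

/// Builds a config by chaining setters; untouched fields keep their defaults.
fn example_config() -> OffPolicyConfigSketch {
    OffPolicyConfigSketch::new()
        .with_num_envs(8)
        .with_autobatch_size(4)
        .with_warmup_steps(1_000)
}
```

Going by the signatures listed above, the same chain on the real type would read `OffPolicyConfig::new().with_num_envs(8).with_autobatch_size(4).with_warmup_steps(1_000)`.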

Trait Implementations§

impl Clone for OffPolicyConfig

fn clone(&self) -> OffPolicyConfig

Returns a duplicate of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl Config for OffPolicyConfig

fn save<P>(&self, file: P) -> Result<(), Error>
where P: AsRef<Path>,

Saves the configuration to a file.

fn load<P>(file: P) -> Result<Self, ConfigError>
where P: AsRef<Path>,

Loads the configuration from a file.

fn load_binary(data: &[u8]) -> Result<Self, ConfigError>

Loads the configuration from a binary buffer.

impl Debug for OffPolicyConfig

fn fmt(&self, f: &mut Formatter<'_>) -> Result<(), Error>

Formats the value using the given formatter.

impl<'de> Deserialize<'de> for OffPolicyConfig

fn deserialize<D>( deserializer: D, ) -> Result<OffPolicyConfig, <D as Deserializer<'de>>::Error>
where D: Deserializer<'de>,

Deserialize this value from the given Serde deserializer.

impl Display for OffPolicyConfig

fn fmt(&self, f: &mut Formatter<'_>) -> Result<(), Error>

Formats the value using the given formatter.

impl Serialize for OffPolicyConfig

fn serialize<S>( &self, serializer: S, ) -> Result<<S as Serializer>::Ok, <S as Serializer>::Error>
where S: Serializer,

Serialize this value into the given Serde serializer.

Auto Trait Implementations§

impl Freeze for OffPolicyConfig

impl RefUnwindSafe for OffPolicyConfig

impl Send for OffPolicyConfig

impl Sync for OffPolicyConfig

impl Unpin for OffPolicyConfig

impl UnwindSafe for OffPolicyConfig

Blanket Implementations§

impl<T> Adaptor<()> for T

fn adapt(&self)

Adapt the type to be passed to a metric.

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<C> CloneExpand for C
where C: Clone,

fn __expand_clone_method(&self, _scope: &mut Scope) -> C

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬This is a nightly-only experimental API. (clone_to_uninit)

Performs copy-assignment from self to dest.

impl<T> Downcast<T> for T

fn downcast(&self) -> &T

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T> Instrument for T

fn instrument(self, span: Span) -> Instrumented<Self>

Instruments this type with the provided Span, returning an Instrumented wrapper.

fn in_current_span(self) -> Instrumented<Self>

Instruments this type with the current Span, returning an Instrumented wrapper.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> IntoComptime for T

fn comptime(self) -> Self

impl<T> IntoEither for T

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.

impl<T> Pointable for T

const ALIGN: usize

The alignment of pointer.

type Init = T

The type for initializers.

unsafe fn init(init: <T as Pointable>::Init) -> usize

Initializes a with the given initializer.

unsafe fn deref<'a>(ptr: usize) -> &'a T

Dereferences the given pointer.

unsafe fn deref_mut<'a>(ptr: usize) -> &'a mut T

Mutably dereferences the given pointer.

unsafe fn drop(ptr: usize)

Drops the object pointed to by the given pointer.

impl<T> ToCompactString for T
where T: Display,

fn try_to_compact_string(&self) -> Result<CompactString, ToCompactStringError>

Fallible version of ToCompactString::to_compact_string().

fn to_compact_string(&self) -> CompactString

Converts the given value to a CompactString.

impl<T> ToLine for T
where T: Display,

fn to_line(&self) -> Line<'_>

Converts the value to a Line.

impl<T> ToOwned for T
where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T> ToSpan for T
where T: Display,

fn to_span(&self) -> Span<'_>

Converts the value to a Span.

impl<T> ToString for T
where T: Display + ?Sized,

fn to_string(&self) -> String

Converts the given value to a String.

impl<T> ToText for T
where T: Display,

fn to_text(&self) -> Text<'_>

Converts the value to a Text.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<T> TuneInputs for T
where T: Clone + Send + Sync + 'static,

type At<'a> = T

The concrete input type at lifetime 'a.

impl<T> Upcast<T> for T

fn upcast(&self) -> Option<&T>

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

fn vzip(self) -> V

impl<T> WithSubscriber for T

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self>
where S: Into<Dispatch>,

Attaches the provided Subscriber to this type, returning a WithDispatch wrapper.

fn with_current_subscriber(self) -> WithDispatch<Self>

Attaches the current default Subscriber to this type, returning a WithDispatch wrapper.

impl<T> DeserializeOwned for T
where T: for<'de> Deserialize<'de>,

impl<T> WasmNotSend for T
where T: Send,

impl<T> WasmNotSendSync for T
where T: WasmNotSend + WasmNotSync,

impl<T> WasmNotSync for T
where T: Sync,