Struct OffPolicyConfig
pub struct OffPolicyConfig {
    pub num_envs: usize,
    pub autobatch_size: usize,
    pub replay_buffer_size: usize,
    pub train_interval: usize,
    pub train_steps: usize,
    pub eval_interval: usize,
    pub eval_episodes: usize,
    pub train_batch_size: usize,
    pub warmup_steps: usize,
}
Parameters for off-policy training with multiple environments and double-batching.
Fields§
§num_envs: usize
The number of environments to run simultaneously for experience collection.
§autobatch_size: usize
The number of environment states to accumulate before running one step of inference with the policy. Must be less than or equal to the number of simultaneous environments.
§replay_buffer_size: usize
The maximum number of transitions stored in the replay buffer.
§train_interval: usize
The number of steps to collect between two rounds of training.
§train_steps: usize
The number of optimization steps performed every train_interval collected steps.
§eval_interval: usize
The number of steps to collect between two evaluations.
§eval_episodes: usize
The number of episodes to run for each evaluation.
§train_batch_size: usize
The number of transitions in each training batch.
§warmup_steps: usize
The number of steps to collect before training starts.
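The constraint on autobatch_size can be checked before training starts. The following is a minimal sketch, not part of the crate: `BatchingParams` and `check_autobatch` are hypothetical names, and the struct only mirrors the two fields involved so that the example compiles on its own.

```rust
/// Hypothetical stand-in that mirrors the two fields involved in the
/// constraint; the real type is `OffPolicyConfig`.
#[derive(Debug, Clone, PartialEq)]
pub struct BatchingParams {
    pub num_envs: usize,
    pub autobatch_size: usize,
}

/// Hypothetical helper: returns an error when `autobatch_size`
/// exceeds `num_envs`, as the field documentation requires.
pub fn check_autobatch(params: &BatchingParams) -> Result<(), String> {
    if params.autobatch_size > params.num_envs {
        return Err(format!(
            "autobatch_size ({}) must not exceed num_envs ({})",
            params.autobatch_size, params.num_envs
        ));
    }
    Ok(())
}
```

With num_envs set to 4, an autobatch_size of 4 passes the check and an autobatch_size of 5 is rejected.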
Implementations§
impl OffPolicyConfig
pub fn new() -> OffPolicyConfig
Create a new instance of the config.
§Arguments
§Default Arguments
§num_envs
The number of environments to run simultaneously for experience collection.
- Defaults to 1
§autobatch_size
The number of environment states to accumulate before running one step of inference with the policy. Must be less than or equal to the number of simultaneous environments.
- Defaults to 1
§replay_buffer_size
The maximum number of transitions stored in the replay buffer.
- Defaults to 1024
§train_interval
The number of steps to collect between two rounds of training.
- Defaults to 1
§train_steps
The number of optimization steps performed every train_interval collected steps.
- Defaults to 1
§eval_interval
The number of steps to collect between two evaluations.
- Defaults to 10_000
§eval_episodes
The number of episodes to run for each evaluation.
- Defaults to 1
§train_batch_size
The number of transitions in each training batch.
- Defaults to 32
§warmup_steps
The number of steps to collect before training starts.
- Defaults to 0
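To make the interplay of warmup_steps, train_interval, train_steps and eval_interval concrete, here is a rough sketch of the resulting schedule. It rests on assumptions that this page does not confirm: steps are counted from 1, a training round fires on every multiple of train_interval once at least warmup_steps steps have been collected, and an evaluation fires on every multiple of eval_interval. The function names are hypothetical.

```rust
/// Hypothetical helper: total optimization steps run after collecting
/// `total_steps` environment steps.
/// Assumption: a training round fires at every collected step that is a
/// multiple of `train_interval` and is at least `warmup_steps`.
pub fn optimization_steps(
    total_steps: usize,
    warmup_steps: usize,
    train_interval: usize,
    train_steps: usize,
) -> usize {
    assert!(train_interval > 0, "train_interval must be non-zero");
    let rounds = (1..=total_steps)
        .filter(|&step| step >= warmup_steps && step % train_interval == 0)
        .count();
    // Each round performs `train_steps` optimization steps.
    rounds * train_steps
}

/// Hypothetical helper: number of evaluations triggered after collecting
/// `total_steps` steps, assuming one evaluation per `eval_interval` steps.
pub fn evaluation_runs(total_steps: usize, eval_interval: usize) -> usize {
    assert!(eval_interval > 0, "eval_interval must be non-zero");
    total_steps / eval_interval
}
```

Under these assumptions, the default values (warmup_steps 0, train_interval 1, train_steps 1) give one optimization step per collected step, and the default eval_interval of 10_000 gives two evaluations over 25_000 collected steps.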
impl OffPolicyConfig
pub fn with_num_envs(self, num_envs: usize) -> OffPolicyConfig
Sets the value for the field num_envs.
The number of environments to run simultaneously for experience collection.
- Defaults to 1
pub fn with_autobatch_size(self, autobatch_size: usize) -> OffPolicyConfig
Sets the value for the field autobatch_size.
The number of environment states to accumulate before running one step of inference with the policy. Must be less than or equal to the number of simultaneous environments.
- Defaults to 1
pub fn with_replay_buffer_size(
    self,
    replay_buffer_size: usize,
) -> OffPolicyConfig
Sets the value for the field replay_buffer_size.
The maximum number of transitions stored in the replay buffer.
- Defaults to 1024
pub fn with_train_interval(self, train_interval: usize) -> OffPolicyConfig
Sets the value for the field train_interval.
The number of steps to collect between two rounds of training.
- Defaults to 1
pub fn with_train_steps(self, train_steps: usize) -> OffPolicyConfig
Sets the value for the field train_steps.
The number of optimization steps performed every train_interval collected steps.
- Defaults to 1
pub fn with_eval_interval(self, eval_interval: usize) -> OffPolicyConfig
Sets the value for the field eval_interval.
The number of steps to collect between two evaluations.
- Defaults to 10_000
pub fn with_eval_episodes(self, eval_episodes: usize) -> OffPolicyConfig
Sets the value for the field eval_episodes.
The number of episodes to run for each evaluation.
- Defaults to 1
pub fn with_train_batch_size(self, train_batch_size: usize) -> OffPolicyConfig
Sets the value for the field train_batch_size.
The number of transitions in each training batch.
- Defaults to 32
pub fn with_warmup_steps(self, warmup_steps: usize) -> OffPolicyConfig
Sets the value for the field warmup_steps.
The number of steps to collect before training starts.
- Defaults to 0
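The with_* methods take self by value and return the updated config, so calls can be chained. The sketch below reproduces that pattern on a local stand-in, `SketchConfig` (a hypothetical name), so that it compiles without the crate. Only three of the setters are mirrored, and the defaults are the ones documented for new().

```rust
/// Hypothetical local mirror of the documented fields; not the crate's type.
#[derive(Debug, Clone, PartialEq)]
pub struct SketchConfig {
    pub num_envs: usize,
    pub autobatch_size: usize,
    pub replay_buffer_size: usize,
    pub train_interval: usize,
    pub train_steps: usize,
    pub eval_interval: usize,
    pub eval_episodes: usize,
    pub train_batch_size: usize,
    pub warmup_steps: usize,
}

impl SketchConfig {
    /// Builds a config with the default values listed for `new()`.
    pub fn new() -> Self {
        Self {
            num_envs: 1,
            autobatch_size: 1,
            replay_buffer_size: 1024,
            train_interval: 1,
            train_steps: 1,
            eval_interval: 10_000,
            eval_episodes: 1,
            train_batch_size: 32,
            warmup_steps: 0,
        }
    }

    /// Consumes the config and returns it with `num_envs` replaced.
    pub fn with_num_envs(mut self, num_envs: usize) -> Self {
        self.num_envs = num_envs;
        self
    }

    /// Consumes the config and returns it with `autobatch_size` replaced.
    pub fn with_autobatch_size(mut self, autobatch_size: usize) -> Self {
        self.autobatch_size = autobatch_size;
        self
    }

    /// Consumes the config and returns it with `warmup_steps` replaced.
    pub fn with_warmup_steps(mut self, warmup_steps: usize) -> Self {
        self.warmup_steps = warmup_steps;
        self
    }
}

/// Chains setters on top of the defaults; untouched fields keep their defaults.
pub fn example_config() -> SketchConfig {
    SketchConfig::new()
        .with_num_envs(8)
        .with_autobatch_size(4)
        .with_warmup_steps(1_000)
}
```

Because each setter consumes the previous value, the chain reads as a single expression and no intermediate mutable binding is needed.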
Trait Implementations§
impl Clone for OffPolicyConfig
fn clone(&self) -> OffPolicyConfig
1.0.0 · fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl Config for OffPolicyConfig
impl Debug for OffPolicyConfig
impl<'de> Deserialize<'de> for OffPolicyConfig
fn deserialize<D>(
    deserializer: D,
) -> Result<OffPolicyConfig, <D as Deserializer<'de>>::Error>
where
    D: Deserializer<'de>,
impl Display for OffPolicyConfig
impl Serialize for OffPolicyConfig
fn serialize<S>(
    &self,
    serializer: S,
) -> Result<<S as Serializer>::Ok, <S as Serializer>::Error>
where
    S: Serializer,
Auto Trait Implementations§
impl Freeze for OffPolicyConfig
impl RefUnwindSafe for OffPolicyConfig
impl Send for OffPolicyConfig
impl Sync for OffPolicyConfig
impl Unpin for OffPolicyConfig
impl UnwindSafe for OffPolicyConfig
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<C> CloneExpand for C
where
    C: Clone,
fn __expand_clone_method(&self, _scope: &mut Scope) -> C
impl<T> CloneToUninit for T
where
    T: Clone,
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.
impl<T> Pointable for T
impl<T> ToCompactString for T
where
    T: Display,
fn try_to_compact_string(&self) -> Result<CompactString, ToCompactStringError>
Fallible version of ToCompactString::to_compact_string().
fn to_compact_string(&self) -> CompactString
Converts the given value to a CompactString.