Struct burn::nn::attention::MultiHeadAttention
pub struct MultiHeadAttention<B>
where
    B: Backend,
{
    pub query: Linear<B>,
    pub key: Linear<B>,
    pub value: Linear<B>,
    pub output: Linear<B>,
    pub dropout: Dropout,
    pub activation: Gelu,
    pub d_model: usize,
    pub n_heads: usize,
    pub d_k: usize,
    pub min_float: f64,
    pub quiet_softmax: bool,
}
The multi-head attention module, as described in the paper Attention Is All You Need.
Params
- query: Linear layer with d_model input and output features.
- key: Linear layer with d_model input and output features.
- value: Linear layer with d_model input and output features.
- output: Linear layer with d_model input and output features.
Should be created with MultiHeadAttentionConfig.
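As a non-compiled usage sketch (assuming a `device` and an input `tensor` of shape [batch_size, seq_length, d_model] are already in scope; the `MhaInput::self_attn` constructor builds a self-attention input where query, key, and value are the same tensor):

```rust
// Sketch only: build the module from its config, then run self-attention.
let mha = MultiHeadAttentionConfig::new(512, 8) // d_model = 512, n_heads = 8
    .init(&device);
let output = mha.forward(MhaInput::self_attn(tensor));
```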
Fields
query: Linear<B>
Linear layer to transform the input features into the query space.
key: Linear<B>
Linear layer to transform the input features into the key space.
value: Linear<B>
Linear layer to transform the input features into the value space.
output: Linear<B>
Linear layer to transform the output features back to the original space.
dropout: Dropout
Dropout layer.
activation: Gelu
Activation function.
d_model: usize
The size of each linear layer.
n_heads: usize
The number of heads.
d_k: usize
Size of the key and query vectors.
min_float: f64
Minimum value a float can take; used to mask out attention scores before the softmax.
quiet_softmax: bool
Use “quiet softmax” instead of regular softmax.
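The "quiet softmax" variant adds 1 to the softmax denominator, so a head can emit near-zero attention everywhere instead of being forced to distribute a total weight of 1. A self-contained sketch of the difference (not burn's actual kernel, which operates on tensors):

```rust
// Standard softmax: weights always sum to 1.
fn softmax(scores: &[f64]) -> Vec<f64> {
    let max = scores.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    let exps: Vec<f64> = scores.iter().map(|s| (s - max).exp()).collect();
    let sum: f64 = exps.iter().sum();
    exps.iter().map(|e| e / sum).collect()
}

// Quiet softmax: the "+ 1.0" lets all weights approach 0 when every
// score is very negative. No max-subtraction trick here so the "+ 1"
// keeps its meaning; fine for a sketch, not robust for large scores.
fn quiet_softmax(scores: &[f64]) -> Vec<f64> {
    let exps: Vec<f64> = scores.iter().map(|s| s.exp()).collect();
    let sum: f64 = exps.iter().sum::<f64>() + 1.0;
    exps.iter().map(|e| e / sum).collect()
}

fn main() {
    let s: f64 = softmax(&[2.0, 1.0, 0.5]).iter().sum();
    assert!((s - 1.0).abs() < 1e-9); // regular softmax sums to ~1
    let q: f64 = quiet_softmax(&[-30.0, -30.0, -30.0]).iter().sum();
    assert!(q < 1e-9); // quiet softmax can go (almost) silent
}
```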
Implementations§
impl<B> MultiHeadAttention<B>
where
    B: Backend,
pub fn forward(&self, input: MhaInput<B>) -> MhaOutput<B>
Applies the forward pass on the input tensors.
See MultiHeadAttention for more information.
Shapes
- query: [batch_size, seq_length_1, d_model]
- key: [batch_size, seq_length_2, d_model]
- value: [batch_size, seq_length_2, d_model]
- output: [batch_size, seq_length_1, d_model]
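To make the shape contract concrete, here is a minimal single-batch, single-head scaled dot-product attention over plain `Vec`s (a sketch, not burn's implementation): a query of shape [seq_1, d] attends over key/value of shape [seq_2, d] and produces [seq_1, d].

```rust
type Mat = Vec<Vec<f64>>;

// Scaled dot-product attention for one batch and one head:
// q: [seq_1, d], k/v: [seq_2, d] -> output: [seq_1, d].
fn attention(q: &Mat, k: &Mat, v: &Mat) -> Mat {
    let d = q[0].len() as f64;
    q.iter()
        .map(|qi| {
            // scores_j = (qi . k_j) / sqrt(d)
            let scores: Vec<f64> = k
                .iter()
                .map(|kj| qi.iter().zip(kj).map(|(a, b)| a * b).sum::<f64>() / d.sqrt())
                .collect();
            // softmax over the seq_2 axis
            let max = scores.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
            let exps: Vec<f64> = scores.iter().map(|s| (s - max).exp()).collect();
            let sum: f64 = exps.iter().sum();
            // weighted sum of value rows -> one output row of length d
            (0..v[0].len())
                .map(|c| exps.iter().zip(v).map(|(w, vj)| w / sum * vj[c]).sum())
                .collect()
        })
        .collect()
}

fn main() {
    let q = vec![vec![0.1; 4]; 3]; // seq_1 = 3, d_model = 4
    let k = vec![vec![0.2; 4]; 5]; // seq_2 = 5
    let v = vec![vec![1.0; 4]; 5];
    let out = attention(&q, &k, &v);
    assert_eq!((out.len(), out[0].len()), (3, 4)); // [seq_1, d_model]
}
```

Note that seq_length_1 (query) and seq_length_2 (key/value) may differ; only the output inherits the query's sequence length.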
pub fn forward_cache(
    &self,
    input: MhaInput<B>,
    cache: &mut MhaCache<B>,
) -> MhaOutput<B>
Applies the forward pass using a cache, typically for autoregressive decoding.
Shapes
- query: [batch_size, seq_length_1, d_model]
- key: [batch_size, seq_length_2, d_model]
- value: [batch_size, seq_length_2, d_model]
- output: [batch_size, seq_length_1, d_model]
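burn's `MhaCache` is opaque here, but the idea behind any key/value cache can be sketched in plain Rust (hypothetical `KvCache` type, not burn's): projections for already-seen positions are stored once, and each decoding step only appends the new token's key and value, so seq_length_2 grows by one per step instead of being recomputed.

```rust
// Concept sketch of a key/value cache for autoregressive decoding.
struct KvCache {
    keys: Vec<Vec<f64>>,   // [cached_len, d_k]
    values: Vec<Vec<f64>>, // [cached_len, d_k]
}

impl KvCache {
    fn new() -> Self {
        Self { keys: Vec::new(), values: Vec::new() }
    }

    // Append one decoding step; returns the effective seq_length_2
    // that attention will run over after this step.
    fn push(&mut self, k: Vec<f64>, v: Vec<f64>) -> usize {
        self.keys.push(k);
        self.values.push(v);
        self.keys.len()
    }
}

fn main() {
    let mut cache = KvCache::new();
    for step in 1..=4 {
        let len = cache.push(vec![0.0; 8], vec![0.0; 8]);
        assert_eq!(len, step); // seq_length_2 grows by one per step
    }
    assert_eq!(cache.values.len(), 4);
}
```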
Trait Implementations
impl<B> AutodiffModule<B> for MultiHeadAttention<B>
type InnerModule = MultiHeadAttention<<B as AutodiffBackend>::InnerBackend>
Inner module without auto-differentiation.
fn valid(&self) -> <MultiHeadAttention<B> as AutodiffModule<B>>::InnerModule
Get the same module, but on the inner backend without auto-differentiation.
impl<B> Clone for MultiHeadAttention<B>
where
    B: Backend,
fn clone(&self) -> MultiHeadAttention<B>
Returns a copy of the value.
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl<B> Debug for MultiHeadAttention<B>
impl<B> Display for MultiHeadAttention<B>
where
    B: Backend,
impl<B> Module<B> for MultiHeadAttention<B>
where
    B: Backend,
type Record = MultiHeadAttentionRecord<B>
Type to save and load the module.
fn load_record(
    self,
    record: <MultiHeadAttention<B> as Module<B>>::Record,
) -> MultiHeadAttention<B>
Load the module state from a record.
fn into_record(self) -> <MultiHeadAttention<B> as Module<B>>::Record
Convert the module into a record containing the state.
fn num_params(&self) -> usize
Get the number of parameters the module has, including all of its sub-modules.
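For this module, a back-of-envelope count follows from the Params section above: assuming each of the four Linear layers (query, key, value, output) is [d_model, d_model] with a bias of size d_model, and noting that Dropout and Gelu hold no parameters (a sketch of the arithmetic, not burn's counting code):

```rust
// 4 linear layers, each with a d_model x d_model weight plus a d_model bias.
fn mha_num_params(d_model: usize) -> usize {
    4 * (d_model * d_model + d_model)
}

fn main() {
    assert_eq!(mha_num_params(512), 1_050_624);
    assert_eq!(mha_num_params(64), 16_640);
}
```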
fn visit<Visitor>(&self, visitor: &mut Visitor)
where
    Visitor: ModuleVisitor<B>,
Visit each tensor parameter in the module with a visitor.
fn map<Mapper>(self, mapper: &mut Mapper) -> MultiHeadAttention<B>
where
    Mapper: ModuleMapper<B>,
Map each tensor parameter in the module with a mapper.
fn collect_devices(
    &self,
    devices: Vec<<B as Backend>::Device>,
) -> Vec<<B as Backend>::Device>
Return all the devices found in the module tree, appended to the given vector without duplicates.
fn to_device(self, device: &<B as Backend>::Device) -> MultiHeadAttention<B>
Move the module and all of its sub-modules to the given device.
fn fork(self, device: &<B as Backend>::Device) -> MultiHeadAttention<B>
Fork the module and all of its sub-modules to the given device.
fn devices(&self) -> Vec<<B as Backend>::Device>
Return all the devices found in the module tree, without duplicates.
fn save_file<FR, PB>(
    self,
    file_path: PB,
    recorder: &FR,
) -> Result<(), RecorderError>
Save the module to a file using the provided file recorder.
fn load_file<FR, PB>(
    self,
    file_path: PB,
    recorder: &FR,
    device: &<B as Backend>::Device,
) -> Result<Self, RecorderError>
Load the module from a file using the provided file recorder.
fn quantize_weights<C>(self, quantizer: &mut Quantizer<C>) -> Self
where
    C: Calibration,
Quantize the weights of the module.
impl<B> ModuleDisplay for MultiHeadAttention<B>
where
    B: Backend,
fn custom_settings(&self) -> Option<DisplaySettings>
Custom display settings for the module.
fn custom_content(&self, content: Content) -> Option<Content>
Custom attributes for the module.
fn format(&self, passed_settings: DisplaySettings) -> String
Formats the module with the provided display settings.
impl<B> ModuleDisplayDefault for MultiHeadAttention<B>
where
    B: Backend,
Auto Trait Implementations
impl<B> !Freeze for MultiHeadAttention<B>
impl<B> !RefUnwindSafe for MultiHeadAttention<B>
impl<B> Send for MultiHeadAttention<B>
impl<B> !Sync for MultiHeadAttention<B>
impl<B> Unpin for MultiHeadAttention<B>
where
    <B as Backend>::FloatTensorPrimitive: Unpin,
    <B as Backend>::QuantizedTensorPrimitive: Unpin,
    <B as Backend>::Device: Unpin,
impl<B> UnwindSafe for MultiHeadAttention<B>
where
    <B as Backend>::FloatTensorPrimitive: UnwindSafe,
    <B as Backend>::QuantizedTensorPrimitive: UnwindSafe,
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.
impl<T> CloneToUninit for T
where
    T: Clone,
unsafe fn clone_to_uninit(&self, dst: *mut T)
🔬 This is a nightly-only experimental API. (clone_to_uninit)
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true.
Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true.
Converts self into a Right variant of Either<Self, Self> otherwise.
impl<T> Pointable for T
impl<T> ToCompactString for T
where
    T: Display,
fn try_to_compact_string(&self) -> Result<CompactString, ToCompactStringError>
Fallible version of ToCompactString::to_compact_string().
fn to_compact_string(&self) -> CompactString
Converts the given value to a CompactString.