Module attention

Attention module

Structs

GeneratePaddingMask
Generate a padding attention mask.
MhaCache
Cache for the multi-head attention layer.
MhaInput
Input argument for the multi-head attention forward pass.
MhaOutput
Outputs of the multi-head attention forward pass.
MultiHeadAttention
The multi-head attention module as described in the paper Attention Is All You Need.
MultiHeadAttentionConfig
Configuration to create a MultiHeadAttention layer using the init function.
MultiHeadAttentionRecord
The record type for the module.
MultiHeadAttentionRecordItem
The record item type for the module.
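The core computation behind MultiHeadAttention is scaled dot-product attention. The sketch below is a minimal, stdlib-only illustration of that computation for a single head; it is not the Burn API, and the function name and slice-based signature are hypothetical, chosen only to show the math.

```rust
// Illustrative single-head scaled dot-product attention (NOT the Burn API).
// scores = Q·Kᵀ / sqrt(d_k), softmax over keys, then a weighted sum of V.
fn attention(q: &[Vec<f32>], k: &[Vec<f32>], v: &[Vec<f32>]) -> Vec<Vec<f32>> {
    let d_k = k[0].len() as f32;
    q.iter()
        .map(|qi| {
            // Raw attention scores of this query against every key.
            let scores: Vec<f32> = k
                .iter()
                .map(|kj| {
                    qi.iter().zip(kj).map(|(a, b)| a * b).sum::<f32>() / d_k.sqrt()
                })
                .collect();
            // Numerically stable softmax over the key dimension.
            let max = scores.iter().cloned().fold(f32::MIN, f32::max);
            let exps: Vec<f32> = scores.iter().map(|s| (s - max).exp()).collect();
            let sum: f32 = exps.iter().sum();
            // Weighted sum of the value vectors.
            let mut out = vec![0.0; v[0].len()];
            for (w, vj) in exps.iter().zip(v) {
                for (o, x) in out.iter_mut().zip(vj) {
                    *o += (w / sum) * x;
                }
            }
            out
        })
        .collect()
}
```

A multi-head layer runs this computation in parallel over several learned projections of Q, K, and V and concatenates the results; the real module also applies the masks listed below.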

Enums

SeqLengthOption
Sequence length options used when generating a padding mask.

Functions

generate_autoregressive_mask
Generates an autoregressive attention mask.
generate_padding_mask
Generates a padding attention mask for a batch of token sequences.
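The two mask generators above implement simple patterns. The sketch below illustrates both with stdlib-only code, independent of Burn's tensor types; the function names and `Vec<Vec<bool>>` return shape are hypothetical simplifications, and in the real functions `true` positions are the ones excluded from attention.

```rust
// Illustrative autoregressive (causal) mask: position (i, j) is masked
// (true) when j > i, so each token attends only to itself and earlier
// positions.
fn autoregressive_mask(seq_len: usize) -> Vec<Vec<bool>> {
    (0..seq_len)
        .map(|i| (0..seq_len).map(|j| j > i).collect())
        .collect()
}

// Illustrative padding mask for a batch of token-id sequences: positions
// holding `pad_token` are masked (true) so attention ignores the padding.
fn padding_mask(tokens: &[Vec<usize>], pad_token: usize) -> Vec<Vec<bool>> {
    tokens
        .iter()
        .map(|seq| seq.iter().map(|&t| t == pad_token).collect())
        .collect()
}
```

For example, `autoregressive_mask(3)` yields an upper-triangular pattern of `true` above the diagonal, and `padding_mask(&[vec![5, 7, 0, 0]], 0)` marks the two trailing pad positions.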