burn::tensor::backend::ops::attention

Function attention_fallback

pub fn attention_fallback<B>(
    query: <B as BackendTypes>::FloatTensorPrimitive,
    key: <B as BackendTypes>::FloatTensorPrimitive,
    value: <B as BackendTypes>::FloatTensorPrimitive,
    mask: Option<<B as BackendTypes>::BoolTensorPrimitive>,
    attn_bias: Option<<B as BackendTypes>::FloatTensorPrimitive>,
    options: AttentionModuleOptions,
) -> <B as BackendTypes>::FloatTensorPrimitive
where
    B: Backend,

Computes softmax(QKᵀ · scale) · V using separate kernels. Serves as a fallback when FlashAttention is not used.
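The formula above can be illustrated with a minimal sketch in plain Rust. This is not Burn's implementation and does not use the Burn API: `naive_attention` is a hypothetical helper that works on row-major `Vec<Vec<f32>>` matrices for a single head, and it ignores `mask`, `attn_bias`, and `options`. It assumes non-empty inputs with matching dimensions. The three passes stand in for the "separate kernels" mentioned in the description.

```rust
/// Hypothetical illustration (not part of Burn): single-head attention
/// computed in three separate passes over row-major matrices.
/// q: [n_q][d], k: [n_k][d], v: [n_k][d_v]; returns [n_q][d_v].
fn naive_attention(
    q: &[Vec<f32>],
    k: &[Vec<f32>],
    v: &[Vec<f32>],
    scale: f32,
) -> Vec<Vec<f32>> {
    // Pass 1: scores = Q * K^T * scale
    let mut scores: Vec<Vec<f32>> = q
        .iter()
        .map(|qi| {
            k.iter()
                .map(|kj| qi.iter().zip(kj).map(|(a, b)| a * b).sum::<f32>() * scale)
                .collect()
        })
        .collect();

    // Pass 2: row-wise softmax, subtracting the row maximum for numerical stability
    for row in scores.iter_mut() {
        let max = row.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
        let mut sum = 0.0;
        for x in row.iter_mut() {
            *x = (*x - max).exp();
            sum += *x;
        }
        for x in row.iter_mut() {
            *x /= sum;
        }
    }

    // Pass 3: output = weights * V
    let d_v = v[0].len();
    scores
        .iter()
        .map(|w| {
            (0..d_v)
                .map(|c| w.iter().zip(v).map(|(wj, vj)| wj * vj[c]).sum::<f32>())
                .collect()
        })
        .collect()
}
```

A common choice for `scale` is 1/√d, where d is the query/key feature dimension. A fused kernel such as FlashAttention avoids materializing the full `scores` matrix, which is the main cost of this multi-pass approach.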
