Function attention_fallback
pub fn attention_fallback<B>(
    query: <B as BackendTypes>::FloatTensorPrimitive,
    key: <B as BackendTypes>::FloatTensorPrimitive,
    value: <B as BackendTypes>::FloatTensorPrimitive,
    mask: Option<<B as BackendTypes>::BoolTensorPrimitive>,
    attn_bias: Option<<B as BackendTypes>::FloatTensorPrimitive>,
    options: AttentionModuleOptions,
) -> <B as BackendTypes>::FloatTensorPrimitive
where
    B: Backend,
Computes softmax(QKᵗ * scale) · V using separate kernels. Serves as a fallback when FlashAttention is not used.
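The formula above can be illustrated with a minimal sketch on plain nested vectors. This is not the library's implementation: the helper names (`matmul`, `transpose`, `softmax_rows`, `naive_attention`) are hypothetical, the inputs are single 2-D matrices rather than backend tensor primitives, `scale` is passed explicitly, and the `mask` and `attn_bias` arguments are left out. Each step is a separate pass over the data, which loosely mirrors the "separate kernels" idea.

```rust
/// Hypothetical helper: multiply an (n x k) matrix by a (k x m) matrix.
fn matmul(a: &[Vec<f32>], b: &[Vec<f32>]) -> Vec<Vec<f32>> {
    let (n, k, m) = (a.len(), b.len(), b[0].len());
    let mut out = vec![vec![0.0f32; m]; n];
    for i in 0..n {
        for j in 0..m {
            for p in 0..k {
                out[i][j] += a[i][p] * b[p][j];
            }
        }
    }
    out
}

/// Hypothetical helper: transpose an (n x m) matrix into (m x n).
fn transpose(a: &[Vec<f32>]) -> Vec<Vec<f32>> {
    let (n, m) = (a.len(), a[0].len());
    let mut out = vec![vec![0.0f32; n]; m];
    for i in 0..n {
        for j in 0..m {
            out[j][i] = a[i][j];
        }
    }
    out
}

/// Hypothetical helper: numerically stable softmax applied to each row in place.
fn softmax_rows(a: &mut [Vec<f32>]) {
    for row in a.iter_mut() {
        // Subtract the row maximum before exponentiating to avoid overflow.
        let max = row.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
        let mut sum = 0.0f32;
        for x in row.iter_mut() {
            *x = (*x - max).exp();
            sum += *x;
        }
        for x in row.iter_mut() {
            *x /= sum;
        }
    }
}

/// Sketch of softmax(Q Kᵗ * scale) · V as three separate passes.
/// Shapes: q is (seq_q x d), k is (seq_k x d), v is (seq_k x d_v).
fn naive_attention(
    q: &[Vec<f32>],
    k: &[Vec<f32>],
    v: &[Vec<f32>],
    scale: f32,
) -> Vec<Vec<f32>> {
    // Pass 1: attention scores Q Kᵗ, then scaling.
    let mut scores = matmul(q, &transpose(k));
    for row in scores.iter_mut() {
        for x in row.iter_mut() {
            *x *= scale;
        }
    }
    // Pass 2: row-wise softmax over the key dimension.
    softmax_rows(&mut scores);
    // Pass 3: weighted sum of the value rows.
    matmul(&scores, v)
}
```

For instance, calling `naive_attention` with a single all-zero query row gives equal weight to every key, so the output row is the mean of the rows of `v`. In scaled dot-product attention a common choice for `scale` is 1/sqrt(d), where d is the query/key feature dimension.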