Function naive_attention
pub fn naive_attention<B>(
query: <B as Backend>::FloatTensorPrimitive,
key: <B as Backend>::FloatTensorPrimitive,
value: <B as Backend>::FloatTensorPrimitive,
mask: Option<<B as Backend>::BoolTensorPrimitive>,
) -> <B as Backend>::FloatTensorPrimitive
where
    B: Backend,
Computes softmax(QKᵗ / √d) · V using separate kernels. Serves as a fallback when FlashAttention is not used.
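
To illustrate the formula above, here is a minimal reference sketch in plain Rust. It does not use the Backend tensor primitives or reproduce how this function is implemented; it only walks through the same four steps (scaled scores, masking, softmax, weighted sum) for a single head on nested `Vec<f32>` matrices. The name `attention_reference` is hypothetical, and the mask convention (`true` means the position is excluded) is an assumption made for this sketch.

```rust
/// Hypothetical reference implementation of softmax(QKᵗ / √d) · V for one head.
/// `q` is [n][d], `k` is [m][d], `v` is [m][dv], `mask` is [n][m].
/// Assumption: `mask[i][j] == true` means query i must NOT attend to key j,
/// and every query keeps at least one unmasked key (otherwise the row is NaN).
fn attention_reference(
    q: &[Vec<f32>],
    k: &[Vec<f32>],
    v: &[Vec<f32>],
    mask: Option<&[Vec<bool>]>,
) -> Vec<Vec<f32>> {
    let d = q[0].len();
    let dv = v[0].len();
    let scale = 1.0 / (d as f32).sqrt();
    let mut out = Vec::with_capacity(q.len());

    for (i, q_row) in q.iter().enumerate() {
        // Step 1: scaled dot product of this query with every key.
        let mut scores: Vec<f32> = k
            .iter()
            .map(|k_row| q_row.iter().zip(k_row).map(|(a, b)| a * b).sum::<f32>() * scale)
            .collect();

        // Step 2: masked positions get -inf so they receive zero weight.
        if let Some(m) = mask {
            for (j, s) in scores.iter_mut().enumerate() {
                if m[i][j] {
                    *s = f32::NEG_INFINITY;
                }
            }
        }

        // Step 3: numerically stable softmax (subtract the row maximum first).
        let max = scores.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
        let exps: Vec<f32> = scores.iter().map(|s| (s - max).exp()).collect();
        let sum: f32 = exps.iter().sum();

        // Step 4: weighted sum of the value rows.
        let mut row = vec![0.0f32; dv];
        for (w, v_row) in exps.iter().zip(v) {
            for (o, x) in row.iter_mut().zip(v_row) {
                *o += (w / sum) * x;
            }
        }
        out.push(row);
    }
    out
}
```

Each numbered step here is a separate pass over the data, which loosely mirrors the "separate kernels" wording above; a fused implementation such as FlashAttention avoids materializing the full score matrix between those passes.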