Function naive_attention 

pub fn naive_attention<B>(
    query: <B as Backend>::FloatTensorPrimitive,
    key: <B as Backend>::FloatTensorPrimitive,
    value: <B as Backend>::FloatTensorPrimitive,
    mask: Option<<B as Backend>::BoolTensorPrimitive>,
) -> <B as Backend>::FloatTensorPrimitive
where B: Backend,
Computes softmax(QKᵀ / √d) · V using a separate kernel for each step instead of a single fused kernel. Serves as a fallback when FlashAttention is not used.
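
To make the formula concrete, here is a minimal plain-Rust sketch of the same computation on a single head, written against the standard library only. It is not the backend implementation: the function name `naive_attention_ref`, the `Vec<Vec<f32>>` matrix layout, and the mask convention (`true` means "do not attend") are assumptions made for illustration, and `d` is taken to be the length of each key row.

```rust
/// Reference computation of softmax(Q Kᵀ / √d) · V for one attention head.
///
/// Matrices are row-major: `query` is [n_q][d], `key` is [n_k][d],
/// `value` is [n_k][d_v]. Assumed mask convention (hypothetical, for this
/// sketch only): `mask[i][j] == true` hides key `j` from query `i`.
fn naive_attention_ref(
    query: &[Vec<f32>],
    key: &[Vec<f32>],
    value: &[Vec<f32>],
    mask: Option<&[Vec<bool>]>,
) -> Vec<Vec<f32>> {
    let d = key[0].len() as f32;
    let scale = 1.0 / d.sqrt();
    let d_v = value[0].len();

    query
        .iter()
        .enumerate()
        .map(|(i, q)| {
            // Step 1: scaled dot products between this query row and every key row.
            let mut scores: Vec<f32> = key
                .iter()
                .map(|k| q.iter().zip(k).map(|(a, b)| a * b).sum::<f32>() * scale)
                .collect();

            // Step 2: masked positions get -inf so they receive zero weight.
            if let Some(m) = mask {
                for (j, s) in scores.iter_mut().enumerate() {
                    if m[i][j] {
                        *s = f32::NEG_INFINITY;
                    }
                }
            }

            // Step 3: numerically stable softmax over the key axis.
            // Note: a row with every position masked produces NaN here.
            let max = scores.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
            let exps: Vec<f32> = scores.iter().map(|s| (s - max).exp()).collect();
            let sum: f32 = exps.iter().sum();

            // Step 4: weighted sum of the value rows.
            let mut out = vec![0.0f32; d_v];
            for (w, v) in exps.iter().zip(value) {
                for (o, x) in out.iter_mut().zip(v) {
                    *o += (w / sum) * x;
                }
            }
            out
        })
        .collect()
}
```

Each numbered step corresponds to work that a naive implementation would typically launch as its own kernel (matmul, mask fill, softmax, matmul), which is what distinguishes it from a fused FlashAttention-style kernel. A zero query row gives equal scores for every key, so the output is simply the mean of the unmasked value rows; this is a convenient sanity check.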