Module attention

Module with attention operations.

Functions

attention_fallback
Computes softmax(QKᵗ * scale) · V using separate kernels. Serves as a fallback when FlashAttention is not used.
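To make the formula concrete, the following is a minimal CPU sketch of the same computation done as three separate passes (scores, row-wise softmax, weighted sum) rather than one fused pass. It is not this crate's implementation or signature: the name `naive_attention`, the flat row-major `Vec<f32>` layout, and all parameters are assumptions chosen for illustration only.

```rust
/// Hypothetical reference implementation of softmax(Q Kᵗ * scale) · V.
/// All matrices are row-major slices (an assumption of this sketch):
/// `q` is [n, d], `k` is [m, d], `v` is [m, dv]; the result is [n, dv].
fn naive_attention(
    q: &[f32],
    k: &[f32],
    v: &[f32],
    n: usize,
    m: usize,
    d: usize,
    dv: usize,
    scale: f32,
) -> Vec<f32> {
    assert!(m > 0, "at least one key/value row is required");
    assert_eq!(q.len(), n * d);
    assert_eq!(k.len(), m * d);
    assert_eq!(v.len(), m * dv);

    // Pass 1: scores[i][j] = (q_i · k_j) * scale
    let mut scores = vec![0.0f32; n * m];
    for i in 0..n {
        for j in 0..m {
            let mut dot = 0.0f32;
            for c in 0..d {
                dot += q[i * d + c] * k[j * d + c];
            }
            scores[i * m + j] = dot * scale;
        }
    }

    // Pass 2: softmax over each row, subtracting the row maximum
    // before exponentiating for numerical stability.
    for row in scores.chunks_mut(m) {
        let max = row.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
        let mut sum = 0.0f32;
        for x in row.iter_mut() {
            *x = (*x - max).exp();
            sum += *x;
        }
        for x in row.iter_mut() {
            *x /= sum;
        }
    }

    // Pass 3: out[i] = sum over j of probs[i][j] * v_j
    let mut out = vec![0.0f32; n * dv];
    for i in 0..n {
        for j in 0..m {
            let p = scores[i * m + j];
            for c in 0..dv {
                out[i * dv + c] += p * v[j * dv + c];
            }
        }
    }
    out
}
```

Unlike a fused kernel, this sketch materializes the full [n, m] score matrix between passes, which is the memory cost a FlashAttention-style kernel is designed to avoid. A scale of 1/sqrt(d) is the conventional choice, though that is an assumption here and not something this page states.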