Burn 0.15.0 Release Notes
Mon Oct 28 2024
Guillaume Lagrange

Overview
This release brings major performance improvements to tensor operations, particularly in matrix multiplication and convolution, along with experimental ROCm/HIP and SPIR-V support enabled by CubeCL runtimes. It also introduces foundational features for multi-backend compatibility and adds new quantization operations.
Support for ONNX models has been expanded, with new operators and bug fixes that broaden coverage.
As with previous releases, this version includes various bug fixes, further performance optimizations, new tensor operations, and enhanced documentation.
Module & Tensor
• Add deform_conv2d as implemented in torchvision (#2147) @wingertge
• Add Softmin (#2358) @NoahSchiro (see the sketch after this list)
• Make tensor sync (#2392) @kingwingfly
• [Breaking] Change LR schedulers to return the initial LR at the first `.step()` (#2337) @towerpark
• Move LrSchedule generic to make it easier to use (#2309) @ArthurBrussee
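Softmin is the counterpart of softmax that assigns the highest probability to the smallest value: softmin(x) = softmax(-x). A minimal sketch, assuming `softmin` lives in `burn::tensor::activation` with the same signature as `softmax`, and that the `ndarray` feature is enabled:

```rust
use burn::backend::NdArray;
use burn::tensor::{activation, Tensor};

fn main() {
    let device = Default::default();
    let x = Tensor::<NdArray, 2>::from_floats([[1.0, 2.0, 3.0]], &device);

    // Softmin along dim 1: equivalent to softmax(-x), so the smallest
    // entry (1.0) receives the largest probability.
    let probs = activation::softmin(x, 1);
    println!("{probs}");
}
```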
Backends
• Add Candle `CudaDevice` and `MetalDevice` to avoid creating a new unique device each time (#2290) @laggui
• Add mixed-precision support to fusion (#2247) @nathanielsimard
• Add SPIR-V compiler backend to `burn-wgpu` (#2386) @wingertge (see the sketch after this list)
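With the SPIR-V compiler backend, wgpu can emit SPIR-V directly when targeting Vulkan instead of going through WGSL. User-facing tensor code does not change; a minimal sketch, where the `spirv` feature name on `burn-wgpu` is an assumption:

```rust
// Cargo.toml (assumed feature name):
// burn-wgpu = { version = "0.15", features = ["spirv"] }

use burn::backend::wgpu::{Wgpu, WgpuDevice};
use burn::tensor::Tensor;

fn main() {
    // Tensor code is unchanged; kernels are compiled to SPIR-V under the hood.
    let device = WgpuDevice::default();
    let x = Tensor::<Wgpu, 1>::from_floats([1.0, 2.0, 3.0], &device);
    println!("{}", x.mul_scalar(2.0));
}
```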
Bug Fixes
• Fix autodiff memory leak (#2347) @nathanielsimard
• Fix autodiff `abs` producing NaN when the output is 0 (#2249) @AsherJingkongChen (see the sketch after this list)
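The `abs` fix is easy to check: the derivative of |x| is undefined at 0, and the backward pass used to emit NaN there. A minimal sketch, assuming the `ndarray` and `autodiff` features:

```rust
use burn::backend::{Autodiff, NdArray};
use burn::tensor::Tensor;

type B = Autodiff<NdArray>;

fn main() {
    let device = Default::default();
    // The zero entry used to poison the whole gradient with NaN.
    let x = Tensor::<B, 1>::from_floats([0.0, -1.5, 2.0], &device).require_grad();

    let loss = x.clone().abs().sum();
    let grads = loss.backward();

    // Expect finite values, with 0.0 at the zero entry.
    println!("{}", x.grad(&grads).unwrap());
}
```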
Documentation & Examples
• Add documentation for custom `cubecl` kernels, update some outdated docs (#2404) @wingertge
• Add comments to burn fusion (#2130) @cBournhonesque
• Improve the docs for burn-tch (#2288) @kingwingfly
• Enable doc_auto_cfg to show feature requirement hints on docs.rs (#2271) @kingwingfly (see the sketch after this list)
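`doc_auto_cfg` makes docs.rs annotate items with the cargo features they require, without hand-written `doc(cfg(...))` attributes. A minimal sketch of the usual opt-in pattern; the `docsrs` cfg name and the `vision` feature here are illustrative:

```rust
// lib.rs: enable the nightly-only attribute on docs.rs builds while
// keeping stable builds untouched.
#![cfg_attr(docsrs, feature(doc_auto_cfg))]

/// docs.rs now renders an "available on crate feature `vision` only"
/// hint for feature-gated items like this one.
#[cfg(feature = "vision")]
pub mod vision {}
```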
Fixes
• Fix Huber loss documentation (#2232) @kingwingfly
• Fix the Raspberry Pi Pico example not compiling (#2220) @BjornTheProgrammer
• Fix path in book (#2262) @mehmetalianil
• Contributor Book: fix the link to primitive types in the "Serialization" page (#2362) @towerpark
• Fix simple regression batch targets (#2379) @wangjiawen2013
ONNX Support
• Add gather support for multi-dimensional indices (rank > 1) (#2199) @alteredoxide (see the sketch after this list)
• Simplify scope tracking in burn-import (#2207) @skewballfox
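Models that use Gather with multi-dimensional index tensors (embedding-style lookups, for example) previously failed to import; they now go through the standard burn-import flow. A minimal sketch of that flow in a `build.rs`; the model path is illustrative:

```rust
// build.rs
use burn_import::onnx::ModelGen;

fn main() {
    // Generates Rust source for the model at build time; ONNX Gather nodes
    // with rank > 1 indices are now handled during conversion.
    ModelGen::new()
        .input("src/model/my_model.onnx")
        .out_dir("model/")
        .run_from_script();
}
```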
Enhancements
• Improve slice kernel performance (#2252) @nathanielsimard
• Fix burn-jit conv2d excessive loop unrolling (#2263) @AsherJingkongChen
• Introduce autotuning to `conv2d` and `conv_transpose2d` with a new `im2col`/`GEMM` algorithm (#2287) @wingertge (see the sketch after this list)
• Further data locality optimizations for implicit GEMM (#2300) @wingertge
• Add utility methods to split gradients to GradientParams (#2311) @ArthurBrussee
• Add bounds checking to implicit GEMM to allow arbitrary input shapes (#2354) @wingertge
• Initialize accumulator to bias for implicit GEMM to save an expensive `float_add` (#2383) @wingertge
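Autotuning is transparent: the first time a convolution with a given shape runs, the backend benchmarks the candidate kernels (direct, implicit GEMM, and the new im2col/GEMM) and caches the fastest. A minimal sketch on the wgpu backend; no user-facing API changes are involved:

```rust
use burn::backend::wgpu::{Wgpu, WgpuDevice};
use burn::nn::conv::{Conv2d, Conv2dConfig};
use burn::tensor::Tensor;

fn main() {
    let device = WgpuDevice::default();

    // An ordinary conv layer; kernel selection happens behind this call.
    let conv: Conv2d<Wgpu> = Conv2dConfig::new([3, 16], [3, 3]).init(&device);

    let input = Tensor::<Wgpu, 4>::ones([8, 3, 64, 64], &device);
    let output = conv.forward(input);
    println!("{:?}", output.dims()); // [8, 16, 62, 62] with default stride/padding
}
```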
Refactoring
• Migrate the select kernel from CPA to CubeCL (#2168) @mepatrick73
• Migrate the cubecl macro (#2266) @wingertge
• Refactor elemwise fusion (#2344) @nathanielsimard
• Refactor adaptive average pooling to CubeCL (#2351) @nathanielsimard
• Refactor pooling kernels (#2356) @nathanielsimard
• Refactor burn-tensor: split conv backward ops to allow conditional gradient computation (#2278) @AsherJingkongChen
Miscellaneous
• Fix panic messages being invisible in TUI mode (#2226) @PaulWagener
• Set MSRV to 1.81 (#2388) @nathanielsimard
• Don't panic when progress is > 1.0 (#2229) @PaulWagener
• Fix compilation of the dataset crate with the vision feature (#2228) @PaulWagener
• Update rusqlite and associated libraries (#2328) @paulirotta
• Fix missing fusion feature flag @nathanielsimard
• Add should_run for convs instead of panicking (#2403) @ArthurBrussee
• Add Windows/WindowsIterator/WindowsDataset (#2338) @NicoZweifel (see the sketch after this list)
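`WindowsDataset` yields overlapping fixed-size windows over an existing dataset, which is handy for sequence models. A minimal sketch; the constructor signature is an assumption modeled on the other burn-dataset transforms:

```rust
use burn::data::dataset::transform::WindowsDataset;
use burn::data::dataset::{Dataset, InMemDataset};

fn main() {
    // A toy in-memory dataset of consecutive readings.
    let dataset = InMemDataset::new(vec![1, 2, 3, 4, 5]);

    // Sliding windows of size 3: [1, 2, 3], [2, 3, 4], [3, 4, 5].
    let windows = WindowsDataset::new(dataset, 3);
    for i in 0..windows.len() {
        println!("{:?}", windows.get(i));
    }
}
```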