Record

Records are how states are saved with Burn. Compared to most other frameworks, Burn has its own advanced saving mechanism that allows interoperability between backends with minimal possible runtime errors. There are multiple reasons why Burn decided to create its own saving formats.

First, Rust has serde, which is an extremely well-developed serialization and deserialization library that also powers the safetensors format developed by Hugging Face. If used properly, all the validations are done when deserializing, which removes the need to write validation code. Since modules in Burn are created with configurations, they can't implement serialization and deserialization. That's why the record system was created: allowing you to save the state of modules independently of the backend in use extremely fast while still giving you all the flexibility possible to include any non-serializable field within your module.

Why not use safetensors?

safetensors uses serde with the JSON file format and only supports serializing and deserializing tensors. The record system in Burn gives you the possibility to serialize any type, which is very useful for optimizers that save their state, but also for any non-standard, cutting-edge modeling needs you may have. Additionally, the record system performs automatic precision conversion by using Rust types, making it more reliable with fewer manual manipulations.

It is important to note that the safetensors format uses the word safe to distinguish itself from Pickle, which is vulnerable to Python code injection. On our end, the simple fact that we use Rust already ensures that no code injection is possible. If your storage mechanism doesn't handle data corruption, you might prefer a recorder that performs checksum validation (i.e., any recorder with Gzip compression).

Recorder

Recorders are independent of the backend and serialize records with precision and a format. Note that the format can also be in-memory, allowing you to save the records directly into bytes.

RecorderFormatCompression
DefaultFileRecorderFile - Named Message ParkNone
NamedMpkFileRecorderFile - Named Message ParkNone
NamedMpkGzFileRecorderFile - Named Message ParkGzip
BinFileRecorderFile - BinaryNone
BinGzFileRecorderFile - BinaryGzip
JsonGzFileRecorderFile - JsonGzip
PrettyJsonFileRecorderFile - Pretty JsonGzip
BinBytesRecorderIn Memory - BinaryNone

Each recorder supports precision settings decoupled from the precision used for training or inference. These settings allow you to define the floating-point and integer types that will be used for serialization and deserialization.

SettingFloat PrecisionInteger Precision
DoublePrecisionSettingsf64i64
FullPrecisionSettingsf32i32
HalfPrecisionSettingsf16i16

Note that when loading a record into a module, the type conversion is automatically handled, so you can't encounter errors. The only crucial aspect is using the same recorder for both serialization and deserialization; otherwise, you will encounter loading errors.

Which recorder should you use?

  • If you want fast serialization and deserialization, choose a recorder without compression. The one with the lowest file size without compression is the binary format; otherwise, the named message park could be used.
  • If you want to save models for storage, you can use compression, but avoid using the binary format, as it may not be backward compatible.
  • If you want to debug your model's weights, you can use the pretty JSON format.
  • If you want to deploy with no-std, use the in-memory binary format and include the bytes with the compiled code.

For examples on saving and loading records, take a look at Saving and Loading Models.