Module quantization

Tensor quantization module.

Structs§

BlockSize
A copyable block size; a specialized version of SmallVec.
CalibrationRange
The observed input calibration range.
QParamTensor
A quantization parameter tensor descriptor.
QParams
The quantization tensor data parameters.
QuantScheme
Describes a quantization scheme/configuration.
QuantizationParametersPrimitive
The quantization parameters primitive.
QuantizedBytes
Quantized data bytes representation.
SymmetricQuantization
Symmetric quantization scheme.
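
As an illustration of what a symmetric scheme does, here is a minimal standalone sketch. It is not this module's implementation: the function names `symmetric_scale`, `quantize`, and `dequantize` are hypothetical, and it assumes per-tensor 8-bit quantization with a zero-point fixed at 0 and a range of [-127, 127].

```rust
/// Hypothetical helper (not part of this module): computes a per-tensor
/// scale so that the largest absolute value maps to 127.
fn symmetric_scale(values: &[f32]) -> f32 {
    let max_abs = values.iter().fold(0.0_f32, |m, v| m.max(v.abs()));
    // Avoid a zero scale for an all-zero input.
    if max_abs == 0.0 { 1.0 } else { max_abs / 127.0 }
}

/// Hypothetical helper: maps floats to signed 8-bit integers.
/// Symmetric quantization uses no zero-point offset, so 0.0 maps to 0.
fn quantize(values: &[f32], scale: f32) -> Vec<i8> {
    values
        .iter()
        .map(|v| (v / scale).round().clamp(-127.0, 127.0) as i8)
        .collect()
}

/// Hypothetical helper: maps quantized integers back to approximate floats.
fn dequantize(quantized: &[i8], scale: f32) -> Vec<f32> {
    quantized.iter().map(|&q| q as f32 * scale).collect()
}
```

Under these assumptions, dequantizing a quantized value recovers the original to within half a quantization step (`scale / 2`).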

Enums§

Calibration
Calibration method used to compute the quantization range mapping.
QuantAcc
The precision used when accumulating elements.
QuantLevel
Level or granularity of quantization.
QuantMode
Strategy used to quantize values.
QuantParam
Quantization floating-point precision.
QuantPropagation
Specifies whether the output of an operation is quantized using the scheme of the input or returned unquantized.
QuantStore
Data type used to store quantized values.
QuantValue
Data type used to represent quantized values.
QuantizationStrategy
Quantization strategy.

Traits§

QTensorPrimitive
Quantized tensor primitive.
Quantization
Quantization scheme to convert elements of a higher precision data type E to a lower precision data type Q and vice-versa.

Functions§

compute_q_params
Compute the quantization parameters.
compute_range
Compute the quantization range mapping.
pack_i8s_to_u32s
Pack signed 8-bit integer values into a sequence of unsigned 32-bit integers.
params_shape
Calculate the shape of the quantization parameters for a given tensor and level.
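
To make the packing idea concrete, here is a minimal standalone sketch of storing four signed 8-bit values in each 32-bit word. It does not reproduce `pack_i8s_to_u32s`: the names `pack_i8s` and `unpack_u32s` are hypothetical, and the byte order (first value in the lowest byte) and the zero padding of a trailing partial group are assumptions that may differ from the library's actual layout.

```rust
/// Hypothetical sketch (not the library function): packs groups of four
/// i8 values into one u32 each, with the first value in the lowest byte.
/// A trailing group shorter than four values is padded with zeros.
fn pack_i8s(values: &[i8]) -> Vec<u32> {
    values
        .chunks(4)
        .map(|chunk| {
            let mut bytes = [0u8; 4];
            for (slot, &v) in bytes.iter_mut().zip(chunk) {
                // Reinterpret the two's-complement bit pattern as a u8.
                *slot = v as u8;
            }
            u32::from_le_bytes(bytes)
        })
        .collect()
}

/// Hypothetical inverse: recovers the first `len` i8 values from the words.
fn unpack_u32s(words: &[u32], len: usize) -> Vec<i8> {
    words
        .iter()
        .flat_map(|w| w.to_le_bytes())
        .take(len)
        .map(|b| b as i8)
        .collect()
}
```

Because padding bytes are indistinguishable from real zeros, the original element count has to be tracked separately, which is why `unpack_u32s` takes `len`.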

Type Aliases§

QuantizationParameters
The tensor quantization parameters.