Vitis AI Quantizer for Olive


Ensure that Olive is correctly installed. For more information, see Olive installation instructions.

Describing the Model

Olive requires information about your model, such as loading instructions, the names and shapes of input tensors, target hardware selection, and a list of optimizations you want to perform on the model. You can provide this information in a JSON file as input to Olive. For more details on using Olive and creating the Olive configuration file, refer to the Microsoft Olive Documentation.

Configuring the Quantization Pass

The JSON configuration file must include a passes key, which is a dictionary describing the passes that the engine executes. Each key in the dictionary is the name of a pass, and the passes run in the order in which they are defined.

To quantize the model for Ryzen AI, use the VitisAIQuantization pass. The following example uses two passes: the first converts the model to ONNX, and the second quantizes it with Vitis AI.

"passes": {
    "onnx_conversion": {
        "type": "OnnxConversion",
        "config": {
            "target_opset": 13
        },
        "host": "local_system",
        "evaluator": "common_evaluator"
    },
    "vitis_ai_quantization": {
        "type": "VitisAIQuantization",
        "disable_search": true,
        "config": {
            "user_script": "",
            "data_dir": "data",
            "dataloader_func": "resnet_calibration_reader"
        },
        "clean_run_cache": false
    }
}

Note: The target_opset value of the onnx_conversion pass must be greater than 10.
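
The ordering rule and the opset note can be checked before the file is handed to Olive. The following is a minimal sketch that uses only the Python standard library. The helper name check_passes is hypothetical and is not part of Olive, and the inline EXAMPLE is a trimmed stand-in for a real configuration file.

```python
import json


def check_passes(config):
    """Return the pass names in execution order, or raise ValueError."""
    passes = config.get("passes")
    if not isinstance(passes, dict) or not passes:
        raise ValueError('configuration needs a non-empty "passes" dictionary')

    for name, spec in passes.items():
        if spec.get("type") == "OnnxConversion":
            opset = spec.get("config", {}).get("target_opset")
            # Rule from the note above: target_opset must be above 10.
            if opset is not None and opset <= 10:
                raise ValueError(f"{name}: target_opset {opset} must be above 10")

    pass_types = [spec.get("type") for spec in passes.values()]
    if "VitisAIQuantization" not in pass_types:
        raise ValueError("no VitisAIQuantization pass found")

    # json.loads keeps keys in file order, so this is the order of definition.
    return list(passes)


# Trimmed stand-in for the contents of a configuration file (assumption).
EXAMPLE = json.loads("""
{
    "passes": {
        "onnx_conversion": {
            "type": "OnnxConversion",
            "config": {"target_opset": 13}
        },
        "vitis_ai_quantization": {
            "type": "VitisAIQuantization",
            "config": {"data_dir": "data"}
        }
    }
}
""")

print(check_passes(EXAMPLE))  # ['onnx_conversion', 'vitis_ai_quantization']
```

For a real file, the same helper could be applied to the result of json.load on the opened configuration file.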

For a complete description of the VitisAIQuantization pass, refer to the VitisAIQuantization pass reference guide.

Checking the Configuration

Before running quantization, you can optionally run Olive in setup mode. Setup mode identifies any additional packages that must be installed to support the passes listed in the JSON configuration file.

python -m olive.workflows.run --config resnet_static_config.json --setup

Quantizing the Model

To quantize the model, run Olive with the JSON configuration file as follows:

python -m olive.workflows.run --config resnet_static_config.json
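
When the setup check and the quantization run are scripted together, the two invocations can be driven from Python. The sketch below is one possible wrapper, not part of Olive. It assumes the Olive workflow runner is the module olive.workflows.run; the helper names olive_command and run_olive are hypothetical.

```python
import subprocess
import sys

# Assumption: Olive's workflow runner is exposed as this module.
OLIVE_MODULE = "olive.workflows.run"


def olive_command(config_path, setup=False):
    """Build the argument list for one Olive invocation."""
    cmd = [sys.executable, "-m", OLIVE_MODULE, "--config", config_path]
    if setup:
        # Setup mode only reports or installs the packages the passes need.
        cmd.append("--setup")
    return cmd


def run_olive(config_path):
    """Run setup mode first, then the workflow; stop at the first failure."""
    for setup in (True, False):
        # check=True raises CalledProcessError on a non-zero exit code.
        subprocess.run(olive_command(config_path, setup=setup), check=True)


# Show the two commands without executing them.
for setup in (True, False):
    print(" ".join(olive_command("resnet_static_config.json", setup)[1:]))
```

Calling run_olive("resnet_static_config.json") would then execute both steps in sequence with the current Python interpreter.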

Here is the typical output:

[2023-05-29 01:03:07,086] [WARNING] [] No accelerators specified for target system. Using CPU.
[2023-05-29 01:03:07,098] [DEBUG] [] Resolving goals: {'accuracy': {<AccuracySubType.ACCURACY_SCORE: 'accuracy_score'>: MetricGoal(type='max-degradation', value=0.01)}, 'latency': {'avg': MetricGoal(type='percent-min-improvement', value=20.0)}}
[2023-05-29 01:03:07,101] [DEBUG] [] Computing baseline for metrics ...
[2023-05-29 01:03:07,101] [DEBUG] [] Evaluating model ...
[2023-05-29 01:03:11,740] [DEBUG] [] There is no goal set for metric: {metric_name}.
[2023-05-29 01:03:11,740] [DEBUG] [] There is no goal set for metric: {metric_name}.
[2023-05-29 01:03:11,741] [DEBUG] [] Baseline: {'accuracy-accuracy_score': 0.8729838728904724, 'latency-avg': 31.98742}
[2023-05-29 01:03:11,741] [DEBUG] [] Resolved goals: {'accuracy-accuracy_score': 0.8629838728904724, 'latency-avg': 25.589936}
[2023-05-29 01:03:11,743] [DEBUG] [] Step 1 with search point {'OnnxConversion': {}, 'VitisAIQuantization': {}} ...
[2023-05-29 01:03:11,743] [DEBUG] [] Running pass OnnxConversion
============== Diagnostic Run torch.onnx.export version 2.0.1+cpu ==============
verbose: False, log level: Level.ERROR
======================= 0 NONE 0 NOTE 0 WARNING 0 ERROR ========================

[2023-05-29 01:03:12,689] [DEBUG] [] Running pass VitisAIQuantization
[2023-05-29 01:03:12,691] [INFO] [] Preprocessing model for quantization
Finding optimal threshold for each tensor using PowerOfTwoMethod.MinMSE algorithm ...
[2023-05-29 01:03:53,389] [DEBUG] [] Evaluating model ...
[2023-05-29 01:03:58,156] [DEBUG] [] Signal: {'accuracy-accuracy_score': 0.8145161271095276, 'latency-avg': 28.5457}
[2023-05-29 01:03:58,157] [WARNING] [] No models in this search group ['OnnxConversion', 'VitisAIQuantization'] met the goals. Sorting the models without applying goals...
[2023-05-29 01:03:58,159] [INFO] [] pareto frontier points: 1_VitisAIQuantization-0-5eced571581e0d511ed3467faeee47b8-cpu-cpu {'accuracy-accuracy_score': 0.8145161271095276, 'latency-avg': 28.5457}
[2023-05-29 01:03:58,159] [INFO] [] Output all 1 models
[2023-05-29 01:03:58,161] [INFO] [] No packaging config provided, skip packaging artifacts

When quantization finishes, Olive saves the quantized model as an ONNX file named [model].onnx.
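
The "Resolved goals" values in the log can be reproduced from the baseline, which also explains the warning that no model met the goals. The sketch below is a worked check of that arithmetic. The meaning of each goal type is inferred from the numbers in the log rather than taken from the Olive source, and resolve_goal is a hypothetical helper.

```python
import math


def resolve_goal(goal_type, value, baseline):
    """Turn a relative goal into an absolute threshold (inferred semantics)."""
    if goal_type == "max-degradation":
        # A higher-is-better metric may drop by at most `value`.
        return baseline - value
    if goal_type == "percent-min-improvement":
        # A lower-is-better metric must improve by at least `value` percent.
        return baseline * (1 - value / 100)
    raise ValueError(f"unhandled goal type: {goal_type}")


# Values copied from the "Baseline" and "Signal" log lines.
baseline = {"accuracy": 0.8729838728904724, "latency": 31.98742}
signal = {"accuracy": 0.8145161271095276, "latency": 28.5457}

accuracy_goal = resolve_goal("max-degradation", 0.01, baseline["accuracy"])
latency_goal = resolve_goal("percent-min-improvement", 20.0, baseline["latency"])

# Both thresholds agree with the "Resolved goals" log line.
assert math.isclose(accuracy_goal, 0.8629838728904724)
assert math.isclose(latency_goal, 25.589936)

accuracy_ok = signal["accuracy"] >= accuracy_goal  # accuracy fell too far
latency_ok = signal["latency"] <= latency_goal     # latency improved too little
print(accuracy_ok, latency_ok)  # False False
```

In this run the quantized model misses both thresholds, so Olive reports the single candidate without applying the goals.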