VitisAI EP Model Support


The VitisAI EP (Execution Provider) within Windows ML supports input models in the following formats.

Model Support Table

| Model Type | Support |
| --- | --- |
| CNN Models | Original float (FP32) model with automatic BF16 conversion during compilation; quantized QDQ model using A8W8 configuration |
| Transformer Models | Original float (FP32) model with automatic BF16 conversion during compilation; quantized QDQ model using A16W8 configuration |
| LLM Models (via Foundry Local) | Quantized and pre-compiled LLM models; support for custom models through Olive recipe |
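
To make the "automatic BF16 conversion" in the table more concrete, the sketch below shows what converting a single FP32 value to BF16 does numerically: BF16 keeps the FP32 sign bit and 8-bit exponent but only the top 7 mantissa bits. This illustrates the number format only. How the VitisAI compiler actually performs the conversion is not described here, and the helper names (`fp32_to_bf16_bits`, `bf16_bits_to_fp32`, `round_trip`) and the round-to-nearest-even choice are assumptions made for illustration.

```python
import struct


def fp32_to_bf16_bits(value: float) -> int:
    """Return the 16-bit BF16 pattern for an FP32 value (illustrative helper)."""
    # Reinterpret the float as its 32-bit IEEE 754 pattern.
    bits = struct.unpack("<I", struct.pack("<f", value))[0]
    # NaN: keep it a NaN by forcing a mantissa bit in the upper half.
    if (bits & 0x7F800000) == 0x7F800000 and (bits & 0x007FFFFF):
        return ((bits >> 16) | 0x0040) & 0xFFFF
    # Round to nearest even (an assumption), then drop the low 16 bits.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + rounding_bias) >> 16) & 0xFFFF


def bf16_bits_to_fp32(bits16: int) -> float:
    """Expand a BF16 pattern back to a Python float by zero-filling the low bits."""
    return struct.unpack("<f", struct.pack("<I", (bits16 & 0xFFFF) << 16))[0]


def round_trip(value: float) -> float:
    """Value as it would be represented after FP32 -> BF16 conversion."""
    return bf16_bits_to_fp32(fp32_to_bf16_bits(value))


if __name__ == "__main__":
    for x in (1.0, 0.1, 3.14159):
        print(x, hex(fp32_to_bf16_bits(x)), round_trip(x))
```

Because BF16 keeps the full FP32 exponent, the dynamic range is preserved and only precision is reduced, which is why a float model can typically be converted without a calibration step.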

Note

  • For CNN and Transformer models, you can use either the original float (FP32) model, which is converted to BF16 automatically during compilation, or a quantized QDQ model (A8W8 for CNN models, A16W8 for Transformer models). Quantization can reduce model size and improve inference performance.

  • For LLMs, Foundry Local provides pre-built models that auto-detect the NPU. Deploying a custom LLM may require preparing the model with an Olive recipe or the Ryzen AI OGA workflow. See Windows ML LLM examples for details.

  • For model conversion and quantization options, see Model Conversion and Quantization (AI Toolkit).
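
The A8W8 and A16W8 labels above refer to the bit widths of activations (A) and weights (W). As a rough illustration of what a QDQ (quantize/dequantize) pair does to a tensor, the sketch below applies the standard linear quantization formulas, q = clamp(round(x / scale) + zero_point) and x' = (q - zero_point) * scale, at 8 and 16 bits. It is a minimal sketch only: the symmetric signed range, the max-based scale, and the function names are assumptions, not the scheme that the quantization tooling necessarily uses.

```python
def quantize(x, scale, zero_point, qmin, qmax):
    """Linear quantization: round, shift by zero point, saturate to [qmin, qmax]."""
    q = round(x / scale) + zero_point  # Python's round() rounds half to even
    return max(qmin, min(qmax, q))


def dequantize(q, scale, zero_point):
    """Map an integer code back to a real value."""
    return (q - zero_point) * scale


def qdq(values, bits):
    """Quantize then dequantize a list of floats.

    Assumes symmetric signed quantization with one scale for the whole list.
    """
    qmax = 2 ** (bits - 1) - 1
    qmin = -(2 ** (bits - 1))
    max_abs = max(abs(v) for v in values) or 1.0  # avoid a zero scale
    scale = max_abs / qmax
    return [dequantize(quantize(v, scale, 0, qmin, qmax), scale, 0) for v in values]


def max_error(values, bits):
    """Largest absolute difference introduced by the QDQ round trip."""
    return max(abs(a - b) for a, b in zip(values, qdq(values, bits)))


if __name__ == "__main__":
    activations = [0.013, -0.72, 0.5, 1.0]
    print("8-bit max error: ", max_error(activations, 8))
    print("16-bit max error:", max_error(activations, 16))
```

The 16-bit round trip introduces a much smaller error than the 8-bit one, which is one plausible reason for keeping activations at 16 bits in the A16W8 configuration used for Transformer models while weights remain at 8 bits.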