Model Table

| No | Model Name                    | Hybrid | NPU-only |
|----|-------------------------------|--------|----------|
| 1  | Llama-2-7b-chat-hf            |        |          |
| 2  | Llama-2-7b-hf                 |        |          |
| 3  | Meta-Llama-3-8B               |        |          |
| 4  | Llama-3.1-8B                  |        |          |
| 5  | Meta-Llama-3.1-8B-Instruct    |        |          |
| 6  | Llama-3.2-1B                  |        |          |
| 7  | Llama-3.2-1B-Instruct         |        |          |
| 8  | Llama-3.2-3B                  |        |          |
| 9  | Llama-3.2-3B-Instruct         |        |          |
| 10 | CodeLlama-7b-Instruct-hf      |        |          |
| 11 | DeepSeek-R1-Distill-Llama-8B  |        |          |
| 12 | DeepSeek-R1-Distill-Qwen-1.5B |        |          |
| 13 | Qwen-2.5-1.5B-Instruct        |        |          |
| 14 | DeepSeek-R1-Distill-Qwen-7B   |        |          |
| 15 | Phi-3-mini-4k-instruct        |        |          |
| 16 | Phi-3-mini-128k-instruct      |        |          |
| 17 | Phi-3.5-mini-instruct         |        |          |
| 18 | Phi-4-mini-instruct           |        |          |
| 19 | Phi-4-mini-reasoning          |        |          |
| 20 | gemma-2-2b                    |        |          |
| 21 | Mistral-7B-Instruct-v0.1      |        |          |
| 22 | Mistral-7B-Instruct-v0.2      |        |          |
| 23 | Mistral-7B-Instruct-v0.3      |        |          |
| 24 | Mistral-7B-v0.3               |        |          |
| 25 | AMD-OLMo-1B-SFT-DPO           |        |          |
| 26 | chatglm3-6b                   |        |          |
| 27 | Qwen1.5-7B-Chat               |        |          |
| 28 | Qwen2-1.5B                    |        |          |
| 29 | Qwen2-7B                      |        |          |
| 30 | Qwen2.5-0.5B-Instruct         |        |          |
| 31 | Qwen2.5-7B-Instruct           |        |          |
| 32 | Qwen2.5-Coder-0.5B-Instruct   |        |          |
| 33 | Qwen2.5-Coder-1.5B-Instruct   |        |          |
| 34 | Qwen2.5-Coder-7B-Instruct     |        |          |
| 35 | Qwen2.5-3B-Instruct           |        |          |
| 36 | Qwen3-1.7B                    |        |          |
| 37 | Qwen3-4B                      |        |          |
| 38 | Qwen3-8B                      |        |          |

Notes

  1. All models are supported up to a 4K context length, with the following exceptions (a sketch for enforcing these limits follows this list):

  • AMD-OLMo-1B-SFT-DPO: inherently supports only a 2K context length

  • gemma-2-2b: supports up to a 3K context length
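
The context-length limits above translate directly into a prompt-budget check. The following Python sketch is illustrative only: the `MAX_CONTEXT_OVERRIDES` table and the `check_prompt_fits` helper are hypothetical names, not part of any shipped API, and it assumes 1K = 1024 tokens (so 4K = 4096, 3K = 3072, 2K = 2048).

```python
# Hedged sketch: enforce the per-model context-length limits from the notes.
# MAX_CONTEXT_OVERRIDES and check_prompt_fits are illustrative names, not a
# shipped API. Token counts assume 1K = 1024 tokens.

DEFAULT_MAX_CONTEXT = 4096  # 4K limit for all listed models unless overridden

# Exceptions listed in the notes above.
MAX_CONTEXT_OVERRIDES = {
    "AMD-OLMo-1B-SFT-DPO": 2048,  # 2K
    "gemma-2-2b": 3072,           # 3K
}


def max_context(model_name: str) -> int:
    """Return the maximum supported context length, in tokens, for a model."""
    return MAX_CONTEXT_OVERRIDES.get(model_name, DEFAULT_MAX_CONTEXT)


def check_prompt_fits(model_name: str, prompt_tokens: int, max_new_tokens: int) -> None:
    """Raise ValueError if the prompt plus generation budget exceeds the window."""
    limit = max_context(model_name)
    if prompt_tokens + max_new_tokens > limit:
        raise ValueError(
            f"{model_name}: {prompt_tokens} prompt + {max_new_tokens} new tokens "
            f"exceeds the {limit}-token context limit"
        )


if __name__ == "__main__":
    # Fits: 3000 + 512 <= 4096 for a default 4K model.
    check_prompt_fits("Llama-3.2-1B-Instruct", prompt_tokens=3000, max_new_tokens=512)
    try:
        # Exceeds: 3000 + 512 > 3072 for gemma-2-2b.
        check_prompt_fits("gemma-2-2b", prompt_tokens=3000, max_new_tokens=512)
    except ValueError as err:
        print(err)
```

Keeping the default in one constant and the exceptions in a small override table mirrors the structure of the notes: new models need no entry unless they deviate from the 4K default.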