High-Level Python SDK#
A Python environment offers flexibility for experimenting with LLMs, profiling them, and integrating them into Python applications. We use the Lemonade SDK to get up and running quickly.
To get started, follow these instructions.
System-level prerequisites#
You only need to do this once per computer:
- Make sure your system has the recommended Ryzen AI driver installed as described in Install NPU Drivers.
- Download and install Miniconda for Windows.
- Launch a terminal and run `conda init`.
Environment Setup#
To create and set up an environment, run these commands in your terminal:
```bash
conda create -n ryzenai-llm python=3.10
conda activate ryzenai-llm
pip install turnkeyml[llm-oga-hybrid]
lemonade-install --ryzenai hybrid
```
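As a quick sanity check (not part of the official steps), you can ask the `lemonade` CLI for its help text; if installation succeeded, this prints the tool's usage information rather than a "command not found" error:

```shell
# Prints the lemonade CLI help text if the package installed correctly
# and its entry point is on your PATH.
lemonade -h
```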
Validation Tools#
Now that you have completed installation, you can try prompting an LLM like this (where `PROMPT` is any prompt you like). Run this command in a terminal that has your environment activated:

```bash
lemonade -i amd/Llama-3.2-1B-Instruct-awq-g128-int4-asym-fp16-onnx-hybrid oga-load --device hybrid --dtype int4 llm-prompt --max-new-tokens 64 -p PROMPT
```
Each example linked in the Featured LLMs table also has example commands for validating the speed and accuracy of each model.
Python API#
You can also run this code to try out the high-level Lemonade API in a Python script:
```python
from lemonade.api import from_pretrained

model, tokenizer = from_pretrained(
    "amd/Llama-3.2-1B-Instruct-awq-g128-int4-asym-fp16-onnx-hybrid",
    recipe="oga-hybrid",
)

input_ids = tokenizer("This is my prompt", return_tensors="pt").input_ids
response = model.generate(input_ids, max_new_tokens=30)
print(tokenizer.decode(response[0]))
```
Each example linked in the Featured LLMs table also has an example script for streaming the text output of the LLM.
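As a sketch of what such a streaming script can look like, a `transformers.TextStreamer` can be passed to `generate()` so that tokens print to the console as they are produced instead of after the full response is ready. This example assumes the same hybrid model as above and requires Ryzen AI hardware with the environment set up earlier:

```python
from lemonade.api import from_pretrained
from transformers import TextStreamer

# Load the same hybrid model used in the snippet above.
model, tokenizer = from_pretrained(
    "amd/Llama-3.2-1B-Instruct-awq-g128-int4-asym-fp16-onnx-hybrid",
    recipe="oga-hybrid",
)

input_ids = tokenizer("This is my prompt", return_tensors="pt").input_ids

# TextStreamer writes each decoded token to stdout as soon as it is
# generated, which is useful for chat-style applications.
streamer = TextStreamer(tokenizer)
model.generate(input_ids, streamer=streamer, max_new_tokens=30)
```

The example scripts linked from the Featured LLMs table remain the authoritative reference for streaming output.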
Next Steps#
From here, you can check out an example or any of the other Featured LLMs.
The examples pages also provide code for:
- Additional validation tools for measuring speed and accuracy.
- Streaming responses with the API.
- Integrating the API into applications.
- Launching the server interface from the Python environment.