Serverless Models
We currently only support the following models for serverless training. We are actively adding support for both larger and smaller models. If there’s a particular model you’d like to see serverless support for, please send a request to support@wandb.com.

- OpenPipe Qwen 3 14B Instruct
  - Good balance of performance and size. Has support for tool calling and generally trains well. This is our recommended model for users new to RL.
- Qwen 3 30B A3B Instruct
  - More capable than 14B while still being efficient. A good choice when you need stronger reasoning capabilities.
Recommended Local Models
If you’re developing locally or on your own hardware, here are a couple of other models you could try in addition to the recommended serverless list.

- Qwen2.5 7B Instruct
  - Less capable than 14B, but smaller and faster.
- Qwen2.5 32B Instruct
  - More capable than 14B, but larger and slower.
More Models
ART has wide support for models supported by vLLM. However, not all models support all features. For instance, if a model’s chat template does not include tool call support, you won’t be able to use tools with it natively. And if a model’s architecture doesn’t support LoRA layers, it won’t be compatible with our LoRA-based backend, but it may still work with our full fine-tuning backend. Here are additional models that we’ve tested and found to work well with ART:

- Llama 3.1 8B Instruct
- Llama 3.2 1B Instruct
- Llama 3.2 3B Instruct
- Llama 3.3 70B Instruct
- Qwen2.5 72B Instruct
- Additionally, the Qwen 3 family of models is well supported for single-turn workflows. For multi-turn workflows, the Qwen 3 chat template removes the `<think>` tokens from previous turns, which makes training more complicated. It is still possible to use Qwen 3 for multi-turn workflows by splitting each turn into a separate message history with our `additional_histories` trajectory parameter (see Additional Histories).
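If you’re unsure whether a model’s chat template supports tool calls, you can sanity-check it yourself before training. The sketch below is a naive substring heuristic (not ART functionality) applied to a Jinja chat template string; with Hugging Face `transformers`, the real template is available on the tokenizer’s `chat_template` attribute.

```python
# Illustrative heuristic: does a Jinja chat template reference tool-related
# variables? Templates that never render `tools` or `tool_calls` generally
# cannot format tool-use conversations natively.

def template_supports_tools(chat_template: str) -> bool:
    """Return True if the chat template appears to reference tool support."""
    markers = ("tools", "tool_calls")
    return any(marker in chat_template for marker in markers)

# Hypothetical minimal templates for demonstration:
with_tools = "{% if tools %}{{ tools | tojson }}{% endif %}{{ messages }}"
without_tools = "{% for m in messages %}{{ m['content'] }}{% endfor %}"

print(template_supports_tools(with_tools))     # True
print(template_supports_tools(without_tools))  # False
```

A real check should inspect the tokenizer’s actual template (e.g. `AutoTokenizer.from_pretrained(...).chat_template`); this substring scan is only a quick first pass and can produce false positives.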
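The per-turn splitting described above can be sketched in plain Python. This is an illustrative example, not ART’s implementation: the message dicts and the shape of the result are assumptions, and in practice the split histories would be passed via the `additional_histories` trajectory parameter.

```python
# Hedged sketch: split a multi-turn conversation into one message history per
# assistant turn. Each history ends at the assistant turn being trained on, so
# that turn's <think> content is preserved even if the chat template would
# strip <think> tokens from earlier turns.

def split_into_histories(messages: list[dict]) -> list[list[dict]]:
    """For each assistant message, emit the history up to and including it."""
    histories = []
    for i, message in enumerate(messages):
        if message["role"] == "assistant":
            histories.append(messages[: i + 1])
    return histories

# Hypothetical two-turn conversation:
conversation = [
    {"role": "user", "content": "What is 2 + 2?"},
    {"role": "assistant", "content": "<think>simple sum</think>4"},
    {"role": "user", "content": "And times 3?"},
    {"role": "assistant", "content": "<think>4 * 3 = 12</think>12"},
]

histories = split_into_histories(conversation)
# Two histories, one per assistant turn, each keeping its own <think> content.
```

Each history is then a self-contained single-turn training example from the template’s point of view, which sidesteps the `<think>`-stripping behavior entirely.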