ART Client
Integrate RL into existing codebases.
One of ART’s primary goals is to minimize the setup required to start benefiting from RL within an existing codebase. The ART client is a lightweight object that lets you run inference and train models against either local or remote backends. That means you can run your agent anywhere, including on a laptop without a powerful GPU, and still get all the performance benefits of training and generating tokens on an H100. Pretty cool!
If you’re curious about how ART allows you to run training and inference either remotely or locally depending on your development machine, check out the backend docs below. Otherwise, let’s dive deeper into the client!
Initializing the client
The client that you’ll use to generate tokens and train your model is initialized through the art.TrainableModel class.
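Here’s a minimal sketch of that initialization, assuming a Qwen base model and placeholder name/project values (swap in your own):

```python
import art

# Declare the trainable model. `name` and `project` identify this run's
# checkpoints and logs; `base_model` is the Hugging Face model that the
# LoRA adapters will be trained on top of.
model = art.TrainableModel(
    name="agent-001",              # placeholder run name
    project="my-agentic-task",     # placeholder project name
    base_model="Qwen/Qwen2.5-14B-Instruct",  # example base model
)
```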
Once you’ve initialized your backend, you can register it with your model. This sets up all the wiring to run inference and training.
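As a sketch, here’s what registration looks like with a local backend; the backend class you use (local, remote, etc.) and its import path depend on your setup and ART version, so check the backend docs for specifics. The snippets below also assume you’re in an async context (for example, a notebook or an asyncio entrypoint).

```python
from art.local import LocalBackend

# The backend owns the vLLM server and the training loop.
backend = LocalBackend()

# Registering wires the model up to the backend so that inference requests
# and training jobs are routed to it.
await model.register(backend)
```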
You’re now ready to start training your agent.
Running inference
Your model will generate inference tokens by making requests to a vLLM server running on whichever backend you previously registered. To route inference requests to this backend, follow the code sample below.
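Here’s a sketch of that routing, assuming the model exposes an OpenAI-compatible async client via model.openai_client() (check the client API reference if your version differs); the prompt is a placeholder:

```python
# Requests made through this client are sent to the vLLM server running on
# the registered backend.
openai_client = model.openai_client()

chat_completion = await openai_client.chat.completions.create(
    model=model.name,  # routes to the latest checkpoint of your registered model
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello there!"},  # placeholder prompt
    ],
    max_tokens=200,
)
print(chat_completion.choices[0].message.content)
```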
As your model becomes more capable at the task, its weights will update, and each new LoRA checkpoint will be automatically loaded onto the vLLM server running on your backend. The registration and inference flow shown above ensures that your requests are always routed to the latest version of the model, which saves you a lot of complexity!
Training the model
Before training your model, you need to provide a few scenarios for your agent to learn from. As it works through these scenarios, its weights will update to avoid past mistakes and reproduce successes. It’s best to provide at least 10 scenarios that adequately represent the real tasks your agent will handle once it’s deployed.
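The shape of a scenario is entirely up to you. As an illustration only, the sketch below uses a hypothetical Scenario dataclass holding a question and a reference answer:

```python
from dataclasses import dataclass

@dataclass
class Scenario:
    # Hypothetical fields for illustration; use whatever inputs your agent needs.
    question: str
    expected_answer: str

training_scenarios = [
    Scenario(
        question="When is the quarterly review scheduled?",
        expected_answer="Thursday at 3pm",
    ),
    # ...provide at least ~10 scenarios that are representative of production
]
```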
Next, define a rollout function that runs the agent through an individual scenario and returns a reward-scored Trajectory.
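Below is a minimal sketch of such a rollout, built on the hypothetical Scenario above with a toy substring-match reward; real rollouts typically involve tool calls and a richer reward function. It assumes ART’s Trajectory container with a messages_and_choices list and a messages() accessor (check the reference for exact names in your version).

```python
import art

async def rollout(model: art.TrainableModel, scenario: Scenario) -> art.Trajectory:
    # A Trajectory records the messages (and model choices) produced during
    # this attempt, plus a scalar reward scoring how well the agent did.
    trajectory = art.Trajectory(
        messages_and_choices=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": scenario.question},
        ],
        reward=0.0,
    )

    openai_client = model.openai_client()
    chat_completion = await openai_client.chat.completions.create(
        model=model.name,
        messages=trajectory.messages(),
        max_tokens=200,
    )
    choice = chat_completion.choices[0]
    trajectory.messages_and_choices.append(choice)

    # Toy reward: 1.0 if the reference answer appears in the completion.
    answer = choice.message.content or ""
    trajectory.reward = 1.0 if scenario.expected_answer.lower() in answer.lower() else 0.0
    return trajectory
```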
Now that your training scenarios and rollout function are declared, training the model is straightforward. The following code trains the model for 50 steps, giving the agent 8 attempts at each training scenario during each step. Because a reward is assigned to each Trajectory that the rollout function returns, the agent learns to produce completions similar to those that earned high rewards in the past, and to shy away from behavior that earned low rewards.
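A sketch of that training loop, under the same assumptions as above and using helpers like art.gather_trajectory_groups, art.TrajectoryGroup, and art.TrainConfig (check the reference for exact names in your version); the learning rate is a placeholder, not a recommendation:

```python
# Resume from the model's latest saved step, or start from 0 on a fresh run.
for step in range(await model.get_step(), 50):
    # Each TrajectoryGroup holds 8 attempts at the same scenario, so the
    # trainer can compare rewards across attempts within each group.
    train_groups = await art.gather_trajectory_groups(
        art.TrajectoryGroup(rollout(model, scenario) for _ in range(8))
        for scenario in training_scenarios
    )
    await model.train(
        train_groups,
        config=art.TrainConfig(learning_rate=1e-5),  # placeholder hyperparameter
    )
```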
To see the ART client and backend working together, check out our Summarizer tutorial or one of the notebooks! If you have questions about integrating the ART client into your own codebase, please ask in the Discord!