Learn how to use additional histories for complex agent training scenarios
additional_histories
feature addresses this need.
Each trajectory can contain:
messages_and_choices
sequence (the main conversation)additional_histories
, where each history contains its own messages_and_choices
and optional tools
<think>
) from previous turns in multi-turn conversations. This can interfere with training when you want the model to learn from its thinking process across all turns.
By splitting each turn into a separate history, you can preserve these tokens for training:
messages_and_choices
) is tokenized firstHistory
class structure:
Trajectory
class with additional histories: