trl Easy Playground

Transformer Reinforcement Learning: RLHF and PPO for LLMs

Getting started with trl
Install
pip install trl
Python Code
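A minimal sketch for confirming that the install step worked, using only the Python standard library so it runs whether or not trl is present. The helper name describe_install is hypothetical (it is not part of trl), and the sketch deliberately avoids calling any trl API; it only reports whether the package can be found and, if so, which version is installed.

```python
from importlib import metadata, util


def describe_install(package: str = "trl") -> str:
    """Return a one-line status string for an installed (or missing) package.

    Hypothetical helper for illustration; it is not part of trl.
    """
    # find_spec returns None when a top-level module cannot be located.
    if util.find_spec(package) is None:
        return f"{package} is not installed; run: pip install {package}"
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        # Importable but without distribution metadata (e.g. a local source tree).
        return f"{package} is importable, version unknown"
    return f"{package} {version} is installed"


if __name__ == "__main__":
    print(describe_install())
```

If the message says trl is not installed, run the pip command from the Install step in the same environment and try again.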
Expected Output
# Expected output shown below
# (Run locally with: trl)

trl is a third-party package for Transformer Reinforcement Learning (RLHF and PPO for LLMs), so it must be installed before the code can run. Install it with: pip install trl

Challenge

Try modifying the code above to explore different behaviors. Can you extend the example to handle a new use case?