Algorithms
Note
From Ray 2.6.0 onwards, RLlib is adopting a new stack for training and model customization, gradually replacing the ModelV2 API and some convoluted parts of the Policy API with the RLModule API. See the RLModule API documentation for details.
The Algorithm class is the highest-level API in RLlib, responsible for the WHEN and WHAT of RL algorithms: WHEN to collect samples, WHEN to perform a neural network update, and so on. The HOW is delegated to components such as RolloutWorker. Algorithm is the main entry point for RLlib users to interact with RLlib's algorithms.
It allows you to train and evaluate policies, save an experiment's progress, and restore from
a prior saved experiment when continuing an RL run.
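For instance, a minimal sketch of this lifecycle (assuming the Gymnasium CartPole-v1 environment is available; note that the exact return type of save() varies across Ray versions):

```python
from ray.rllib.algorithms.ppo import PPOConfig

# Build an Algorithm instance from an algorithm-specific config class.
algo = PPOConfig().environment("CartPole-v1").build()

results = algo.train()          # run one training iteration
eval_results = algo.evaluate()  # evaluate the current policies
checkpoint = algo.save()        # save the experiment's progress
algo.restore(checkpoint)        # continue from the saved state
```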
Algorithm is a sub-class
of Trainable
and thus fully supports distributed hyperparameter tuning for RL.
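For example, a Tune sweep over an RLlib algorithm might look like the following sketch (the algorithm is referenced by its registered name; the searched learning rates are illustrative only):

```python
from ray import tune

# Because Algorithm is a Trainable, Tune can run and tune it directly.
tuner = tune.Tuner(
    "PPO",  # registered name of an RLlib algorithm
    param_space={
        "env": "CartPole-v1",
        "lr": tune.grid_search([5e-5, 1e-4]),  # sweep the learning rate
    },
)
results = tuner.fit()
```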
A typical RLlib Algorithm object: Algorithms normally consist of N RolloutWorkers, orchestrated via a WorkerSet object. Each worker owns its own set of Policy objects (and their neural-network models), plus a BaseEnv instance.
Algorithm Configuration API
The AlgorithmConfig class is the primary way of configuring and building an Algorithm.
In practice, you don't use AlgorithmConfig directly, but rather one of its algorithm-specific
subclasses, such as PPOConfig, each of which comes
with its own set of arguments to its respective .training() method.
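For example, a minimal sketch of this fluent config pattern (assuming the Gymnasium CartPole-v1 environment; the chosen hyperparameter values are illustrative only):

```python
from ray.rllib.algorithms.ppo import PPOConfig

# Configure PPO through its algorithm-specific config class, then build
# the ready-to-train Algorithm instance from it.
config = (
    PPOConfig()
    .environment("CartPole-v1")
    .training(
        lr=0.0003,       # generic AlgorithmConfig.training() argument
        clip_param=0.2,  # PPO-specific .training() argument
    )
)
algo = config.build()
```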
Constructor
Public methods
Configuration methods
Getter methods
Miscellaneous methods
Building Custom Algorithm Classes
Warning
As of Ray >= 1.9, it is no longer recommended to use the build_trainer() utility
function for creating custom Algorithm sub-classes.
Instead, follow the guidelines below for sub-classing
Algorithm directly.
To create a custom Algorithm, sub-class the
Algorithm class
and override one or more of its methods, in particular:
- setup()
- get_default_config()
- get_default_policy_class()
- training_step()
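A minimal sketch of such a sub-class (the class name MyAlgorithm is hypothetical, and the method bodies are placeholders):

```python
from ray.rllib.algorithms.algorithm import Algorithm
from ray.rllib.algorithms.algorithm_config import AlgorithmConfig


class MyAlgorithm(Algorithm):  # hypothetical custom algorithm
    @classmethod
    def get_default_config(cls) -> AlgorithmConfig:
        # Return this algorithm's default configuration.
        return AlgorithmConfig()

    def training_step(self) -> dict:
        # Called repeatedly from within Algorithm.train(): implement one
        # iteration of sample collection and learning here, then return
        # a dict of metrics for this iteration (stubbed out in this sketch).
        return {}
```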