Example models implementing MobileNetV2 natively and wrapping torch.hub's mobilenet_v2.
Code example
Example of DeepMind’s Differentiable Neural Computer for partially observable environments.
Code example
Example of how to use Tune’s support for custom training functions to implement custom training workflows.
Code example
Example of how to advance the environment through different phases (tasks) over time.
Code example
How to set up a custom Logger object in RLlib.
Code example
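For instance, a minimal custom Logger might look like the following sketch (based on Tune's `Logger` interface; the `prefix` config key is a made-up assumption):

```python
from ray.tune.logger import Logger

class MyPrintLogger(Logger):
    """Logs results to stdout instead of TensorBoard/JSON/CSV."""

    def _init(self):
        # `self.config` is the trial's config dict; `prefix` is a made-up key.
        self.prefix = self.config.get("logger_config", {}).get("prefix", "")

    def on_result(self, result: dict):
        # Called once per training iteration with the full result dict.
        print(f"{self.prefix} reward={result.get('episode_reward_mean')}")

    def flush(self):
        pass  # nothing buffered
```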
Example of how to output custom training metrics to TensorBoard.
Code example
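As a minimal sketch (assuming the `DefaultCallbacks` API of recent RLlib versions), custom metrics can be emitted from an episode callback and then registered via `config.callbacks(CustomMetricCallbacks)`:

```python
from ray.rllib.algorithms.callbacks import DefaultCallbacks

class CustomMetricCallbacks(DefaultCallbacks):
    def on_episode_end(self, *, worker, base_env, policies, episode, **kwargs):
        # Values written to `episode.custom_metrics` are aggregated
        # (mean/min/max) per training iteration and appear in TensorBoard.
        episode.custom_metrics["episode_len_squared"] = episode.length ** 2
```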
How to set up a custom TFPolicy.
Code example
How to set up a custom TorchPolicy.
Code example
Example of how to use RLlib’s lower-level building blocks to implement a fully customized training workflow.
Code example
Example of how to use the execution plan of an Algorithm to train two different policies in parallel (also using the multi-agent API).
Code example
How to run a custom Ray Tune experiment with RLlib, with custom training and evaluation phases.
Code example
Example of how to write a custom evaluation function that is called instead of the default behavior, which runs n episodes using the evaluation worker set.
Code example
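The hook has roughly the following shape (a minimal sketch assuming single-agent `SampleBatch`es; the returned metric name is an assumption). Pass it via `config.evaluation(custom_evaluation_function=custom_eval_function)`:

```python
def custom_eval_function(algorithm, eval_workers):
    # Draw one batch of experience from each remote evaluation worker
    # (the exact `foreach_worker` signature varies across RLlib versions).
    batches = eval_workers.foreach_worker(lambda w: w.sample(), local_worker=False)
    batches = [b for b in batches if b is not None]
    total_steps = sum(b.count for b in batches) or 1
    mean_step_reward = sum(float(b["rewards"].sum()) for b in batches) / total_steps
    # The returned dict is reported under the "evaluation" results key.
    return {"mean_step_reward": mean_step_reward}
```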
Example showing how the evaluation workers and the “normal” rollout workers can run (to some extent) in parallel to speed up training.
Code example
Example showing how to run an offline RL training job using a historic-data JSON file.
Code example
Example of using Ray Serve to serve RLlib models with an HTTP and JSON interface.
Code example
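The pattern is roughly the following sketch (the checkpoint path and JSON schema are assumptions, not the example's actual values):

```python
from starlette.requests import Request
from ray import serve
from ray.rllib.policy.policy import Policy

@serve.deployment
class ServeRLlibPolicy:
    def __init__(self, checkpoint_path: str):
        # Restore a trained RLlib policy from a (hypothetical) checkpoint.
        self.policy = Policy.from_checkpoint(checkpoint_path)

    async def __call__(self, request: Request) -> dict:
        obs = (await request.json())["observation"]
        action, _, _ = self.policy.compute_single_action(obs)
        return {"action": int(action)}

serve.run(ServeRLlibPolicy.bind("/tmp/rllib_checkpoint"))  # hypothetical path
```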
This script offers a simple workflow for 1) training a policy with RLlib, 2) creating a new policy and restoring its weights from the trained one, and 3) serving the new policy via Ray Serve.
Code example
Example of how to set up n distributed (compiled) Unity3D games in the cloud that act as data-collecting clients against a central RLlib policy server that learns how to play the game.
Code example
Example of online serving of predictions for a simple CartPole policy.
Code example
Example of how to externally generate experience batches in RLlib-compatible format.
Code example
Example of how to find a checkpoint after a Tuner.fit() via custom-defined criteria.
Code example
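A sketch of the idea, assuming a `tune.Tuner` configured elsewhere with checkpointing enabled and the standard RLlib metric names:

```python
# `tuner` is assumed to be a configured ray.tune.Tuner instance.
results = tuner.fit()

# Built-in criterion:
best = results.get_best_result(metric="episode_reward_mean", mode="max")

# Fully custom criterion, e.g. best reward among trials with >= 10 iterations:
eligible = [r for r in results if r.metrics.get("training_iteration", 0) >= 10]
chosen = max(eligible, key=lambda r: r.metrics["episode_reward_mean"])
print(chosen.checkpoint)
```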
Set up RLlib to run any algorithm in (independent) multi-agent mode against a multi-agent environment (see the combined config sketch after the next entry).
Code example
Set up RLlib to run any algorithm in (shared-parameter) multi-agent mode against a multi-agent environment.
Code example
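A minimal sketch covering both modes (the env name and agent IDs 0/1 are assumptions; the env must be a registered MultiAgentEnv):

```python
from ray.rllib.algorithms.ppo import PPOConfig

# Independent learning: one policy per agent.
independent = (
    PPOConfig()
    .environment("my_multi_agent_env")  # hypothetical registered MultiAgentEnv
    .multi_agent(
        policies={"policy_0", "policy_1"},
        policy_mapping_fn=lambda agent_id, *a, **kw: f"policy_{agent_id}",
    )
)

# Shared parameters: every agent maps onto the same policy.
shared = (
    PPOConfig()
    .environment("my_multi_agent_env")
    .multi_agent(
        policies={"shared_policy"},
        policy_mapping_fn=lambda agent_id, *a, **kw: "shared_policy",
    )
)
```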
Example of different heuristic and learned policies competing against each other in rock-paper-scissors.
Code example
Example of the two-step game from the QMIX paper.
Code example
Example of how to use RLlib to learn in PettingZoo multi-agent environments.
Code example
Example of customizing PPO to leverage a centralized value function.
Code example
A simpler method of implementing a centralized critic by augmenting agent observations with global information.
Code example
Example of running a custom hand-coded policy alongside trainable policies.
Code example
Example of how to define weight-sharing layers between two different policies.
Code example
Example of alternating training between DQN and PPO.
Code example
Example of hierarchical training using the multi-agent API.
Code example
Example of an iterated prisoner’s dilemma environment solved by RLlib.
Code example
Example of how to set up fractional GPUs for learning (driver) and environment rollouts (remote workers).
Code example
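A minimal config sketch (the fractions chosen here are illustrative, not recommendations):

```python
from ray.rllib.algorithms.ppo import PPOConfig

config = (
    PPOConfig()
    .environment("CartPole-v1")
    .rollouts(num_rollout_workers=2)
    .resources(
        num_gpus=0.5,              # fraction of a GPU for the learner/driver
        num_gpus_per_worker=0.25,  # fraction of a GPU per rollout worker
    )
)
algo = config.build()
```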
Learning in arbitrarily nested action spaces.
Code example
Example of how to handle variable-length or parametric action spaces.
Code example
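The usual pattern is action masking: the env exposes a mask of currently valid actions alongside the raw observation, and a custom model pushes masked-out logits toward -inf. A sketch (shapes and names are assumptions, assuming gymnasium):

```python
import numpy as np
from gymnasium import spaces

# Observation dict carries a mask of currently valid actions.
observation_space = spaces.Dict({
    "action_mask": spaces.Box(0.0, 1.0, shape=(6,), dtype=np.float32),
    "observations": spaces.Box(-1.0, 1.0, shape=(4,), dtype=np.float32),
})

def mask_logits(logits: np.ndarray, action_mask: np.ndarray) -> np.ndarray:
    # log(0) -> -inf for invalid actions; clip keeps the values finite.
    return logits + np.clip(np.log(action_mask), -1e10, 1e10)
```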
How to filter raw observations coming from the environment for further processing by the Agent’s model(s).
Code example
How to use RLlib’s Repeated space to handle variable length observations.
Code example
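A short sketch of declaring such a space (the item shape and max length are illustrative):

```python
from gymnasium import spaces
from ray.rllib.utils.spaces.repeated import Repeated

# A variable-length list of at most 10 two-dimensional "items" per observation.
observation_space = Repeated(spaces.Box(-1.0, 1.0, shape=(2,)), max_len=10)
```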
Learning with auto-regressive action dependencies (e.g., two action components, where the distribution for the second component depends on the value actually sampled for the first).
Code example
A General Evaluation Platform and Building Toolkit for Single/Multi-Agent Intelligence with RLlib-generated baselines.
Code example
Example of training autonomous vehicles with RLlib and CARLA simulator.
Code example
Using Graph Neural Networks and RLlib to train multiple cooperative and adversarial agents to solve the “cover the area” problem, thereby learning how best to communicate (or, in the adversarial case, how to disturb communication).
Code example
A dense-traffic simulation environment with RLlib-generated baselines.
Code example
Example of setting up a multi-agent version of GFootball with RLlib.
Code example
A multi-agent AI research environment inspired by Massively Multiplayer Online (MMO) role-playing games.
Code example
Example of building packet classification trees using RLlib / multi-agent in a bandit-like setting.
Code example
Example of learning optimal LLVM vectorization compiler pragmas for loops in C and C++ code using RLlib.
Code example
Example of using the multi-agent API to model several social dilemma games.
Code example
Create a custom environment and train a single-agent RL algorithm using Ray 2.0 with Tune and AIR.
Code example
Example of training in StarCraft2 maps with RLlib / multi-agent.
Code example
Example of optimizing mixed-autonomy traffic simulations with RLlib / multi-agent.
Code example
Working with custom Keras models in RLlib
Tutorial
Getting Started with RLlib
Video
Deep reinforcement learning at Riot Games
Blog
The Magic of Merlin - Shopify’s New ML Platform
Tutorial
Large Scale Deep Learning Training and Tuning with Ray
Blog
Griffin: How Instacart’s ML Platform Tripled in a year
Video
Predibase - A low-code deep learning platform built for scale
Blog
Building a ML Platform with Kubeflow and Ray on GKE
Video
Ray Summit Panel - ML Platform on Ray
Code example
AutoML for Time Series with Ray
Blog
Highly Available and Scalable Online Applications on Ray at Ant Group
Blog
Ray Forward 2022 Conference: Hyper-scale Ray Application Use Cases
Blog
A new world record on the CloudSort benchmark using Ray
Code example
Speed up your web crawler by parallelizing it with Ray
Tutorial
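The core idea is to turn each fetch into a Ray task so network I/O overlaps. A minimal sketch (the seed URLs are stand-ins):

```python
import urllib.request
import ray

ray.init()

@ray.remote
def fetch(url: str):
    # Each fetch runs as a separate Ray task, so downloads proceed in parallel.
    with urllib.request.urlopen(url, timeout=10) as resp:
        return url, len(resp.read())

urls = ["https://www.ray.io", "https://docs.ray.io"]  # stand-in seed list
print(ray.get([fetch.remote(u) for u in urls]))
```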
Image Classification Batch Inference with Huggingface Vision Transformer
Tutorial
Image Classification Batch Inference with PyTorch ResNet152
Tutorial
Object Detection Batch Inference with PyTorch FasterRCNN_ResNet50
Tutorial
Processing the NYC taxi dataset
Tutorial
Batch Training with Ray Data
Tutorial
Scaling OCR with Ray Data
Code example
Random Data Access (Experimental)
Tutorial
Implementing a Custom Datasource
Code example
Build Batch Prediction Using Ray
Code example
Build a Simple Parameter Server Using Ray
Code example
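The essence is one actor holding the parameters and many tasks pushing updates to it. A minimal sketch with fake gradients (learning rate and dimensions are illustrative):

```python
import random
import ray

ray.init()

@ray.remote
class ParameterServer:
    def __init__(self, dim: int):
        self.params = [0.0] * dim

    def get_params(self):
        return self.params

    def apply_gradients(self, grads):
        # Simple SGD step; 0.1 is an illustrative learning rate.
        self.params = [p - 0.1 * g for p, g in zip(self.params, grads)]

@ray.remote
def worker(ps):
    for _ in range(5):
        params = ray.get(ps.get_params.remote())
        fake_grads = [random.random() for _ in params]  # stand-in gradients
        ray.get(ps.apply_gradients.remote(fake_grads))

ps = ParameterServer.remote(10)
ray.get([worker.remote(ps) for _ in range(4)])
print(ray.get(ps.get_params.remote()))
```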
Simple Parallel Model Selection
Code example
Fault-Tolerant Fairseq Training
Code example
Learning to Play Pong
Code example
Asynchronous Advantage Actor Critic (A3C)
Code example
A Gentle Introduction to Ray Core by Example
Code example
Using Ray for Highly Parallelizable Tasks
Code example
Running a Simple MapReduce Example with Ray Core
Code example
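The shape of such a job, sketched as a word count (input chunks are toy data):

```python
import ray

ray.init()

@ray.remote
def map_fn(chunk: str) -> dict:
    # Map phase: count words in one chunk of the input.
    counts = {}
    for word in chunk.split():
        counts[word] = counts.get(word, 0) + 1
    return counts

@ray.remote
def reduce_fn(*partials: dict) -> dict:
    # Reduce phase: merge the per-chunk counts.
    total = {}
    for counts in partials:
        for word, c in counts.items():
            total[word] = total.get(word, 0) + c
    return total

chunks = ["the quick brown fox", "the lazy dog", "the fox"]
print(ray.get(reduce_fn.remote(*[map_fn.remote(c) for c in chunks])))
```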
Benchmark example for the PyTorch data transfer auto pipeline
Tutorial
How To Use Tune’s Scikit-Learn Adapters?
Code example
Simple example for doing a basic random and grid search.
Code example
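A sketch of combining both in one run (assuming the Ray 2.x Tuner and AIR session APIs; the objective is a toy function):

```python
from ray import tune
from ray.air import session

def objective(config):
    # Toy objective: minimize x^2 + y.
    session.report({"score": config["x"] ** 2 + config["y"]})

tuner = tune.Tuner(
    objective,
    param_space={
        "x": tune.grid_search([-1.0, 0.0, 1.0]),  # grid-searched
        "y": tune.uniform(0.0, 1.0),              # randomly sampled
    },
    # Repeat the grid twice, re-sampling `y` each time.
    tune_config=tune.TuneConfig(num_samples=2),
)
results = tuner.fit()
print(results.get_best_result(metric="score", mode="min").config)
```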
Example of using a simple tuning function with AsyncHyperBandScheduler.
Code example
Example of using a Trainable function with HyperBandScheduler. Also uses the AsyncHyperBandScheduler.
Tutorial
Configuring and running (synchronous) PBT and understanding the underlying algorithm behavior with a simple example.
Tutorial
Example of using the function API with a PopulationBasedTraining scheduler.
Code example
Example of using the Population-based Bandits (PB2) scheduler.
Code example
Example of custom loggers and custom trial directory naming.
Code example
Basics of using Tune.
Code example
Using Search algorithms and Trial Schedulers to optimize your model.
Code example
Using Population-Based Training (PBT).
Code example
Fine-tuning Huggingface Transformers with PBT.
Code example
Logging Tune Runs to Comet ML.
Tutorial
Using Ray Serve to deploy a chatbot
Code example
Fine-tune vicuna-13b-v1.3 with DeepSpeed, PyTorch Lightning and Ray Train