Develop Ray Serve Python scripts on RayCluster
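In this tutorial, you will learn how to debug Ray Serve scripts effectively against a RayCluster, which gives you better observability and faster iteration than developing the script directly with RayService. Many RayService issues are related to the Ray Serve Python scripts, so it is important to verify that a script is correct before deploying it to a RayService. This tutorial shows how to develop a Ray Serve Python script for a MobileNet image classifier on a RayCluster. You can deploy and serve the classifier on a local Kind cluster without a GPU. For more details, see ray-service.mobilenet.yaml and mobilenet-rayservice.md.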
Step 1: Install a KubeRay cluster
Follow this document to install the latest stable KubeRay operator from the Helm repository.
Step 2: Create a RayCluster CR
helm install raycluster kuberay/ray-cluster --version 1.0.0-rc.0
Step 3: Log in to the head Pod
export HEAD_POD=$(kubectl get pods --selector=ray.io/node-type=head -o custom-columns=POD:metadata.name --no-headers)
kubectl exec -it $HEAD_POD -- bash
Step 4: Prepare your Ray Serve Python scripts and run the Ray Serve application
# Execute the following command in the head Pod
git clone https://github.com/ray-project/serve_config_examples.git
cd serve_config_examples
# Try to launch the Ray Serve application
serve run mobilenet.mobilenet:app
# [Error message]
# from tensorflow.keras.preprocessing import image
# ModuleNotFoundError: No module named 'tensorflow'
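serve run mobilenet.mobilenet:app: The first mobilenet is the name of the directory inside serve_config_examples/, the second mobilenet is the name of the Python file inside the mobilenet/ directory, and app is the name of the variable in that Python file that represents the Ray Serve application. For more details, see the "import_path" section of rayservice-troubleshooting.md.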
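To make the `module:variable` structure of the import path concrete, here is a minimal Python sketch. It is an illustration only and not Ray Serve's actual implementation; the helper name `resolve_import_path` is hypothetical. It splits the string at the colon, imports the module part, and looks up the variable part.

```python
import importlib


def resolve_import_path(import_path: str):
    """Split '<module>:<attribute>' and return the named attribute.

    Hypothetical helper for illustration; not part of Ray Serve.
    """
    module_name, sep, attr_name = import_path.partition(":")
    if not sep or not module_name or not attr_name:
        raise ValueError(f"expected '<module>:<attribute>', got {import_path!r}")
    # For "mobilenet.mobilenet:app", this imports mobilenet/mobilenet.py,
    # found relative to the current directory or another sys.path entry ...
    module = importlib.import_module(module_name)
    # ... and then looks up the variable `app` inside that module.
    return getattr(module, attr_name)


if __name__ == "__main__":
    # Standard-library stand-in so the sketch runs anywhere:
    # module `os.path`, attribute `join`.
    print(resolve_import_path("os.path:join"))
```

Run from inside serve_config_examples/, the same lookup with "mobilenet.mobilenet:app" would need mobilenet/mobilenet.py to be importable, which is why the command is executed from that directory.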
Step 5: Change the Ray image from rayproject/ray:${RAY_VERSION} to rayproject/ray-ml:${RAY_VERSION}
# Uninstall RayCluster
helm uninstall raycluster
# Install the RayCluster CR with the Ray image `rayproject/ray-ml:${RAY_VERSION}`
helm install raycluster kuberay/ray-cluster --version 1.0.0-rc.0 --set image.repository=rayproject/ray-ml
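The error message in Step 4 indicates that the Ray image rayproject/ray:${RAY_VERSION} does not include the TensorFlow package.
Because TensorFlow is large, we use an image that already ships with TensorFlow instead of installing it through Runtime Environments.
In this step, we change the Ray image from rayproject/ray:${RAY_VERSION} to rayproject/ray-ml:${RAY_VERSION}.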
Step 6: Repeat Step 3 and Step 4
# Repeat Step 3 and Step 4 to log in to the new head Pod and run the Ray Serve application.
# You should successfully launch the Ray Serve application this time.
serve run mobilenet.mobilenet:app
# [Example output]
# (ServeReplica:default_ImageClassifier pid=139, ip=10.244.0.8) Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/mobilenet_v2/mobilenet_v2_weights_tf_dim_ordering_tf_kernels_1.0_224.h5
# 8192/14536120 [..............................] - ETA: 0s)
# 4202496/14536120 [=======>......................] - ETA: 0s)
# 12902400/14536120 [=========================>....] - ETA: 0s)
# 14536120/14536120 [==============================] - 0s 0us/step
# 2023-07-17 14:04:43,737 SUCC scripts.py:424 -- Deployed Serve app successfully.
Step 7: Submit a request to the Ray Serve application
# (On your local machine) Forward the serve port of the head Pod
kubectl port-forward --address 0.0.0.0 $HEAD_POD 8000
# Clone the repository on your local machine
git clone https://github.com/ray-project/serve_config_examples.git
cd serve_config_examples/mobilenet
# Prepare a sample image file. `stable_diffusion_example.png` is a cat image generated by the Stable Diffusion model.
curl -O https://raw.githubusercontent.com/ray-project/kuberay/master/docs/images/stable_diffusion_example.png
# Update `image_path` in `mobilenet_req.py` to the path of `stable_diffusion_example.png`
# Send a request to the Ray Serve application.
python3 mobilenet_req.py
# [Error message]
# Unexpected error, traceback: ray::ServeReplica:default_ImageClassifier.handle_request() (pid=139, ip=10.244.0.8)
# File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/serve/_private/utils.py", line 254, in wrap_to_ray_error
# raise exception
# File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/serve/_private/replica.py", line 550, in invoke_single
# result = await method_to_call(*args, **kwargs)
# File "./mobilenet/mobilenet.py", line 24, in __call__
# File "/home/ray/anaconda3/lib/python3.7/site-packages/starlette/requests.py", line 256, in _get_form
# ), "The `python-multipart` library must be installed to use form parsing."
# AssertionError: The `python-multipart` library must be installed to use form parsing..
python-multipart is required by the request parsing function starlette.requests.form(), so submitting a request to the Ray Serve application produces the error above.
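Both failures seen so far (missing TensorFlow in Step 4, missing python-multipart here) are missing-package problems. As a rough sketch, a check like the following could be run in the head Pod before launching the application. The function name `missing_modules` is hypothetical, and the import names are assumptions: `tensorflow` for TensorFlow and `multipart` for python-multipart 0.0.6 (other versions of that package may use a different import name).

```python
import importlib.util


def missing_modules(module_names):
    """Return the top-level module names that cannot be found on this interpreter."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed import names; adjust them to the packages your script really uses.
    missing = missing_modules(["tensorflow", "multipart"])
    print("missing modules:", missing or "none")
```

The check only confirms that a module can be located, not that it imports cleanly or matches the version the script expects.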
Step 8: Restart the Ray Serve application with a runtime environment
# In the head Pod, stop the Ray Serve application
serve shutdown
# Check the Ray Serve application status
serve status
# [Example output]
# There are no applications running on this cluster.
# Launch the Ray Serve application with runtime environment.
serve run mobilenet.mobilenet:app --runtime-env-json='{"pip": ["python-multipart==0.0.6"]}'
# (On your local machine) Submit a request to the Ray Serve application again, and you should get the correct prediction.
python3 mobilenet_req.py
# [Example output]
# {"prediction": ["n02123159", "tiger_cat", 0.2994779646396637]}
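The value passed to --runtime-env-json is a JSON document wrapped in shell quotes, which is easy to get wrong by hand once it grows. The sketch below, with a hypothetical helper name `build_serve_run_command`, shows one way to generate the same command string from a Python dictionary using only the standard library.

```python
import json
import shlex


def build_serve_run_command(import_path, runtime_env):
    """Build a `serve run` command line with a shell-quoted JSON runtime environment."""
    # json.dumps produces the JSON text; shlex.quote wraps it so the shell
    # passes it through unchanged as a single argument.
    flag = "--runtime-env-json=" + shlex.quote(json.dumps(runtime_env))
    return f"serve run {import_path} {flag}"


if __name__ == "__main__":
    print(build_serve_run_command(
        "mobilenet.mobilenet:app",
        {"pip": ["python-multipart==0.0.6"]},
    ))
```

For the dictionary used in this step, the generated string matches the command shown above character for character.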
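Step 9: Create a RayService YAML file
In the previous steps, we found that the Ray Serve application launches successfully with the Ray image rayproject/ray-ml:${RAY_VERSION} and the runtime environment python-multipart==0.0.6.
We can therefore create a RayService YAML file with the same Ray image and runtime environment.
For more details, see ray-service.mobilenet.yaml and mobilenet-rayservice.md.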
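As a rough illustration of how the findings carry over, the sketch below collects the import path and runtime environment from the earlier steps into a Serve application entry and prints it as JSON. It assumes the Serve config fields `applications`, `name`, `import_path`, and `runtime_env`; the application name "mobilenet" and the function name `build_serve_config` are hypothetical. A real RayService also needs the Ray image setting and a way to make the script available to the cluster, both of which this sketch leaves out, so ray-service.mobilenet.yaml remains the reference.

```python
import json


def build_serve_config():
    """Assemble a Serve application entry from the values verified on the RayCluster."""
    application = {
        "name": "mobilenet",  # hypothetical application name
        "import_path": "mobilenet.mobilenet:app",  # verified in Steps 4 and 6
        "runtime_env": {"pip": ["python-multipart==0.0.6"]},  # verified in Step 8
    }
    return {"applications": [application]}


if __name__ == "__main__":
    print(json.dumps(build_serve_config(), indent=2))
```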