Seldon Python Server Configuration
========================================

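To serve your component, Seldon's Python wrapper uses
`Gunicorn <https://gunicorn.org/>`__ by default. Gunicorn
is a high-performance HTTP server for Unix that lets you easily
scale your model across multiple worker processes and threads.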

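.. Note::
  Gunicorn only handles the vertical scaling of your model
  **within the same pod and container**. To learn more about
  scaling your model across multiple pod replicas, see the
  `scaling <../graph/scaling>`_ section of the docs.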
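
As a rough illustration of the difference, the ``replicas`` field of a
predictor (set to ``1`` in the examples on this page) controls the number of
pods, independently of any Gunicorn setting. The fragment below is only a
sketch: the value ``3`` is an arbitrary choice, and the scaling section linked
above remains the reference for the available options.

.. code:: yaml

    # Fragment of a SeldonDeployment (illustrative values only)
    spec:
      predictors:
      - name: example
        replicas: 3   # number of pods; Gunicorn settings apply inside each pod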

Workers
-------

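By default, Seldon uses a **single worker process**. However,
you can increase this number through the ``GUNICORN_WORKERS``
environment variable for REST and the ``GRPC_WORKERS`` environment variable for GRPC.
Both variables can be controlled directly through the ``SeldonDeployment`` CRD.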

For example, to run your model under 8 processes (4 REST and 4 GRPC), you could do:

.. code:: yaml
    :emphasize-lines: 14-17

    apiVersion: machinelearning.seldon.io/v1
    kind: SeldonDeployment
    metadata:
      name: gunicorn
    spec:
      name: worker
      predictors:
      - componentSpecs:
        - spec:
            containers:
            - image: seldonio/mock_classifier:1.0
              name: classifier
              env:
              - name: GUNICORN_WORKERS
                value: '4'
              - name: GRPC_WORKERS
                value: '4'
            terminationGracePeriodSeconds: 1
        graph:
          children: []
          endpoint:
            type: REST
          name: classifier
          type: MODEL
        labels:
          version: v1
        name: example
        replicas: 1


Running only the REST server by disabling the GRPC server
------------------------------------------------------------

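By default, Seldon runs a single worker process for each of the REST and GRPC servers.
If the machine learning model is loaded in each process, this can cause
significant overhead when the model artifacts are very large, because
an instance of the model is loaded for every worker. In that case, you can
set ``GRPC_WORKERS`` to ``0`` to disable the GRPC server, so that it is never started.
Note that the GRPC endpoint is still exposed by the service orchestrator
even though the model no longer serves it, so GRPC requests will fail. An example is shown below: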

.. code:: yaml
    :emphasize-lines: 14-15

    apiVersion: machinelearning.seldon.io/v1
    kind: SeldonDeployment
    metadata:
      name: gunicorn
    spec:
      name: worker
      predictors:
      - componentSpecs:
        - spec:
            containers:
            - image: seldonio/mock_classifier:1.0
              name: classifier
              env:
              - name: GRPC_WORKERS
                value: '0'
            terminationGracePeriodSeconds: 1
        graph:
          children: []
          endpoint:
            type: REST
          name: classifier
          type: MODEL
        labels:
          version: v1
        name: example
        replicas: 1


Threads
-------

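By default, Seldon processes your model's incoming requests using
a pool of **10 threads per worker process**. You can change this number
through the ``GUNICORN_THREADS`` environment variable, which can be
controlled directly through the ``SeldonDeployment`` CRD.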

For example, to run your model with 5 threads per worker, you could do:

.. code:: yaml
    :emphasize-lines: 14-15


    apiVersion: machinelearning.seldon.io/v1
    kind: SeldonDeployment
    metadata:
      name: gunicorn
    spec:
      name: worker
      predictors:
      - componentSpecs:
        - spec:
            containers:
            - image: seldonio/mock_classifier:1.0
              name: classifier
              env:
              - name: GUNICORN_THREADS
                value: '5'
            terminationGracePeriodSeconds: 1
        graph:
          children: []
          endpoint:
            type: REST
          name: classifier
          type: MODEL
        labels:
          version: v1
        name: example
        replicas: 1
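
Workers and threads can also be combined. As a sketch with arbitrary values,
the ``env`` fragment below would give the REST server 2 worker processes with
5 threads each, so roughly 2 × 5 = 10 requests could be in flight at once.
Keep in mind that each worker process may load its own copy of the model.

.. code:: yaml

    # Fragment of the container spec (illustrative values only)
    env:
    - name: GUNICORN_WORKERS
      value: '2'    # worker processes for the REST server
    - name: GUNICORN_THREADS
      value: '5'    # threads per worker process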

禁用多线程
~~~~~~~~~~~~~~~~~~~~~~

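In some cases, you may want to disable multithreading completely. To serve
your model within a single thread, set the environment variable
``FLASK_SINGLE_THREADED`` to ``1``. This is not the optimal setup for most
models, but it can be useful when your model cannot be made thread-safe,
like many GPU-based models that deadlock when accessed
from multiple threads.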

.. code:: yaml
    :emphasize-lines: 14-15

    apiVersion: machinelearning.seldon.io/v1alpha2
    kind: SeldonDeployment
    metadata:
      name: flaskexample
    spec:
      name: worker
      predictors:
      - componentSpecs:
        - spec:
            containers:
            - image: seldonio/mock_classifier:1.0
              name: classifier
              env:
              - name: FLASK_SINGLE_THREADED
                value: '1'
            terminationGracePeriodSeconds: 1
        graph:
          children: []
          endpoint:
            type: REST
          name: classifier
          type: MODEL
        labels:
          version: v1
        name: example
        replicas: 1

Development server
--------------------

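While Gunicorn is recommended for production workloads,
it is also possible to use Flask's built-in development server. To enable
the development server, set the ``SELDON_DEBUG`` environment variable to ``1``.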

.. code:: yaml
    :emphasize-lines: 14-15

    apiVersion: machinelearning.seldon.io/v1
    kind: SeldonDeployment
    metadata:
      name: flask-development-server
    spec:
      name: worker
      predictors:
      - componentSpecs:
        - spec:
            containers:
            - image: seldonio/mock_classifier:1.0
              name: classifier
              env:
              - name: SELDON_DEBUG
                value: '1'
            terminationGracePeriodSeconds: 1
        graph:
          children: []
          endpoint:
            type: REST
          name: classifier
          type: MODEL
        labels:
          version: v1
        name: example
        replicas: 1

Configuration
--------------------

The Python server can be configured using environment variables or
command line flags.


.. list-table::
    :header-rows: 1

    * - CLI Flag
      - Environment Variable
      - Default
      - Notes
    * - ``interface_name``
      - N/A
      - N/A
      - First required argument. If it contains ``.``, the first part is interpreted as the module name.
    * - ``--http-port``
      - ``PREDICTIVE_UNIT_HTTP_SERVICE_PORT``
      - ``9000``
      - HTTP port of the Seldon service. In k8s this is controlled by the Seldon Core Operator.
    * - ``--grpc-port``
      - ``PREDICTIVE_UNIT_GRPC_SERVICE_PORT``
      - ``5000``
      - GRPC port of the Seldon service. In k8s this is controlled by the Seldon Core Operator.
    * - ``--metrics-port``
      - ``PREDICTIVE_UNIT_METRICS_SERVICE_PORT``
      - ``6000``
      - Metrics port of the Seldon service. In k8s this is controlled by the Seldon Core Operator.
    * - ``--service-type``
      - N/A
      - ``MODEL``
      - Service type of the model. Can be ``MODEL``, ``ROUTER``, ``TRANSFORMER``, ``COMBINER`` or ``OUTLIER_DETECTOR``.
    * - ``--parameters``
      - N/A
      - ``[]``
      - List of parameters to be passed to the model class.
    * - ``--log-level``
      - ``LOG_LEVEL_ENV``
      - ``INFO``
      - Python log level. Can be ``DEBUG``, ``INFO``, ``WARNING`` or ``ERROR``.
    * - ``--debug``
      - ``SELDON_DEBUG``
      - ``false``
      - Enable the ``flask`` development server and set the log level to ``DEBUG``. Values ``1``, ``true`` or ``t`` (case-insensitive) are treated as ``True``.
    * - ``--tracing``
      - ``TRACING``
      - ``0``
      - Enable tracing. Can be ``0`` or ``1``.
    * - ``--workers``
      - ``GUNICORN_WORKERS``
      - ``1``
      - Number of Gunicorn workers for handling requests.
    * - ``--threads``
      - ``GUNICORN_THREADS``
      - ``10``
      - Number of Gunicorn threads for handling requests.
    * - ``--max-requests``
      - ``GUNICORN_MAX_REQUESTS``
      - ``0``
      - Maximum number of requests a Gunicorn worker processes before being restarted.
    * - ``--max-requests-jitter``
      - ``GUNICORN_MAX_REQUESTS_JITTER``
      - ``0``
      - Maximum random jitter to add to max-requests.
    * - ``--keepalive``
      - ``GUNICORN_KEEPALIVE``
      - ``2``
      - Number of seconds to wait for requests on a Keep-Alive connection.
    * - ``--access-log``
      - ``GUNICORN_ACCESS_LOG``
      - ``false``
      - Enable the Gunicorn access log.
    * - ``--pidfile``
      - N/A
      - None
      - File path to use for the Gunicorn PID file.
    * - ``--single-threaded``
      - ``FLASK_SINGLE_THREADED``
      - ``0``
      - Force the Flask app to run single-threaded. Also applies to Gunicorn. Can be ``0`` or ``1``.
    * - N/A
      - ``FILTER_METRICS_ACCESS_LOGS``
      - ``not debug``
      - Filter out logs related to Prometheus accessing the metrics port. Enabled by default in production and disabled in debug mode.
    * - N/A
      - ``PREDICTIVE_UNIT_METRICS_ENDPOINT``
      - ``/metrics``
      - Endpoint name for Prometheus metrics. Defaults to ``/prometheus`` in k8s deployments.
    * - N/A
      - ``PAYLOAD_PASSTHROUGH``
      - ``false``
      - Skip decoding of payloads.
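
The environment variables listed above are set the same way as in the earlier
examples, through the container's ``env`` list. As a sketch with arbitrary
values, the fragment below would restart each Gunicorn worker after about
1000 requests, add up to 50 requests of random jitter so that workers do not
all restart at the same moment, and wait 5 seconds for requests on Keep-Alive
connections.

.. code:: yaml

    # Fragment of the container spec (illustrative values only)
    env:
    - name: GUNICORN_MAX_REQUESTS
      value: '1000'   # restart a worker after this many requests
    - name: GUNICORN_MAX_REQUESTS_JITTER
      value: '50'     # maximum random jitter added to the limit above
    - name: GUNICORN_KEEPALIVE
      value: '5'      # seconds to wait for requests on a Keep-Alive connection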