serving: Unexpectedly high CPU utilization of Docker container, even without models loaded.
System Information
- Host OS: Microsoft Windows 10 Pro (Version 10.0.19041 Build 19041)
- TensorFlow Serving installed from Docker: tensorflow/serving:latest-devel and Ubuntu:18.04
- Docker Desktop: 2.3.0.4 (46911) (using Linux containers)
The Problem
TensorFlow Serving's Docker container uses significantly more CPU than expected or desirable.
Evaluation
docker stats shows the following:
CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
0e5fca607055 linguistic-intelligence_tensorflow-servings_1 14.75% 24.26MiB / 12.43GiB 0.19% 1.19kB / 0B 0B / 0B 29
Notice that the container is reported as using 14.75% CPU, and that is after it has been running for 15 minutes.
And here is the issue: this is with zero models loaded. With no models, the sustained CPU utilization hovers around 6-17%. (With models loaded, the sustained CPU utilization is around 25%.)
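For reference, here is a rough way to sample the container's CPU over time (a sketch only; the container name comes from the compose project below, and this is not necessarily how the figures above were captured):

# Print the container's CPU percentage every 5 seconds
while true; do
  docker stats --no-stream --format "{{.Name}}: {{.CPUPerc}}" linguistic-intelligence_tensorflow-servings_1
  sleep 5
done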
Expected Behaviour:
Since TF Serving has been given zero work to do, CPU utilization is expected to be minimal.
Exact Steps to Reproduce
Running the Docker Container:
docker-compose up tensorflow-servings
docker-compose.yaml:
tensorflow-servings:
build:
context: .
dockerfile: tensorflow-servings.dockerfile
image: tensorflow-servings
ports:
- 8500:8500
- 8501:8501
volumes:
- ./ml-models/:/models/:ro
- ./ml-servings-config/:/config/:ro
deploy:
resources:
limits:
memory: 4Gb
environment:
- MODEL_CONFIG_FILE_POLL_WAIT_SECONDS=0
tensorflow-servings.dockerfile:
ARG TF_SERVING_VERSION=latest
ARG TF_SERVING_BUILD_IMAGE=tensorflow/serving:${TF_SERVING_VERSION}-devel
FROM ${TF_SERVING_BUILD_IMAGE} as build_image
FROM ubuntu:18.04
ARG TF_SERVING_VERSION_GIT_BRANCH=master
ARG TF_SERVING_VERSION_GIT_COMMIT=head
LABEL tensorflow_serving_github_branchtag=${TF_SERVING_VERSION_GIT_BRANCH}
LABEL tensorflow_serving_github_commit=${TF_SERVING_VERSION_GIT_COMMIT}
RUN apt-get update && apt-get install -y --no-install-recommends \
ca-certificates \
&& \
apt-get clean && \
rm -rf /var/lib/apt/lists/*
# Install TF Serving pkg
COPY --from=build_image /usr/local/bin/tensorflow_model_server /usr/bin/tensorflow_model_server
# PORTS:
# gRPC
EXPOSE 8500
# REST
EXPOSE 8501
# Set where models should be stored in the container
ENV MODEL_BASE_PATH=/models/
# Embed the models
COPY ./ml-models /models
COPY ./ml-servings-config /config
# Create a script that runs the model server so we can use environment variables
# while also passing in arguments from the docker command line
RUN echo '#!/bin/bash \n\n\
tensorflow_model_server \
--allow_version_labels_for_unavailable_models=true \
--batching_parameters_file=/config/batching_parameters.txt \
--enable_batching=true \
--model_base_path=${MODEL_BASE_PATH} \
--model_config_file=/config/all_models.config \
--model_config_file_poll_wait_seconds=60 \
--monitoring_config_file=/config/monitoring_config.txt \
--port=8500 \
--rest_api_port=8501 \
--rest_api_timeout_in_ms=30000 \
"$@"' > /usr/bin/tf_serving_entrypoint.sh \
&& chmod +x /usr/bin/tf_serving_entrypoint.sh
ENTRYPOINT ["/usr/bin/tf_serving_entrypoint.sh"]
all_models.config:
model_config_list {
}
Logs: When starting the container:
$ docker-compose up tensorflow-servings
WARNING: Some services (natural-language-server, tensorflow-servings) use the 'deploy' key, which will be ignored. Compose does not support 'deploy' configuration - use `docker stack deploy` to deploy to a swarm.
Creating linguistic-intelligence_tensorflow-servings_1 ... done
Attaching to linguistic-intelligence_tensorflow-servings_1
tensorflow-servings_1 | 2020-09-07 09:27:05.197858: I tensorflow_serving/model_servers/server_core.cc:464] Adding/updating models.
tensorflow-servings_1 | 2020-09-07 09:27:05.204117: I tensorflow_serving/model_servers/server.cc:367] Running gRPC ModelServer at 0.0.0.0:8500 ...
tensorflow-servings_1 | 2020-09-07 09:27:05.207537: I tensorflow_serving/model_servers/server_core.cc:464] Adding/updating models.
tensorflow-servings_1 | [warn] getaddrinfo: address family for nodename not supported
tensorflow-servings_1 | [evhttp_server.cc : 238] NET_LOG: Entering the event loop ...
tensorflow-servings_1 | 2020-09-07 09:27:05.215164: I tensorflow_serving/model_servers/server.cc:387] Exporting HTTP/REST API at:localhost:8501 ...
tensorflow-servings_1 | 2020-09-07 09:28:05.205505: I tensorflow_serving/model_servers/server_core.cc:464] Adding/updating models.
tensorflow-servings_1 | 2020-09-07 09:29:05.206871: I tensorflow_serving/model_servers/server_core.cc:464] Adding/updating models.
tensorflow-servings_1 | 2020-09-07 09:30:05.206192: I tensorflow_serving/model_servers/server_core.cc:464] Adding/updating models.
tensorflow-servings_1 | 2020-09-07 09:31:05.206607: I tensorflow_serving/model_servers/server_core.cc:464] Adding/updating models.
tensorflow-servings_1 | 2020-09-07 09:32:05.207374: I tensorflow_serving/model_servers/server_core.cc:464] Adding/updating models.
tensorflow-servings_1 | 2020-09-07 09:33:05.207560: I tensorflow_serving/model_servers/server_core.cc:464] Adding/updating models.
tensorflow-servings_1 | 2020-09-07 09:34:05.207728: I tensorflow_serving/model_servers/server_core.cc:464] Adding/updating models.
tensorflow-servings_1 | 2020-09-07 09:35:05.206796: I tensorflow_serving/model_servers/server_core.cc:464] Adding/updating models.
tensorflow-servings_1 | 2020-09-07 09:36:05.213456: I tensorflow_serving/model_servers/server_core.cc:464] Adding/updating models.
tensorflow-servings_1 | 2020-09-07 09:37:05.207654: I tensorflow_serving/model_servers/server_core.cc:464] Adding/updating models.
tensorflow-servings_1 | 2020-09-07 09:38:05.208601: I tensorflow_serving/model_servers/server_core.cc:464] Adding/updating models.
tensorflow-servings_1 | 2020-09-07 09:39:05.208775: I tensorflow_serving/model_servers/server_core.cc:464] Adding/updating models.
tensorflow-servings_1 | 2020-09-07 09:40:05.209190: I tensorflow_serving/model_servers/server_core.cc:464] Adding/updating models.
tensorflow-servings_1 | 2020-09-07 09:41:05.209411: I tensorflow_serving/model_servers/server_core.cc:464] Adding/updating models.
tensorflow-servings_1 | 2020-09-07 09:42:05.210525: I tensorflow_serving/model_servers/server_core.cc:464] Adding/updating models.
tensorflow-servings_1 | 2020-09-07 09:43:05.210102: I tensorflow_serving/model_servers/server_core.cc:464] Adding/updating models.
tensorflow-servings_1 | 2020-09-07 09:44:05.210034: I tensorflow_serving/model_servers/server_core.cc:464] Adding/updating models.
batching_parameters.txt:
max_batch_size { value: 64 }
batch_timeout_micros { value: 0 }
num_batch_threads { value: 4 }
pad_variable_length_inputs: true
Thanks for your attention.
About this issue
- Original URL
- State: closed
- Created 4 years ago
- Reactions: 2
- Comments: 15 (5 by maintainers)
I tried to find the cause so this can be fixed; here is what I found so far.
perf shows that the serving binary spends its CPU time calling functions in libstdc++, and gdb shows the function names in the call stack.
kTimeoutMillis appears to be 1 ms (possibly inlined, not sure). I cannot find why there is such high CPU utilization; hope someone can help.
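For anyone who wants to repeat this kind of profiling, a rough sketch of the commands involved (assumptions: perf and gdb must first be installed inside the container, which they are not in the slim image above, and the package names below may differ by distro/kernel):

# Open a shell in the running container (name assumed from the compose file above)
docker exec -it linguistic-intelligence_tensorflow-servings_1 bash
# Install profiling tools (availability varies; perf generally needs to match the running kernel)
apt-get update && apt-get install -y linux-tools-generic gdb
# Sample where the model server spends CPU time
perf top -p "$(pgrep -f tensorflow_model_server)"
# Dump all threads' call stacks
gdb -p "$(pgrep -f tensorflow_model_server)" -batch -ex "thread apply all bt"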
Hi @ericmclachlan, sorry for the inconvenience. We have an internal task tracking this resource-inefficiency issue and will keep you posted when it gets fixed. Meanwhile, feel free to contribute to the repo directly if you have any ideas about the batch scheduler.
Eric is right. The unexpectedly high CPU utilization is caused by the batching configuration: the four batch threads busy-wait for incoming requests and therefore consume CPU even when the server is idle. CPU usage is proportional to num_batch_threads.
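If idle CPU matters more than batching throughput, one possible workaround (a sketch, not an official fix) is to reduce num_batch_threads in batching_parameters.txt, since the busy-waiting cost scales with the number of batch threads, or to drop --enable_batching=true altogether if batching is not needed:

# batching_parameters.txt (sketch): fewer batch threads means less busy-waiting and
# lower idle CPU, at the cost of less parallel batching under real traffic
max_batch_size { value: 64 }
batch_timeout_micros { value: 0 }
num_batch_threads { value: 1 }
pad_variable_length_inputs: true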