BentoML: Service with sklearn model fails on my EKS cluster
I have created a simple service:
import bentoml
import numpy as np
from bentoml.io import NumpyNdarray

model_runner = bentoml.sklearn.load_runner("mymodel:latest")
svc = bentoml.Service("myservice", runners=[model_runner])

@svc.api(input=NumpyNdarray(), output=NumpyNdarray())
def classify(input_series: np.ndarray) -> np.ndarray:
    return model_runner.run(input_series)
When I run it on my laptop (MacBook Pro M1), using
bentoml serve ./service.py:svc --reload
everything works fine when I invoke the generated classify API.
Now when I push this service to my Yatai server as a bento and deploy it to my K8s cluster (EKS), I get the following error when I invoke the API:
Looking at the code, the problem lies in https://github.com/bentoml/BentoML/blob/119b103e2417291b18127d64d38f092893c8de4f/bentoml/_internal/frameworks/sklearn.py#L163
In my case, _num_threads returns 0.
Digging a bit further, resource_quota.cpu is computed here: https://github.com/bentoml/BentoML/blob/119b103e2417291b18127d64d38f092893c8de4f/bentoml/_internal/runner/utils.py#L208.
Here are the values I get on the pod running the API:
source | value |
---|---|
file /sys/fs/cgroup/cpu/cpu.cfs_quota_us | -1 |
file /sys/fs/cgroup/cpu/cpu.cfs_period_us | 100000 |
file /sys/fs/cgroup/cpu/cpu.shares | 2 |
call to os.cpu_count() | 2 |
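To collect the same values on another pod, something like the script below could be run inside the container. It is only a sketch: the helper name read_cgroup_value is made up for illustration, and the paths assume a cgroup v1 hierarchy mounted at /sys/fs/cgroup/cpu, as on the pod above. On a cgroup v2 node these files will not exist and the helper returns None.

```python
import os
from pathlib import Path
from typing import Optional

# Assumption: cgroup v1 layout, matching the paths listed in the table above.
CGROUP_CPU_DIR = Path("/sys/fs/cgroup/cpu")


def read_cgroup_value(name: str, base: Path = CGROUP_CPU_DIR) -> Optional[int]:
    """Return the integer stored in a cgroup file, or None if it cannot be read."""
    try:
        return int((base / name).read_text().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    for name in ("cpu.cfs_quota_us", "cpu.cfs_period_us", "cpu.shares"):
        print(f"{name}: {read_cgroup_value(name)}")
    print(f"os.cpu_count(): {os.cpu_count()}")
```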
Given those values, query_cgroup_cpu_count() will return 0.001953125 (which equals 2 / 1024, i.e. cpu.shares divided by 1024), which once rounded will end up as 0, meaning n_jobs will always be 0. So the call will always fail on my pods.
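The arithmetic can be reproduced outside the cluster with a few lines of Python. This is not BentoML's actual code: estimate_cpu_count is a hypothetical stand-in whose logic (use the CFS quota when one is set, otherwise fall back to cpu.shares / 1024, capped by the host CPU count) is an assumption chosen only because it matches the value reported above. The max(1, ...) clamp at the end is one possible guard, not necessarily the fix the maintainers chose.

```python
def estimate_cpu_count(quota_us: int, period_us: int, shares: int, host_cpus: int) -> float:
    """Hypothetical approximation of a cgroup v1 CPU estimate (not BentoML's code)."""
    if quota_us > 0 and period_us > 0:
        limit = quota_us / period_us  # a hard CFS limit is configured
    else:
        limit = shares / 1024  # no hard limit (-1): fall back to relative shares
    return min(limit, float(host_cpus))


# Values taken from the table above.
cpu = estimate_cpu_count(quota_us=-1, period_us=100000, shares=2, host_cpus=2)

n_jobs = round(cpu)            # rounds down to 0, which joblib rejects
safe_n_jobs = max(1, n_jobs)   # clamping to at least one worker avoids the failure

print(cpu, n_jobs, safe_n_jobs)
```

A cpu.shares value of 2 is the minimum Kubernetes assigns, which typically means the container has no CPU request set. If that is the case here, adding a CPU request or limit to the deployment may work around the problem until a fix is released.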
About this issue
- State: closed
- Created 2 years ago
- Reactions: 2
- Comments: 17 (3 by maintainers)
One of our developers thinks we’ve identified the issue. Please stand by for a commit and release. We will get back to you with an ETA.
Thanks for the help in identifying this issue!!!