mlflow: Can't import mlflow due to protobuf update to version 4.21 [BUG]
System information
- mlflow version: 1.26
- Python version: 3.7
Describe the problem
Cannot import mlflow with the latest protobuf release (4.21) installed.
Tracking information
No response
Code to reproduce issue
import mlflow
Other info / logs
TypeError Traceback (most recent call last)
<command-4092054516117478> in <module>
----> 1 import mlflow
/local_disk0/pythonVirtualEnvDirs/virtualEnv-b0db65a3-9256-450f-9467-e817ece7ad9e/lib/python3.7/importlib/_bootstrap.py in _find_and_load(name, import_)
/local_disk0/pythonVirtualEnvDirs/virtualEnv-b0db65a3-9256-450f-9467-e817ece7ad9e/lib/python3.7/importlib/_bootstrap.py in _find_and_load_unlocked(name, import_)
/local_disk0/pythonVirtualEnvDirs/virtualEnv-b0db65a3-9256-450f-9467-e817ece7ad9e/lib/python3.7/importlib/_bootstrap.py in _load_unlocked(spec)
/local_disk0/pythonVirtualEnvDirs/virtualEnv-b0db65a3-9256-450f-9467-e817ece7ad9e/lib/python3.7/importlib/_bootstrap.py in _load_backward_compatible(spec)
/local_disk0/tmp/1653558472209-0/PostImportHook.py in load_module(self, fullname)
187
188 def load_module(self, fullname):
--> 189 module = self.loader.load_module(fullname)
190 notify_module_loaded(module)
191
/databricks/python/lib/python3.7/site-packages/mlflow/__init__.py in <module>
30 from mlflow.version import VERSION as __version__ # pylint: disable=unused-import
31 from mlflow.utils.logging_utils import _configure_mlflow_loggers
---> 32 import mlflow.tracking._model_registry.fluent
33 import mlflow.tracking.fluent
34
/databricks/python/lib/python3.7/site-packages/mlflow/tracking/__init__.py in <module>
6 """
7
----> 8 from mlflow.tracking.client import MlflowClient
9 from mlflow.tracking._tracking_service.utils import (
10 set_tracking_uri,
/local_disk0/pythonVirtualEnvDirs/virtualEnv-b0db65a3-9256-450f-9467-e817ece7ad9e/lib/python3.7/importlib/_bootstrap.py in _find_and_load(name, import_)
/local_disk0/pythonVirtualEnvDirs/virtualEnv-b0db65a3-9256-450f-9467-e817ece7ad9e/lib/python3.7/importlib/_bootstrap.py in _find_and_load_unlocked(name, import_)
/local_disk0/pythonVirtualEnvDirs/virtualEnv-b0db65a3-9256-450f-9467-e817ece7ad9e/lib/python3.7/importlib/_bootstrap.py in _load_unlocked(spec)
/local_disk0/pythonVirtualEnvDirs/virtualEnv-b0db65a3-9256-450f-9467-e817ece7ad9e/lib/python3.7/importlib/_bootstrap.py in _load_backward_compatible(spec)
/local_disk0/tmp/1653558472209-0/PostImportHook.py in load_module(self, fullname)
187
188 def load_module(self, fullname):
--> 189 module = self.loader.load_module(fullname)
190 notify_module_loaded(module)
191
/databricks/python/lib/python3.7/site-packages/mlflow/tracking/client.py in <module>
14 from typing import Any, Dict, Sequence, List, Optional, Union, TYPE_CHECKING
15
---> 16 from mlflow.entities import Experiment, Run, RunInfo, Param, Metric, RunTag, FileInfo, ViewType
17 from mlflow.store.entities.paged_list import PagedList
18 from mlflow.entities.model_registry import RegisteredModel, ModelVersion
/databricks/python/lib/python3.7/site-packages/mlflow/entities/__init__.py in <module>
4 """
5
----> 6 from mlflow.entities.experiment import Experiment
7 from mlflow.entities.experiment_tag import ExperimentTag
8 from mlflow.entities.file_info import FileInfo
/databricks/python/lib/python3.7/site-packages/mlflow/entities/experiment.py in <module>
1 from mlflow.entities._mlflow_object import _MLflowObject
----> 2 from mlflow.entities.experiment_tag import ExperimentTag
3 from mlflow.protos.service_pb2 import (
4 Experiment as ProtoExperiment,
5 ExperimentTag as ProtoExperimentTag,
/databricks/python/lib/python3.7/site-packages/mlflow/entities/experiment_tag.py in <module>
1 from mlflow.entities._mlflow_object import _MLflowObject
----> 2 from mlflow.protos.service_pb2 import ExperimentTag as ProtoExperimentTag
3
4
5 class ExperimentTag(_MLflowObject):
/databricks/python/lib/python3.7/site-packages/mlflow/protos/service_pb2.py in <module>
16
17
---> 18 from .scalapb import scalapb_pb2 as scalapb_dot_scalapb__pb2
19 from . import databricks_pb2 as databricks__pb2
20
/databricks/python/lib/python3.7/site-packages/mlflow/protos/scalapb/scalapb_pb2.py in <module>
33 message_type=None, enum_type=None, containing_type=None,
34 is_extension=True, extension_scope=None,
---> 35 serialized_options=None, file=DESCRIPTOR)
36 MESSAGE_FIELD_NUMBER = 1020
37 message = _descriptor.FieldDescriptor(
/databricks/python/lib/python3.7/site-packages/google/protobuf/descriptor.py in __new__(cls, name, full_name, index, number, type, cpp_type, label, default_value, message_type, enum_type, containing_type, is_extension, extension_scope, options, serialized_options, has_default_value, containing_oneof, json_name, file, create_key)
558 has_default_value=True, containing_oneof=None, json_name=None,
559 file=None, create_key=None): # pylint: disable=redefined-builtin
--> 560 _message.Message._CheckCalledFromGeneratedFile()
561 if is_extension:
562 return _message.default_pool.FindExtensionByName(full_name)
TypeError: Descriptors cannot not be created directly.
If this call came from a _pb2.py file, your generated code is out of date and must be regenerated with protoc >= 3.19.0.
If you cannot immediately regenerate your protos, some other possible workarounds are:
1. Downgrade the protobuf package to 3.20.x or lower.
2. Set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (but this will use pure-Python parsing and will be much slower).
More information: https://developers.google.com/protocol-buffers/docs/news/2022-05-06#python-updates
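The upstream message above lists the two practical workarounds. A minimal sketch of the second one, assuming the variable can be set from within the entry-point script before anything imports protobuf:

# Workaround 2: force the pure-Python protobuf implementation.
# Must run before protobuf (and therefore mlflow) is first imported.
import os
os.environ["PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"] = "python"

import mlflow  # imports cleanly now, at the cost of slower protobuf parsing

Setting the variable in the shell instead (export PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python) avoids the import-ordering concern entirely.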
What component(s) does this bug affect?
- area/artifacts: Artifact stores and artifact logging
- area/build: Build and test infrastructure for MLflow
- area/docs: MLflow documentation pages
- area/examples: Example code
- area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
- area/models: MLmodel format, model serialization/deserialization, flavors
- area/projects: MLproject format, project running backends
- area/scoring: MLflow Model server, model deployment tools, Spark UDFs
- area/server-infra: MLflow Tracking server backend
- area/tracking: Tracking Service, tracking client APIs, autologging
What interface(s) does this bug affect?
- area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
- area/docker: Docker use across MLflow's components, such as MLflow Projects and MLflow Models
- area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
- area/windows: Windows support
What language(s) does this bug affect?
- language/r: R APIs and clients
- language/java: Java APIs and clients
- language/new: Proposals for new client languages
What integration(s) does this bug affect?
- integrations/azure: Azure and Azure ML integrations
- integrations/sagemaker: SageMaker integrations
- integrations/databricks: Databricks integrations
About this issue
- Original URL
- State: closed
- Created 2 years ago
- Reactions: 7
- Comments: 15 (7 by maintainers)
Commits related to this issue
- Specify protobuf version to fix import mlflow error The current code will fail at the data_prep component due to the import mlflow issue, see https://github.com/mlflow/mlflow/issues/5949. Requiring ... — committed to thongonary/azureml-examples by thongonary 2 years ago
- Build: downgrade protobuf to 3.20.1 (https://github.com/mlflow/mlflow/issues/5949) — committed to tanlin2013/mbl by tanlin2013 2 years ago
- Update MlFlow Version Current version affected by https://github.com/mlflow/mlflow/issues/5949 --MlFlow won't start due to this if deployed as is — committed to cmosh/amazon-sagemaker-mlflow-fargate by cmosh 2 years ago
- Pin protobuf version to 3.20.0 Pins the protobuf version as a workaround to this issue: https://github.com/mlflow/mlflow/issues/5949 — committed to cmosh/amazon-sagemaker-mlflow-fargate by cmosh 2 years ago
Same for mlflow version 1.11.0. I set protobuf to 3.20.0 and it works. But I think mlflow should have all its requirements frozen to a specific version.
Semantic versioning requires that packages make breaking changes only when the major version changes.
I think the mlflow package should, at all times, put an upper limit on the major version of each direct dependency. When a new version of a direct dependency comes out, the MLflow team can bump the upper limit and test for regressions (a sketch of such a pin follows below).
The likelihood of permanent breakage like this one would be an order of magnitude smaller than it is now.
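A hypothetical sketch of the kind of upper-bound pin proposed here, as it could appear in a downstream project's setup.py (the package name and exact bounds are illustrative, not MLflow's actual constraints):

from setuptools import setup

setup(
    name="my-project",  # hypothetical package, for illustration only
    install_requires=[
        "mlflow==1.26.0",
        # Cap the direct dependency below its next major release; bump the
        # cap only after testing the new major version for regressions.
        "protobuf>=3.12,<4.0",
    ],
)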
Well, yeah, and that's the problem: effectively any version before the latest is broken. It can be made to work, but only with investigation by the user. Capping dependencies below their next major release seems like a no-brainer. In practice, now that the protobuf 4 update is done, the constraint should be something like protobuf >= 3.12, < 5.0.

@jhallard mlflow 1.26.1 was released last week: https://github.com/mlflow/mlflow/releases/tag/v1.26.1
Given that mlflow==1.26.1 isn't released yet, how do I fix this for the mlflow sagemaker build-and-push-container command? It appears to just install the latest protobuf under the hood while constructing the container, which means it doesn't use my system protobuf.

Upgrading to later versions of mlflow (1.26.0, 1.27.0) and/or downgrading protobuf (3.20.0, 3.20.1…) did not work for me… any other suggestions or fixes for this issue?
With protobuf==3.20.1 it works fine. I guess a quick solution would be to set the protobuf requirement to < 4.0.0 in the setup.py file.

@diego-pm Can you limit protobuf to < 4.0.0 or set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python?
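One way to sanity-check which workaround is actually in effect in a given environment (a sketch for debugging only; api_implementation is an internal protobuf module, not a public API):

import google.protobuf
from google.protobuf.internal import api_implementation

# Expect 3.20.x or lower if the downgrade workaround is in place.
print(google.protobuf.__version__)
# Prints "python" when PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python took effect.
print(api_implementation.Type())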