mlflow: [BUG] log_artifact fails when tracking uri scheme is 'file'

Issues Policy acknowledgement

  • I have read and agree to submit bug reports in accordance with the issues policy

Willingness to contribute

No. I cannot contribute a bug fix at this time.

MLflow version

  • Client: 2.1.1

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 22.04
  • Python version: 3.9.12

Describe the problem

I’m using MLflow on localhost (as described here). When I call log_artifact for a live run, I get the following error: mlflow.exceptions.MlflowException: The configured tracking uri scheme: 'file' is invalid for use with the proxy mlflow-artifact scheme. The allowed tracking schemes are: {'https', 'http'}

Tracking information

MLflow module location: /home/ubuntu/miniconda3/lib/python3.9/site-packages/mlflow/__init__.py
Tracking URI: file:///efs/mlflow/mlruns
Registry URI: file:///efs/mlflow/mlruns
MLflow environment variables: 
  MLFLOW_EXPERIMENT_NAME: object_detection
MLflow dependencies: 
  Flask: 2.2.2
  Jinja2: 3.1.2
  alembic: 1.9.3
  click: 8.1.3
  cloudpickle: 2.2.1
  databricks-cli: 0.17.4
  docker: 6.0.1
  entrypoints: 0.4
  gitpython: 3.1.30
  gunicorn: 20.1.0
  importlib-metadata: 5.0.0
  markdown: 3.4.1
  matplotlib: 3.6.2
  numpy: 1.23.4
  packaging: 21.3
  pandas: 1.5.1
  protobuf: 3.20.3
  pyarrow: 10.0.1
  pytz: 2022.6
  pyyaml: 6.0
  querystring-parser: 1.2.4
  requests: 2.27.1
  scikit-learn: 1.2.1
  scipy: 1.9.3
  shap: 0.41.0
  sqlalchemy: 1.4.46
  sqlparse: 0.4.3

Code to reproduce issue

import os

import mlflow

mlflow_exp_name = 'object_detection'
mlflow.set_tracking_uri('file:///efs/mlflow/mlruns')
os.environ['MLFLOW_EXPERIMENT_NAME'] = mlflow_exp_name

# Create an empty file to log as an artifact
path_html = 'example.html'
with open(path_html, "w") as f:
    f.write('')
# Logging the artifact inside a live run raises the MlflowException
with mlflow.start_run():
    mlflow.log_artifact(path_html)

Stack trace

Traceback (most recent call last):
  File "/efs/demo.py", line 36, in <module>
    mlflow.log_artifact(path_html)
  File "/home/ubuntu/miniconda3/lib/python3.9/site-packages/mlflow/tracking/fluent.py", line 776, in log_artifact
    MlflowClient().log_artifact(run_id, local_path, artifact_path)
  File "/home/ubuntu/miniconda3/lib/python3.9/site-packages/mlflow/tracking/client.py", line 1002, in log_artifact
    self._tracking_client.log_artifact(run_id, local_path, artifact_path)
  File "/home/ubuntu/miniconda3/lib/python3.9/site-packages/mlflow/tracking/_tracking_service/client.py", line 431, in log_artifact
    artifact_repo = self._get_artifact_repo(run_id)
  File "/home/ubuntu/miniconda3/lib/python3.9/site-packages/mlflow/tracking/_tracking_service/client.py", line 416, in _get_artifact_repo
    artifact_repo = get_artifact_repository(artifact_uri)
  File "/home/ubuntu/miniconda3/lib/python3.9/site-packages/mlflow/store/artifact/artifact_repository_registry.py", line 106, in get_artifact_repository
    return _artifact_repository_registry.get_artifact_repository(artifact_uri)
  File "/home/ubuntu/miniconda3/lib/python3.9/site-packages/mlflow/store/artifact/artifact_repository_registry.py", line 72, in get_artifact_repository
    return repository(artifact_uri)
  File "/home/ubuntu/miniconda3/lib/python3.9/site-packages/mlflow/store/artifact/mlflow_artifacts_repo.py", line 46, in __init__
    super().__init__(self.resolve_uri(artifact_uri, get_tracking_uri()))
  File "/home/ubuntu/miniconda3/lib/python3.9/site-packages/mlflow/store/artifact/mlflow_artifacts_repo.py", line 61, in resolve_uri
    _validate_uri_scheme(track_parse.scheme)
  File "/home/ubuntu/miniconda3/lib/python3.9/site-packages/mlflow/store/artifact/mlflow_artifacts_repo.py", line 35, in _validate_uri_scheme
    raise MlflowException(
mlflow.exceptions.MlflowException: The configured tracking uri scheme: 'file' is invalid for use with the proxy mlflow-artifact scheme. The allowed tracking schemes are: {'http', 'https'}

Process finished with exit code 1

About this issue

  • State: open
  • Created a year ago
  • Reactions: 5
  • Comments: 16 (1 by maintainers)

Most upvoted comments

This worked for me. If you are using local files, either of the solutions below should work. Add this before mlflow.start_run() in your code.

import mlflow

# Set the tracking URI to a local directory
mlflow.set_tracking_uri("file:/path/to/your/local/directory")

# OR
mlflow.set_tracking_uri("http://127.0.0.1:5000")

I have a similar problem, with the same stack trace from …/artifact/… . I ran:

$ mlflow gc --backend-store-uri=sqlite:///mlflow.db

and I get the following error:

# mlflow gc --older-than=1h  --backend-store-uri=sqlite:///mlflow.db 
2023/03/20 04:19:31 INFO mlflow.store.db.utils: Creating initial MLflow database tables...
2023/03/20 04:19:31 INFO mlflow.store.db.utils: Updating database tables
INFO  [alembic.runtime.migration] Context impl SQLiteImpl.
INFO  [alembic.runtime.migration] Will assume non-transactional DDL.
INFO  [alembic.runtime.migration] Context impl SQLiteImpl.
INFO  [alembic.runtime.migration] Will assume non-transactional DDL.
Run with ID b27d67938f88468a8430407955b9ebaa has been permanently deleted.
Run with ID 02bd3de0e9a348a1ae7938a7851891ac has been permanently deleted.
Run with ID cc5565e682f14fc89c52291891ccf95b has been permanently deleted.
Run with ID 732d05c7b13b4c08a9bc8715bcd75e58 has been permanently deleted.
Traceback (most recent call last):
  File "/usr/local/bin/mlflow", line 8, in <module>
    sys.exit(cli())
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/mlflow/cli.py", line 573, in gc
    artifact_repo = get_artifact_repository(run.info.artifact_uri)
  File "/usr/local/lib/python3.9/site-packages/mlflow/store/artifact/artifact_repository_registry.py", line 106, in get_artifact_repository
    return _artifact_repository_registry.get_artifact_repository(artifact_uri)
  File "/usr/local/lib/python3.9/site-packages/mlflow/store/artifact/artifact_repository_registry.py", line 72, in get_artifact_repository
    return repository(artifact_uri)
  File "/usr/local/lib/python3.9/site-packages/mlflow/store/artifact/mlflow_artifacts_repo.py", line 45, in __init__
    super().__init__(self.resolve_uri(artifact_uri, get_tracking_uri()))
  File "/usr/local/lib/python3.9/site-packages/mlflow/store/artifact/mlflow_artifacts_repo.py", line 59, in resolve_uri
    _validate_uri_scheme(track_parse.scheme)
  File "/usr/local/lib/python3.9/site-packages/mlflow/store/artifact/mlflow_artifacts_repo.py", line 35, in _validate_uri_scheme
    raise MlflowException(
mlflow.exceptions.MlflowException: The configured tracking uri scheme: 'file' is invalid for use with the proxy mlflow-artifact scheme. The allowed tracking schemes are: {'http', 'https'}

Hi @yossibiton, thank you for raising this issue. The problem appears to be that your MLflow experiment with name object_detection was created using an HTTP request to mlflow server or was created by manually specifying an mlflow-artifacts:// URI as the artifact_location. In order to log artifacts to this experiment, you’ll need to run your mlflow server and set the MLflow Tracking URI to communicate with the MLflow server, e.g. something like mlflow.set_tracking_uri("http://127.0.0.1:5000").

Thank you for using MLflow!
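
The mismatch described above can be illustrated with a small standalone check. This is only a sketch of the rule stated in the error message, not MLflow's actual implementation: the helper name is_compatible, the constant ALLOWED_TRACKING_SCHEMES, and the example artifact URI are hypothetical, and the allowed schemes are copied from the error text.

```python
from urllib.parse import urlparse

# Assumption: taken from the error message, not from MLflow's source code.
ALLOWED_TRACKING_SCHEMES = {"http", "https"}


def is_compatible(artifact_uri: str, tracking_uri: str) -> bool:
    """Return False when a proxied (mlflow-artifacts) artifact location
    is combined with a tracking URI that is not http or https."""
    if urlparse(artifact_uri).scheme != "mlflow-artifacts":
        # Not a proxied artifact location, so this rule does not apply.
        return True
    return urlparse(tracking_uri).scheme in ALLOWED_TRACKING_SCHEMES


# Hypothetical artifact location of an experiment created through an MLflow server.
proxied_location = "mlflow-artifacts:/0/abc/artifacts"

print(is_compatible(proxied_location, "file:///efs/mlflow/mlruns"))
print(is_compatible(proxied_location, "http://127.0.0.1:5000"))
```

If the first check fails for your experiment, the options suggested in this thread are to point the client at the running server over HTTP, or to use an experiment whose artifact location is not a proxied one.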

hey 👋🏻

I was following this article, https://mlflow.org/docs/latest/quickstart_mlops.html, and ran into the same issue. Running export MLFLOW_TRACKING_URI=http://127.0.0.1:5002 resolved the problem.

This issue can be solved by exporting the environment variable MLFLOW_TRACKING_URI! 👍

export MLFLOW_TRACKING_URI=http://127.0.0.1:<PORT NUMBER>
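
When exporting the variable in the shell is inconvenient, the same setting can be applied from Python before any MLflow call is made. A minimal sketch, assuming a server is listening on 127.0.0.1; the port 5000 and the helper name local_tracking_uri are placeholders for illustration, so substitute your own port.

```python
import os


def local_tracking_uri(port: int, host: str = "127.0.0.1") -> str:
    """Build an http tracking URI for an MLflow server running on this machine."""
    return f"http://{host}:{port}"


# Assumption: the server listens on port 5000; replace with your port.
os.environ["MLFLOW_TRACKING_URI"] = local_tracking_uri(5000)
print(os.environ["MLFLOW_TRACKING_URI"])
```

Calling mlflow.set_tracking_uri with the same value, as shown earlier in the thread, has the equivalent effect for the current process.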

Hi everyone, I recently faced the same issue while starting mlflow ui from my local Docker container.

I was following the official MLflow tutorials, from this. The code is as follows:

import mlflow
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor

mlflow.autolog()

db = load_diabetes()
X_train, X_test, y_train, y_test = train_test_split(db.data, db.target)
rf = RandomForestRegressor(n_estimators=100, max_depth=6, max_features=3)
rf.fit(X_train, y_train)
predictions = rf.predict(X_test)

This produces the following output:

(base) root@85d11d68ad08:/workspace/cv_dev/MLOps# python test1.py 
2023/05/06 18:04:31 INFO mlflow.tracking.fluent: Autologging successfully enabled for sklearn.
2023/05/06 18:04:31 INFO mlflow.utils.autologging_utils: Created MLflow autologging run with ID 'e116db062f054f0c9584041c92426f81', which will track hyperparameters, performance metrics, model artifacts, and lineage information for the current sklearn workflow
2023/05/06 18:04:31 WARNING mlflow.utils.autologging_utils: Encountered unexpected error during sklearn autologging: The configured tracking uri scheme: 'file' is invalid for use with the proxy mlflow-artifact scheme. The allowed tracking schemes are: {'https', 'http'}

Important: when I repeat this scenario running mlflow ui from a standard terminal (not a Docker container), everything works fine.

@yossibiton Any updates here? If you’re working on a PR, please link it to this issue.

@dean-sh has the right workaround. Essentially, just make sure the tracking DB points to anything other than its default location, e.g. --backend-store-uri mysql+pymysql://root@localhost/mlflow_tracking_database. Otherwise there seems to be a conflict over that route being used.

Any update? I’m trying to serve an MLflow model using MLServer with mlflow models serve -m runs:/df738f66448047aca9a6a2e8c6982ed9/model --enable-mlserver, and I’m getting the exact same error. My server is running with this config:

mlflow server \
   --backend-store-uri  mysql+pymysql://root@localhost/mlflow_tracking_database \
   --default-artifact-root  file:/./mlruns \
   -h 0.0.0.0 -p 5000