azure-sdk-for-python: [AzureML v2][YAML Components] ModuleNotFoundError when trying to import local modules that are passed in the component

  • Package Name: azure-ai-ml
  • Package Version:
  • Operating System:
  • Python Version: 3.9

Describe the bug

I’m trying to use the YAML component SDK to build AzureML pipelines. My code is structured as follows:

|-- components
|  |- component.yml
|-- src
|  |-- script
|  |  |- script.py
|  |- utils.py
|-- notebooks
|  |- notebook.ipynb

My component.yml looks like:

$schema: https://azuremlschemas.azureedge.net/latest/commandComponent.schema.json
type: command

name: component_name
version: 0.0.1
display_name: XXX
description: XXX
inputs:
  input_a:
    type: uri_folder
outputs:
  output_a:
    type: uri_folder
environment: XXX
code: ../
command: >-
  python src/script/script.py --input_a ${{inputs.input_a}} --output_a ${{outputs.output_a}}

In script.py we have imports of the sort:

from src.utils import func_a

which fails with a ModuleNotFoundError: No module named 'src'
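A possible explanation, not confirmed in this thread: when Python runs a script by path, it puts the script's own directory (here src/script) at the front of sys.path rather than the working directory, so the top-level src package cannot be found. The sketch below shows one workaround under that assumption and the layout above. The helper name add_project_root is hypothetical and is not part of azure-ai-ml.

```python
import sys
from pathlib import Path


def add_project_root(script_file: str, levels_up: int = 2) -> Path:
    """Put the directory `levels_up` levels above the script's folder on sys.path.

    For <root>/src/script/script.py the script's folder is <root>/src/script,
    so two levels up is <root>, the directory that contains src/.
    """
    root = Path(script_file).resolve().parents[levels_up]
    if str(root) not in sys.path:
        # Insert at the front so this copy of the package wins over others.
        sys.path.insert(0, str(root))
    return root


# Intended use at the top of src/script/script.py, before any `from src...` import:
#
#     add_project_root(__file__)
#     from src.utils import func_a
```

Because the helper would live in the script itself, it would not depend on how the job is launched, at the cost of a small amount of path handling in every entry-point script.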

I previously had a notebook (in notebooks/notebook.ipynb) with a working pipeline that submitted a PythonScriptStep like so:

PythonScriptStep(
    name="XXX",
    source_directory="../",
    script_name='src/script/script.py',
    arguments=[...
    ],
    compute_target=cluster,
    runconfig=runconfig,
)

To Reproduce

Steps to reproduce the behavior:

  1. The above description should be enough.

Expected behavior

I expect the YAML component I wrote to work roughly the same way, but the path where Python looks for packages does not seem to be the same. With the PythonScriptStep, using source_directory adds the directory to the PYTHONPATH (I see something like /mnt/azureml/cr/j/e6050824d89840da9dd732469f135109/exe/wd/ in the output of print(sys.path)), but that’s not the case with the code in the YAML component.

The src folder is present in both cases when I do a print(os.listdir('.'))
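Having src in the working directory does not by itself make it importable. What matters is whether that directory appears on sys.path. The sketch below gathers the relevant values in one place so the two runs can be compared side by side. The function name describe_import_context is hypothetical, and the output will differ per environment.

```python
import os
import sys


def describe_import_context() -> dict:
    """Collect the values that decide whether `import src` can succeed."""
    cwd = os.getcwd()
    # An empty sys.path entry means "current directory", so map it to cwd
    # before comparing normalized absolute paths.
    search_paths = [os.path.abspath(p or cwd) for p in sys.path]
    return {
        "cwd": cwd,
        "cwd_entries": sorted(os.listdir(cwd)),
        "sys_path": list(sys.path),
        "cwd_on_sys_path": os.path.abspath(cwd) in search_paths,
        "PYTHONPATH": os.environ.get("PYTHONPATH"),
    }


if __name__ == "__main__":
    for key, value in describe_import_context().items():
        print(f"{key}: {value}")
```

If cwd_on_sys_path is False in the component run but True in the PythonScriptStep run, that would be consistent with the difference described above.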

About this issue

  • State: open
  • Created a year ago
  • Reactions: 1
  • Comments: 16 (3 by maintainers)

Most upvoted comments

Hi @cloga, I’m not using az ml component create; I’m actually using a Python notebook (in notebooks/pipeline.ipynb) to run a pipeline which contains my YAML-defined component.

Inside the notebook, it looks something like this:

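from azure.ai.ml import Input, load_component
from azure.ai.ml.dsl import pipeline

# ml_client is assumed to be an existing, authenticated MLClient
component_step = load_component(path="components/component.yml")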

@pipeline()
def my_pipeline(input_data):
    # Pass the input by keyword, using the input name from component.yml
    my_step = component_step(input_a=input_data)

# Create pipeline job
pipeline_job = my_pipeline(input_data=Input(type="uri_folder", path="path_to_my_data"))

# Submit pipeline job to workspace
pipeline_job = ml_client.jobs.create_or_update(
    pipeline_job, experiment_name="pipeline_samples"
)
# Wait until the job completes
ml_client.jobs.stream(pipeline_job.name) # -> This fails because running the step gives the ModuleNotFoundError