sagemaker-training-toolkit: Entry point package doesn't seem to work with nested directories

Hey there!

I’m having trouble getting my SageMaker TensorFlow code to work after moving my training script to another directory.

Previously, I had the following directory structure:

submit_notebook.ipynb
train.py
setup.py
my_package/
  other modules

This worked with source_dir="." and entry_point="train.py".

I recently moved my training script into my package directory, as follows:

submit_notebook.ipynb
setup.py
src/
  my_package/
    train.py
    other modules

When running estimator.fit with source_dir="." and entry_point="src/my_package/train.py", I get an ImportError: "No module named src/my_package/train".

Higher up in the logs, I spotted: "Invoking script with the following command: /usr/bin/python -m src/my_package/train <some_args>"

Starting from sagemaker-tensorflow-container, I traced this to sagemaker_containers._entry_point_type, which checks for a "setup.py" file and, if one exists, sets the entry point type to PYTHON_PACKAGE.

Later, in sagemaker_containers._process, any user-given entry_point string of type PYTHON_PACKAGE has its .py extension removed.
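To make the behaviour concrete, here is a rough sketch of the two steps as I understand them. The function names (guess_entry_point_type, build_command) and the PYTHON_PROGRAM fallback are made up for illustration; this is not the actual sagemaker_containers code, only an approximation of what the logs suggest.

```python
import os
import tempfile


def guess_entry_point_type(source_dir):
    """Hypothetical stand-in: a setup.py marks the code as a Python package."""
    if os.path.isfile(os.path.join(source_dir, "setup.py")):
        return "PYTHON_PACKAGE"
    return "PYTHON_PROGRAM"  # assumed fallback name, for illustration only


def build_command(source_dir, entry_point, python="/usr/bin/python"):
    """Build the command line roughly the way the logged command looks."""
    if guess_entry_point_type(source_dir) == "PYTHON_PACKAGE":
        # Only the ".py" suffix is dropped; directory separators are kept as-is.
        module = entry_point[:-3] if entry_point.endswith(".py") else entry_point
        return [python, "-m", module]
    return [python, entry_point]


with tempfile.TemporaryDirectory() as source_dir:
    # An empty setup.py is enough to trigger the package code path.
    open(os.path.join(source_dir, "setup.py"), "w").close()
    print(build_command(source_dir, "train.py"))
    # ['/usr/bin/python', '-m', 'train']
    print(build_command(source_dir, "src/my_package/train.py"))
    # ['/usr/bin/python', '-m', 'src/my_package/train']
```

The second command matches the one in the logs: the slashes survive into the argument given to -m.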

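That makes sense if your entry_point is "train.py", but, as shown above, it breaks when the entry point sits in a nested directory: the slashes are passed through to python -m, which expects a dotted module name rather than a file path.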
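For illustration, one possible normalisation is to turn the path into a dotted module name before handing it to python -m. This is only a sketch: entry_point_to_module is a hypothetical helper, not part of sagemaker_containers, and not necessarily what any fix does. Whether the resulting module is importable still depends on how the package is installed and on the working directory.

```python
from pathlib import PurePosixPath


def entry_point_to_module(entry_point):
    """Convert a relative script path into a dotted module name.

    Hypothetical helper for illustration; assumes a POSIX-style relative path.
    """
    path = PurePosixPath(entry_point)
    if path.suffix == ".py":
        path = path.with_suffix("")  # drop only the trailing ".py"
    return ".".join(path.parts)  # "a/b/c" becomes "a.b.c"


print(entry_point_to_module("train.py"))
# train
print(entry_point_to_module("src/my_package/train.py"))
# src.my_package.train
```

With the layout above, the command would then become /usr/bin/python -m src.my_package.train, which is at least a syntactically valid module name.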

I’ll describe my proposed fix in the PR.

About this issue

  • State: open
  • Created 4 years ago
  • Reactions: 1
  • Comments: 18 (2 by maintainers)

Most upvoted comments

+1. The issue is still present; it would be wonderful if you could find time to work on it.