ploomber: "ploomber scaffold" should create missing modules when scaffolding functions

If ploomber scaffold finds a pipeline.yaml, it checks every tasks[*].source and creates a file for each task whose source is missing. For example:

tasks:
    - source: some_module.some_function
      product: output.csv

If some_module.py exists, ploomber scaffold adds a function definition there. However, if it doesn’t, it fails.

Calling the command should create all necessary modules. Note that this could be a nested module. e.g.,

tasks:
    - source: some_module.submodule.function
      product: output.csv

This must create some_module/ (directory), some_module/__init__.py, and some_module/submodule.py, then call the already implemented logic to create a function inside some_module/submodule.py.
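
A minimal sketch of the module-creation step, for illustration only. The helper name create_missing_modules and its root parameter are hypothetical (they are not part of ploomber), and the sketch assumes the last component of the dotted path is the function name and everything before it maps to packages and a module file relative to root. The existing scaffold logic would then add the function definition to the returned file.

```python
from pathlib import Path


def create_missing_modules(dotted_path, root="."):
    """Create packages and the module file needed by a dotted path.

    Hypothetical helper: 'pkg.sub.function' creates pkg/, pkg/__init__.py
    and pkg/sub.py under root. Existing files are left untouched.
    Returns the path to the module file that should hold the function.
    """
    parts = dotted_path.split(".")

    # A valid source needs at least 'module.function'
    if len(parts) < 2 or not all(part.isidentifier() for part in parts):
        raise ValueError(
            f"Expected a dotted path like 'module.function', got {dotted_path!r}"
        )

    # Everything before the last two components is a package (directory),
    # the second-to-last is the module (file); the last is the function name
    *packages, module, _function = parts

    current = Path(root)
    for package in packages:
        current = current / package
        current.mkdir(parents=True, exist_ok=True)
        # touch() does not modify the contents of an existing file
        (current / "__init__.py").touch(exist_ok=True)

    module_file = current / f"{module}.py"
    module_file.touch(exist_ok=True)
    return module_file
```

For some_module.submodule.function this returns some_module/submodule.py; for some_module.some_function it returns some_module.py without creating any directory. The sketch does not handle the case where some_module.py already exists as a file and a some_module/ package is also required.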

loader.create implements the file creation logic: https://github.com/ploomber/ploomber/blob/ac915ea998f4da0177204f2189f8f608f3404fc6/src/ploomber/scaffold/__init__.py#L42

Add tests here: https://github.com/ploomber/ploomber/blob/master/tests/cli/test_scaffold.py

Tasks

  • Implement ‘skip’ logic in DAGSpec and TaskSpec: passing lazy_import='skip' should not perform any dotted path validation. Add tests with lazy_import='skip'
  • Implement ploomber scaffold feature that creates missing modules
  • Rename lazy_import flag (not sure about what would be a good name, maybe import_mode?)

Please open a PR with scaffold-functions as the base branch after finishing any of the three tasks above so we can review it and move to the next one.

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 15 (15 by maintainers)

Most upvoted comments