dbt-core: [CT-285] [Bug] on-run-end hooks execute before on-run-start when the command is `dbt docs generate`

Is there an existing issue for this?

I have searched the existing issues

Current Behavior

For some reason, on-run-end hooks execute before on-run-start hooks when the command is dbt docs generate.

Expected Behavior

Expect on-run-end to execute AFTER on-run-start just as it does with the dbt run command.

Steps To Reproduce

Create a table in Snowflake to record some stuff down.

create or replace table development.dbt_jyeo.my_test (
    command string, which_hook string, insert_at timestamp
);

Add this macro to your project.

{% macro do_thing(hook) %}

{{ log('Running during ' ~ hook, True) }}

{% set query %}
  insert into development.dbt_jyeo.my_test values (
    '{{ flags.WHICH }}', '{{ hook }}', current_timestamp()
  )
{% endset %}

{% do run_query(query) %}

{% endmacro %}

Use the macro in our hooks.

# dbt_project.yml
name: "snowflake"
version: "1.0.0"
config-version: 2

profile: "snowflake"

model-paths: ["models"]
macro-paths: ["macros"]
snapshot-paths: ["snapshots"]

target-path: "target"
clean-targets:
  - "target"
  - "dbt_modules"
  - "dbt_packages"

models:
  snowflake:
    +materialized: table

on-run-start: "{{ do_thing('start') }}"

on-run-end: "{{ do_thing('end') }}"

Do a run to see what happens (I have a simple one model project and the threads are set to 1).

dbt run

Expand output

03:40:44  Running with dbt=1.0.2
03:40:44  Running during start
03:40:44  Running during end
03:40:45  Found 1 model, 0 tests, 0 snapshots, 0 analyses, 181 macros, 2 operations, 0 seed files, 0 sources, 0 exposures, 0 metrics
03:40:45  
03:40:50  
03:40:50  Running 1 on-run-start hook
03:40:50  Running during start
03:40:53  1 of 1 START hook: snowflake.on-run-start.0..................................... [RUN]
03:40:53  1 of 1 OK hook: snowflake.on-run-start.0........................................ [OK in 0.00s]
03:40:53  
03:40:53  Concurrency: 1 threads (target='dev')
03:40:53  
03:40:53  1 of 1 START table model dbt_jyeo.my_item....................................... [RUN]
03:40:57  1 of 1 OK created table model dbt_jyeo.my_item.................................. [SUCCESS 1 in 3.42s]
03:40:57  
03:40:57  Running 1 on-run-end hook
03:40:57  Running during end
03:41:01  1 of 1 START hook: snowflake.on-run-end.0....................................... [RUN]
03:41:01  1 of 1 OK hook: snowflake.on-run-end.0.......................................... [OK in 0.00s]
03:41:01  
03:41:01  
03:41:01  Finished running 1 table model, 2 hooks in 16.71s.
03:41:01  
03:41:01  Completed successfully
03:41:01  
03:41:01  Done. PASS=1 WARN=0 ERROR=0 SKIP=0 TOTAL=1

Do a docs generate (you can repeat this a handful of times too).

dbt docs generate

Expand output

03:42:34  Running with dbt=1.0.2
03:42:35  Running during start
03:42:35  Running during end
03:42:35  Found 1 model, 0 tests, 0 snapshots, 0 analyses, 181 macros, 2 operations, 0 seed files, 0 sources, 0 exposures, 0 metrics
03:42:35  
03:42:38  Concurrency: 1 threads (target='dev')
03:42:38  
03:42:38  Running during end
03:42:41  Running during start
03:42:44  Done.
03:42:44  Building catalog
03:42:48  Catalog written to /Users/jeremy/src/dbt-sandcastles/dbt/snowflake/target/catalog.json

Query the table we created earlier.

Relevant log output

No response

Environment

- OS: macOS 12
- Python: 3.9.7
- dbt: 1.0.2

What database are you using dbt with?

No response

Additional Context

I believe this is likely different to the chain of issues starting at #3530 that’s ultimately resulted in “users can disable hooks from running by using flags.WHICH” - this issue is more about the sequence of operations being weird / reversed. I thought it might be because of multiple threads but have explicitly set threads to be 1 in the reproduction above.

Have the debug logs but they exceed the max character limit of a GitHub comment.

About this issue

Original URL
State: closed
Created 2 years ago
Comments: 15 (13 by maintainers)

Most upvoted comments

@jtcohen6 / @iknox-fa I think we should at minimum do as per @jtcohen6 suggestion:

Or, much simpler: We could raise a warning if any on-run-* hook is defined and doesn’t resolve to a meaningful string. (We expect all hooks to render to string-SQL, which dbt can execute at the right time.)

Raise warnings if:

on-run-* hooks don’t return meaningful strings.
on-run-* hooks have macros in them with calls to run_query ? Warn that they may be executed at a time when you may not expect them to (e.g. during the compile phase).
Non on-run-* hooks with macros that return simple strings (since they won’t be executed - but the user may think it does - given the nature of on-run-* hooks being able to execute strings) ?

Just trying to think of all the scenarios where a warning can be helpful.

jeremyyeo on Apr 15, 2022