dbt-core: [CT-3316] [Bug] Invalid JSON schema for manifest v11

Is this a new bug in dbt-core?

  • I believe this is a new bug in dbt-core
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

Manifest v11 JSON schema has been released with dbt 1.7.0. However, it’s not a valid JSON schema:

{
  "$ref": "#/$defs/WritableManifest",
  "$defs": {
    "ManifestMetadata": {
      "type": "object",
      "title": "ManifestMetadata",
      ...
  }
}

Seems like $defs.ManifestMetadata contains the actual JSON schema that’s supposed to be published.

Expected Behavior

A valid JSON schema to be published.

Steps To Reproduce

  1. Copy https://schemas.getdbt.com/dbt/manifest/v11.json
  2. Paste it into a JSON schema validator, e.g. https://www.jsonschemavalidator.net/

Relevant log output

No response

Environment

- OS:
- Python:
- dbt:

Which database adapter are you using with dbt?

No response

Additional Context

No response

About this issue

  • Original URL
  • State: closed
  • Created 8 months ago
  • Comments: 15 (8 by maintainers)

Most upvoted comments

@mars-lan thank you for surfacing this and thank you to Metaphor Data for sponsoring Coalesce! I hope the conference went well for you all. Wanted to update you and let you know that engineers are looking into this and will update shortly.

What doesn’t work is putting it in the left hand box of https://www.jsonschemavalidator.net/ to show that it can be used as a custom schema to validate json.

Just wanted to break in and say that https://www.jsonschemavalidator.net/ doesn’t support JSON Schema 2020-12 and seems not to be the best validator. The v11.json schema looks good in the eyes of these validators:

All of them support 2022-12 dialect and are listed on the official list of the online JSON Schema validators: https://json-schema.org/implementations#web-(online).

Also I’d like to note that using “$ref” and “$defs” keywords together is valid, as can be seen from this example:

@mars-lan Just to confirm, are you saying that v11 manifest currently hosted is working for you or that you pulled the v12 manifest from the PR and that is working for you?

The latest published v11 is working. I haven’t tested v12 yet, and doesn’t seem like there’s one published either: https://schemas.getdbt.com/dbt/manifest/v12/index.html

Confirmed that the latest JSON schema is passing. Feel free to close this now.

I synced with @dbeatty10. Although the v11.json manifest artifact can be used to produce models with the datamodel-code-generator. What is currently broken is using the artifact as custom schema to validate input json (which it used to do). In my demo, I put the schema in the right hand box on https://www.jsonschemavalidator.net/ which verified that it was valid jsonschema. What doesn’t work is putting it in the left hand box of https://www.jsonschemavalidator.net/ to show that it can be used as a custom schema to validate json.

@QMalcolm I was able to reproduce with catalog/v1.json. The original poster reported issues with the schema for manifest.json, but I had a small catalog.json handy, and it exhibits the same underlying issue with https://www.jsonschemavalidator.net/.

Reprex

  1. Copy the contents of https://schemas.getdbt.com/dbt/catalog/v1.json (or here) into the Select schema text area on the left side of https://www.jsonschemavalidator.net/
    • Select “Custom” from the drop-down
  2. Copy the following contents into the Input JSON text area on the right side
    {
      "metadata": {
        "dbt_schema_version": "https://schemas.getdbt.com/dbt/catalog/v1.json",
        "dbt_version": "1.7.0b2",
        "generated_at": "2023-10-07T13:34:04.627535Z",
        "invocation_id": "c09b4da4-35c5-418d-8a4b-447d2925f075",
        "env": {}
      },
      "nodes": {},
      "sources": {},
      "errors": null
    }
    
  3. You should see an error like the following: image

But if you replace the Select schema with the contents from here, it should validate:

image

@mars-lan I am currently working on this

@mars-lan Thanks for opening this issue. We have reproduced the behavior you discovered using https://www.jsonschemavalidator.net/, and we find it concerning, but we are curious whether you have encountered problems with the schema with any other tools, or other parts of your workflow. In other validators we have tried, and with the libraries we use internally for testing, the schema performed as we expected.

We’ll follow this up regardless, but knowing the above would help us better understand the scope of the issue.

We used https://github.com/koxudaxi/datamodel-code-generator to generate Python classes from the JSON schema and it also had issues parsing the schema.