great_expectations: batch_request passed in to SimpleCheckpoint errors
Describe the bug
When creating a SimpleCheckpoint and passing in a batch_request of type BatchRequest, the checkpoint seems to treat the batch_request as a dictionary instead of a BatchRequest and fails with AttributeError: 'BatchRequest' object has no attribute 'items'.
To Reproduce
Steps to reproduce the behavior:
import great_expectations as ge
from great_expectations.core.batch import RuntimeBatchRequest
from great_expectations.cli.datasource import sanitize_yaml_and_save_datasource
from great_expectations.checkpoint import SimpleCheckpoint
import pandas as pd

context = ge.get_context()
df = pd.DataFrame({"col1": ["a", "a", "b", "c"], "col2": [1, 2, 3, 4]})
config = """
name: my_pandas_datasource
class_name: Datasource
execution_engine:
  class_name: PandasExecutionEngine
data_connectors:
  my_runtime_data_connector:
    class_name: RuntimeDataConnector
    batch_identifiers:
      - some_batch_identifier_so_this_can_work
"""
context.test_yaml_config(
    yaml_config=config
)
sanitize_yaml_and_save_datasource(context, config, overwrite_existing=True)
runtime_batch_request = RuntimeBatchRequest(
    datasource_name="my_pandas_datasource",
    data_connector_name="my_runtime_data_connector",
    data_asset_name="insert_your_data_asset_name_here",
    runtime_parameters={
        "batch_data": df
    },
    batch_identifiers={
        "some_batch_identifier_so_this_can_work": "blah",
    }
)
my_checkpoint = SimpleCheckpoint(
    name="my_checkpoint",
    data_context=context,
    batch_request = runtime_batch_request
)
>>
AttributeError Traceback (most recent call last)
<ipython-input-1-b72466658554> in <module>
43 name="my_checkpoint",
44 data_context=context,
---> 45 batch_request = runtime_batch_request
46 )
~/.pyenv/versions/anaconda3-2020.02/envs/sandbox/lib/python3.7/site-packages/great_expectations/checkpoint/checkpoint.py in __init__(self, name, data_context, config_version, template_name, module_name, class_name, run_name_template, expectation_suite_name, batch_request, action_list, evaluation_parameters, runtime_configuration, validations, profilers, validation_operator_name, batches, site_names, slack_webhook, notify_on, notify_with, **kwargs)
672 slack_webhook=slack_webhook,
673 notify_on=notify_on,
--> 674 notify_with=notify_with,
675 ).build()
676
~/.pyenv/versions/anaconda3-2020.02/envs/sandbox/lib/python3.7/site-packages/great_expectations/checkpoint/configurator.py in build(self)
112 self._validate_slack_configuration()
113
--> 114 return self._build_checkpoint_config()
115
116 def _build_checkpoint_config(self) -> CheckpointConfig:
~/.pyenv/versions/anaconda3-2020.02/envs/sandbox/lib/python3.7/site-packages/great_expectations/checkpoint/configurator.py in _build_checkpoint_config(self)
134 "config_version": self.other_kwargs.pop("config_version", 1.0)
135 or 1.0,
--> 136 **self.other_kwargs,
137 }
138 )
~/.pyenv/versions/anaconda3-2020.02/envs/sandbox/lib/python3.7/site-packages/great_expectations/data_context/types/base.py in update(self, other_config, runtime_kwargs)
1743 updated_batch_request = nested_update(
1744 batch_request,
-> 1745 other_batch_request,
1746 )
1747 self._batch_request = updated_batch_request
~/.pyenv/versions/anaconda3-2020.02/envs/sandbox/lib/python3.7/site-packages/great_expectations/core/util.py in nested_update(d, u, dedup)
69 ):
70 """update d with items from u, recursively and joining elements"""
---> 71 for k, v in u.items():
72 if isinstance(v, Mapping):
73 d[k] = nested_update(d.get(k, {}), v, dedup=dedup)
AttributeError: 'RuntimeBatchRequest' object has no attribute 'items'
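The last frame of the traceback can be reproduced without Great Expectations installed. The following is a minimal sketch, not the library's code: `nested_update` here is a simplified stand-in modeled only on the three lines visible in the traceback, and `FakeBatchRequest` is a hypothetical class standing in for any object that is not a dictionary.

```python
from collections.abc import Mapping


def nested_update(d, u):
    """Simplified stand-in for the helper in the traceback: merge u into d recursively."""
    for k, v in u.items():  # assumes u is a Mapping; this is the call that raises
        if isinstance(v, Mapping):
            d[k] = nested_update(d.get(k, {}), v)
        else:
            d[k] = v
    return d


class FakeBatchRequest:
    """Hypothetical stand-in for a batch request object; it is not a Mapping."""

    def __init__(self, datasource_name):
        self.datasource_name = datasource_name


# Merging two dictionaries works as expected.
merged = nested_update({"a": {"x": 1}}, {"a": {"y": 2}})

# Passing a plain object instead of a dictionary fails on .items().
try:
    nested_update({}, FakeBatchRequest("my_pandas_datasource"))
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(merged)
print(error_message)
```

The second call fails for the same reason as the report: the merge helper expects a dictionary-like argument, but it receives an object.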
Expected behavior
I expected the above to work since I’m passing in batch_request: BatchRequest
Environment (please complete the following information):
- Operating System: MacOS
- Great Expectations Version: 0.13.17
Additional context
None
About this issue
- State: closed
- Created 3 years ago
- Comments: 16 (7 by maintainers)
@cdkini I’m very new to Great Expectations, and I want to work with a custom query using BigQuery. I tried to follow the code from this guide: How to load a database table, view, or query result as a batch
I am facing this error:
Here is the Python script:
I could not find anything about this on Stack Overflow or in the GitHub issues. Could you help?
Thanks
Hey @lyra-victor @geertjan-garvis @rr-chiranjeevi-ds thanks so much for pointing this out!
I apologize for the delay; the core team has been focusing on a few other features so this fell into our backlog. @bhcastleton put this on my radar earlier today and I’ll definitely be making it a priority moving forward.
It looks like there’s a mismatch of types when we get to the nested_update function so I’ll have to do a bit of digging to see why that is. To be fully transparent, I am a bit new to this part of the codebase but I’ll keep this thread updated with any findings.
Thanks again for your patience 🙏🏽
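One general way to guard against this kind of type mismatch is to normalize both arguments to plain dictionaries before merging. The sketch below only illustrates that idea and is not necessarily the approach taken in the library: `to_plain_dict`, `safe_nested_update`, and `FakeBatchRequest` are hypothetical names, and treating public instance attributes as the fields to merge is an assumption.

```python
from collections.abc import Mapping


def to_plain_dict(obj):
    """Hypothetical helper: return a dict copy of obj, or raise a clear TypeError."""
    if obj is None:
        return {}
    if isinstance(obj, Mapping):
        return dict(obj)
    if hasattr(obj, "__dict__"):
        # Assumption: public instance attributes are the fields to merge.
        return {k: v for k, v in vars(obj).items() if not k.startswith("_")}
    raise TypeError(f"cannot merge object of type {type(obj).__name__}")


def safe_nested_update(d, u):
    """Recursively merge u into a copy of d after normalizing both to dicts."""
    result = to_plain_dict(d)
    for key, value in to_plain_dict(u).items():
        if isinstance(value, Mapping):
            existing = result.get(key)
            base = existing if isinstance(existing, Mapping) else {}
            result[key] = safe_nested_update(base, value)
        else:
            result[key] = value
    return result


class FakeBatchRequest:
    """Hypothetical stand-in for a batch request object."""

    def __init__(self, datasource_name, data_asset_name):
        self.datasource_name = datasource_name
        self.data_asset_name = data_asset_name


defaults = {"datasource_name": "default_source", "limit": 10}
merged = safe_nested_update(
    defaults, FakeBatchRequest("my_pandas_datasource", "my_asset")
)
print(merged)
```

Raising a `TypeError` for values that cannot be converted makes the failure explicit at the boundary, instead of surfacing later as an `AttributeError` inside the merge loop.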
A PR has been issued and is going through the review process. You can see the changes made at #3152 and make any comments/suggestions there if you wish!
We’ll hopefully have this merged and ready to use shortly.
@lyra-victor Thank you for reporting. We will post here once we have looked into this issue more deeply.