chispa: The it_does_not_throw_with_different_schema test exposes a bug
This test shouldn’t be passing:
def it_does_not_throw_with_different_schema():
    data1 = [(1.0, "jose"), (1.1, "li"), (1.2, "laura"), (None, None)]
    df1 = spark.createDataFrame(data1, ["num", "expected_name"])
    data2 = [("li", 1.05), ("laura", 1.2), (None, None), ("jose", 1.0)]
    df2 = spark.createDataFrame(data2, ["another_name", "same_num"])
    assert_approx_df_equality(df1, df2, 0.1, ignore_schema=True)
ignore_row_order isn’t set to True, so row order still matters and this shouldn’t be passing.
This happens because d1.keys() & d2.keys() returns an empty set when the column names are different. None of the conditions are actually checked, so the comparison returns True.
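To make the failure mode concrete, here is a minimal sketch in plain Python, with no Spark involved. Both function names and the row dicts are hypothetical stand-ins written for illustration, not chispa's actual code; the only thing taken from the issue is the d1.keys() & d2.keys() intersection. The positional comparison at the end is just one possible direction for a fix, not necessarily what the maintainers chose.

```python
def approx_equal_by_shared_keys(d1, d2, precision):
    """Hypothetical stand-in for the check described above (not chispa's code)."""
    # Only keys present in BOTH dicts are compared.
    for key in d1.keys() & d2.keys():
        v1, v2 = d1[key], d2[key]
        if isinstance(v1, float) and isinstance(v2, float):
            if abs(v1 - v2) > precision:
                return False
        elif v1 != v2:
            return False
    # With no shared keys the loop body never runs, so we land here.
    return True


def approx_equal_by_position(d1, d2, precision):
    """Assumed alternative: ignore column names and compare values in order."""
    if len(d1) != len(d2):
        return False
    for v1, v2 in zip(d1.values(), d2.values()):
        if isinstance(v1, float) and isinstance(v2, float):
            if abs(v1 - v2) > precision:
                return False
        elif v1 != v2:
            return False
    return True


# First rows of df1 and df2 from the test, written as dicts.
row1 = {"num": 1.0, "expected_name": "jose"}
row2 = {"another_name": "li", "same_num": 1.05}

shared = row1.keys() & row2.keys()
vacuous = approx_equal_by_shared_keys(row1, row2, 0.1)
positional = approx_equal_by_position(row1, row2, 0.1)

print(shared)      # set()
print(vacuous)     # True
print(positional)  # False
```

The key-intersection version passes vacuously because it has nothing to compare, while the positional version fails on these rows as expected.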
About this issue
- State: closed
- Created a year ago
- Comments: 17 (12 by maintainers)
I can work on it! 😃