ibis: bug: error on `pivot_longer` of more than one category, using Polars backend

What happened?

When pivoting a table with more than one value in the ‘retained’ (category) column, the Polars backend throws the error below, while the DuckDB backend works fine. The same code works with the Polars backend if the retained column holds only one value.

Code to reproduce:

"""Polars backend throws an error if more than one category is pivoted."""

import ibis
from ibis import _
import ibis.expr.selectors as s
import pandas as pd

data = {
    "product": ["apples", "oranges"],
    "price": [4, 7],
    "qty": [42, 84]
}

df = pd.DataFrame(data)

conn_duck = ibis.duckdb.connect()
conn_polars = ibis.polars.connect()

conn_duck.register(df, table_name="fruit")
conn_polars.register(df, table_name="fruit")


def pivot_duckdb():
    """Works fine for two categories."""
    t = conn_duck.table("fruit")
    return t.pivot_longer(~s.c("product"))


def pivot_polars_error():
    """Raises ComputeError when the table holds more than one category."""
    t = conn_polars.table("fruit")
    return t.pivot_longer(~s.c("product"))


def pivot_polars_ok():
    """Works fine if limited to a single category."""
    t = conn_polars.table("fruit").filter(_["product"] == "apples")
    return t.pivot_longer(~s.c("product"))


if __name__ == '__main__':
    print(pivot_duckdb().execute())  # works OK, two categories
    print(pivot_polars_ok().execute())  # works OK, single category
    print(pivot_polars_error().execute())  # throws error for more than one category

What version of ibis are you using?

5.0.0

What backend(s) are you using, if any?

DuckDB, Polars

Relevant log output

Traceback (most recent call last):
  File "/Users/redacted/PycharmProjects/ibis-demo/bugs/polars_pivot.py", line 41, in <module>
    print(pivot_polars_error().execute())
  File "/Users/redacted/.pyenv/versions/ibis-experiments/lib/python3.10/site-packages/ibis/expr/types/core.py", line 303, in execute
    return self._find_backend(use_default=True).execute(
  File "/Users/redacted/.pyenv/versions/ibis-experiments/lib/python3.10/site-packages/ibis/backends/polars/__init__.py", line 328, in execute
    df = lf.collect()
  File "/Users/redacted/.pyenv/versions/ibis-experiments/lib/python3.10/site-packages/polars/lazyframe/frame.py", line 1443, in collect
    return pli.wrap_df(ldf.collect())
exceptions.ComputeError: series length 2 doesn't match the dataframe height of 4
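
For reference, the expected result has four rows: each of the two input rows contributes one output row per pivoted column (`price`, `qty`). The sketch below computes that result in plain Python, without ibis or Polars, so it only illustrates the intended reshaping. `pivot_longer_rows` is a hypothetical helper, and the output column names `name` and `value` are assumed to match the `pivot_longer` defaults.

```python
def pivot_longer_rows(rows, id_cols, names_to="name", values_to="value"):
    """Reshape wide rows (dicts) into long rows: one output row per pivoted column.

    Hypothetical helper for illustration only; not part of ibis or Polars.
    """
    long_rows = []
    for row in rows:
        # Columns that are kept as-is on every output row.
        ids = {col: row[col] for col in id_cols}
        for col, val in row.items():
            if col in id_cols:
                continue
            # Each pivoted column becomes its own (name, value) row.
            long_rows.append({**ids, names_to: col, values_to: val})
    return long_rows


# Same data as the reproduction script above.
fruit = [
    {"product": "apples", "price": 4, "qty": 42},
    {"product": "oranges", "price": 7, "qty": 84},
]

long_fruit = pivot_longer_rows(fruit, id_cols=["product"])
for r in long_fruit:
    print(r)
```

The height of 4 in the error message matches this row count, while the series length of 2 matches the height of the original table. That is consistent with (though it does not prove) a column being attached to the reshaped frame without being repeated for each pivoted column.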

About this issue

  • State: closed
  • Created a year ago
  • Comments: 26 (15 by maintainers)

Most upvoted comments

Ok, awesome! I think this gives me enough to finally be able to address this issue:

In [21]: import polars as pl

In [22]: df = pl.read_csv('/home/cloud/.cache/ibis-framework/6.0.0/world_bank_pop.csv.gz')

In [23]: df.with_columns(year=pl.Series([["2000", "2001"]]), metric=pl.col("x2000").reshape((-1, 1)).list.concat(pl.col("x2001").reshape((-1, 1)))).explode("year", "metric")
Out[23]:
shape: (2_112, 22)
┌─────────┬─────────────┬─────────────┬─────────────┬───┬─────────────┬─────────────┬──────┬─────────────┐
│ country ┆ indicator   ┆ x2000       ┆ x2001       ┆ … ┆ x2016       ┆ x2017       ┆ year ┆ metric      │
│ ---     ┆ ---         ┆ ---         ┆ ---         ┆   ┆ ---         ┆ ---         ┆ ---  ┆ ---         │
│ str     ┆ str         ┆ f64         ┆ f64         ┆   ┆ f64         ┆ f64         ┆ str  ┆ f64         │
╞═════════╪═════════════╪═════════════╪═════════════╪═══╪═════════════╪═════════════╪══════╪═════════════╡
│ ABW     ┆ SP.URB.TOTL ┆ 42444.0     ┆ 43048.0     ┆ … ┆ 45275.0     ┆ 45572.0     ┆ 2000 ┆ 42444.0     │
│ ABW     ┆ SP.URB.TOTL ┆ 42444.0     ┆ 43048.0     ┆ … ┆ 45275.0     ┆ 45572.0     ┆ 2001 ┆ 43048.0     │
│ ABW     ┆ SP.URB.GROW ┆ 1.182632    ┆ 1.413021    ┆ … ┆ 0.655929    ┆ 0.653849    ┆ 2000 ┆ 1.182632    │
│ ABW     ┆ SP.URB.GROW ┆ 1.182632    ┆ 1.413021    ┆ … ┆ 0.655929    ┆ 0.653849    ┆ 2001 ┆ 1.413021    │
│ …       ┆ …           ┆ …           ┆ …           ┆ … ┆ …           ┆ …           ┆ …    ┆ …           │
│ ZWE     ┆ SP.POP.TOTL ┆ 1.2222251e7 ┆ 1.2366165e7 ┆ … ┆ 1.6150362e7 ┆ 1.6529904e7 ┆ 2000 ┆ 1.2222251e7 │
│ ZWE     ┆ SP.POP.TOTL ┆ 1.2222251e7 ┆ 1.2366165e7 ┆ … ┆ 1.6150362e7 ┆ 1.6529904e7 ┆ 2001 ┆ 1.2366165e7 │
│ ZWE     ┆ SP.POP.GROW ┆ 1.298782    ┆ 1.170597    ┆ … ┆ 2.33607     ┆ 2.322864    ┆ 2000 ┆ 1.298782    │
│ ZWE     ┆ SP.POP.GROW ┆ 1.298782    ┆ 1.170597    ┆ … ┆ 2.33607     ┆ 2.322864    ┆ 2001 ┆ 1.170597    │
└─────────┴─────────────┴─────────────┴─────────────┴───┴─────────────┴─────────────┴──────┴─────────────┘

Will report back if I still need some more help!
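
The snippet above can be read as two steps: give every row a list of names and a parallel list of values, then explode both lists together. A minimal plain-Python sketch of that idea follows. The helper names `with_lists` and `explode_pair` are hypothetical, the sample values are copied from the output shown above, and this is not claimed to be the code that went into the ibis Polars backend.

```python
def with_lists(rows, value_cols, names, names_to="year", values_to="metric"):
    """Attach a list of names and a parallel list of values to every row.

    Hypothetical helper for illustration only.
    """
    out = []
    for row in rows:
        new_row = dict(row)
        new_row[names_to] = list(names)  # same names on every row
        new_row[values_to] = [row[col] for col in value_cols]  # this row's values
        out.append(new_row)
    return out


def explode_pair(rows, a, b):
    """Explode two equal-length list columns in parallel, repeating the other columns."""
    out = []
    for row in rows:
        for x, y in zip(row[a], row[b]):
            out.append({**row, a: x, b: y})
    return out


# Two rows taken from the world_bank_pop output above, restricted to two year columns.
wide = [
    {"country": "ABW", "indicator": "SP.URB.TOTL", "x2000": 42444.0, "x2001": 43048.0},
    {"country": "ABW", "indicator": "SP.URB.GROW", "x2000": 1.182632, "x2001": 1.413021},
]

listed = with_lists(wide, value_cols=["x2000", "x2001"], names=["2000", "2001"])
long_rows = explode_pair(listed, "year", "metric")
for r in long_rows:
    print(r)
```

As in the Polars output above, the original wide columns are still present on every exploded row, so a real implementation would presumably drop them afterwards.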