vega-lite: Cannot create stacked bar chart without aggregate

I cannot seem to create a stacked chart with data that I have already pre-aggregated outside of vega-lite. I don’t know if this is by design or a bug. All examples of stacked bar/area charts I can find use some kind of aggregate in the y channel.

But the docs don’t mention this. From the docs on Stacked Bar Chart:

"Adding color to the bar chart (by using the color attribute) creates a stacked bar chart by default."

Here is a simple spec with data that I would expect to create a stacked bar chart:

{
  "width": 300,
  "height": 300,
  "data": {
    "values": [
      {
        "dayOfWeek": 0,
        "payment_type": "Credit Card",
        "tips": 53
      },
      {
        "dayOfWeek": 0,
        "payment_type": "Cash",
        "tips": 93
      },
      {
        "dayOfWeek": 0,
        "payment_type": "No Charge",
        "tips": 4
      },
      {
        "dayOfWeek": 1,
        "payment_type": "Credit Card",
        "tips": 48
      },
      {
        "dayOfWeek": 1,
        "payment_type": "Cash",
        "tips": 71
      },
      {
        "dayOfWeek": 1,
        "payment_type": "No Charge",
        "tips": 1
      },
      {
        "dayOfWeek": 2,
        "payment_type": "Credit Card",
        "tips": 49
      },
      {
        "dayOfWeek": 2,
        "payment_type": "Cash",
        "tips": 77
      },
      {
        "dayOfWeek": 2,
        "payment_type": "No Charge",
        "tips": 1
      }
    ]
  },
  "mark": "bar",
  "encoding": {
    "x": {
      "field": "dayOfWeek",
      "type": "nominal"
    },
    "y": {
      "field": "tips",
      "type": "quantitative"
    },
    "color": {
      "field": "payment_type",
      "type": "nominal"
    }
  }
}

I’m getting this chart:

[Screenshot of the resulting chart, 2017-06-28 17:16:14]

However, if I add "aggregate": "sum" to the y channel (which is a no-op, as every combination of payment_type and dayOfWeek is unique), it shows the expected result:

[Screenshot of the stacked chart, 2017-06-28 17:18:12]

This seems to be an easy enough workaround, but I’d like to check if this is by design, in which case it would be great to document this behavior, e.g. “for stacked charts, you always need an aggregate”.
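
For reference, here is a sketch of the same spec with the workaround applied. The data rows are only written more compactly; the sole change is the added "aggregate": "sum" on the y channel. Whether the aggregate is still required may depend on the vega-lite version in use.

```json
{
  "width": 300,
  "height": 300,
  "data": {
    "values": [
      {"dayOfWeek": 0, "payment_type": "Credit Card", "tips": 53},
      {"dayOfWeek": 0, "payment_type": "Cash", "tips": 93},
      {"dayOfWeek": 0, "payment_type": "No Charge", "tips": 4},
      {"dayOfWeek": 1, "payment_type": "Credit Card", "tips": 48},
      {"dayOfWeek": 1, "payment_type": "Cash", "tips": 71},
      {"dayOfWeek": 1, "payment_type": "No Charge", "tips": 1},
      {"dayOfWeek": 2, "payment_type": "Credit Card", "tips": 49},
      {"dayOfWeek": 2, "payment_type": "Cash", "tips": 77},
      {"dayOfWeek": 2, "payment_type": "No Charge", "tips": 1}
    ]
  },
  "mark": "bar",
  "encoding": {
    "x": {"field": "dayOfWeek", "type": "nominal"},
    "y": {"field": "tips", "type": "quantitative", "aggregate": "sum"},
    "color": {"field": "payment_type", "type": "nominal"}
  }
}
```

Because each (dayOfWeek, payment_type) pair occurs exactly once, the sum leaves every value unchanged and only serves to trigger stacking.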

About this issue

  • State: closed
  • Created 7 years ago
  • Reactions: 1
  • Comments: 15 (11 by maintainers)

Most upvoted comments

Yes, as discussed above, this is fixed in #3070. If you look at #3070, you will see that we added a stacked_bar_unaggregate example in the PR.

Preaggregated and prebinned data will be my focus after the 2.0 release. I agree that it’s super important for scalability.

On Sun, Oct 1, 2017, 10:42, Tom Crockett wrote:

Our company also preaggregates the data in a database, I think this might be quite a common pattern in industrial-scale situations.


Thanks for the quick answer. I can use that workaround for now and manually override the axis label to get rid of the SUM(...). FYI, this is for MongoDB Charts: we are planning to do all aggregations directly on the database server (via the aggregation framework) for performance reasons and just pass the results into vega-lite, which is why the data is pre-aggregated. 😃
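
A minimal sketch of what the y channel might look like with both the sum workaround and an overridden axis label. This is the value of "y" inside "encoding"; the title text "tips" is just an assumed choice, and it relies on the axis "title" property replacing the auto-generated SUM(...) label.

```json
{
  "field": "tips",
  "type": "quantitative",
  "aggregate": "sum",
  "axis": {"title": "tips"}
}
```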