influxdb: GROUP BY SLIMIT and SOFFSET returns unpredictable results
After working through a support case with Sean Beckett, he asked me to open up a case in github. We have not been able to come up with a simple replication, my measurement isn’t that complicated.
Seeing this problem with both 12.2 and 13.
Basically we have a large amount of data being returned, we encountered the row limit via http of 10K rows. So we started to use the SLIMIT and SOFFSET to page through the data. The results when using SLIMT and SOFFSET are all over the place and are completely unpredictable. The same behavior was noticed with the CLI client.
We are inserting via influx-java client with a non-default retention policy. I do not believe this is related to the “insert into” issue reported in https://github.com/influxdata/influxdb/issues/6711 as we are not using the “into”.
OK, here are the details… The basic group by appears to work fine, it always returns data, but when you add the SLIMIT AND SOFFSET, the results are all over the place.
I am seeing some really strange behavior when using the time > now() - 15m with the SLIMIT AND SOFFSET.
Without the SLIMIT and SOFFSET I get around 45 deveui values returned in this last 15 minutes.
> select seqno from evt_rp.evt where time > now() - 15m group by deveui
name: evt
tags: deveui=AABBCC0020002199
time seqno
---- -----
1464108400539165696 3290
name: evt
tags: deveui=AABBCC00200022F8
time seqno
---- -----
1464108614692312320 5665
name: evt
tags: deveui=AABBCC002000232B
time seqno
---- -----
1464108507834923520 5300
name: evt
tags: deveui=AABBCC0020003F54
time seqno
---- -----
1464108787483994880 490
name: evt
tags: deveui=AABBCC002000401B
time seqno
---- -----
1464108160006276608 4706
name: evt
tags: deveui=AABBCC00200042AE
time seqno
---- -----
1464108166697809408 4706
name: evt
tags: deveui=AABBCC00200045C8
time seqno
---- -----
1464108113667539712 3022
name: evt
tags: deveui=AABBCC00200048B8
time seqno
---- -----
1464108446000000000 2304
(truncated)
OK good, got data, now run with a SLMIT and SOFFSET. Only the 3’rd query returns data. How can I have data at SLIMIT 10 and SOFFSET 10, but not at SLIMIT 1 SOFFSET 1??? And why only 3 devices returned when I know there are ~45 in this time period???
> select seqno from evt_rp.evt where time > now() - 15m group by deveui SLIMIT 1 SOFFSET 1
> select seqno from evt_rp.evt where time > now() - 15m group by deveui SLIMIT 10 SOFFSET 1
> select seqno from evt_rp.evt where time > now() - 15m group by deveui SLIMIT 10 SOFFSET 10
name: evt
tags: deveui=AABBCC0020002199
time seqno
---- -----
1464108400539165696 3290
name: evt
tags: deveui=AABBCC00200022F8
time seqno
---- -----
1464108614692312320 5665
name: evt
tags: deveui=AABBCC002000232B
time seqno
---- -----
1464108507834923520 5300
Here are a couple more oddities.
I have an entry at position 2 (slimit 1 soffset 1), but NOT at position 1 (slimit 1 soffset 0)
> select seqno from evt_rp.evt where time > now() - 15m group by deveui SLIMIT 1 SOFFSET 1
name: evt
tags: deveui=AABBCC0020001F7B
time seqno
---- -----
2016-05-24T17:27:04Z 3193
> select seqno from evt_rp.evt where time > now() - 15m group by deveui SLIMIT 1 SOFFSET 0
> select seqno from evt_rp.evt where time > now() - 15m group by deveui SLIMIT 1 SOFFSET 1
name: evt
tags: deveui=AABBCC0020001F7B
time seqno
---- -----
2016-05-24T17:27:04Z 3193
Nothing with SLIMIT 1, but SLIMIT2 return a single entry. Seriously???
> select seqno from evt_rp.evt where time > now() - 15m group by deveui SLIMIT 1 SOFFSET 0
> select seqno from evt_rp.evt where time > now() - 15m group by deveui SLIMIT 2 SOFFSET 0
name: evt
tags: deveui=AABBCC0020001F7B
time seqno
---- -----
2016-05-24T17:27:04Z 3193
At this point I am not trusting any of the data that is returned because I cannot determine a pattern, so I changed the query around to be for a specific range (time > ‘2016-05-23’ and time < ‘2016-05-24’) because I didn’t want the possibility of a sliding time of now() to return different data
> select seqno from evt_rp.evt where time > '2016-05-23' and time < '2016-05-24' group by deveui SLIMIT 1 SOFFSET 0
name: evt
tags: deveui=AABBCC0020001F78
time seqno
---- -----
2016-05-23T20:38:13.928824832Z 1770
2016-05-23T21:38:17.404920576Z 1771
OK GOOD, returned 1 device “1F78”
> select seqno from evt_rp.evt where time > '2016-05-23' and time < '2016-05-24' group by deveui SLIMIT 2 SOFFSET 0
name: evt
tags: deveui=AABBCC0020001F78
time seqno
---- -----
2016-05-23T20:38:13.928824832Z 1770
2016-05-23T21:38:17.404920576Z 1771
name: evt
tags: deveui=AABBCC0020001F7B
time seqno
---- -----
2016-05-23T20:26:04.72924544Z 3172
2016-05-23T21:26:08.415660032Z 3173
2016-05-23T22:26:10.77421696Z 3174
OK GOOD, returned 2 devices “1F78”, and “1F7B”
> select seqno from evt_rp.evt where time > '2016-05-23' and time < '2016-05-24' group by deveui SLIMIT 2 SOFFSET 1
name: evt
tags: deveui=AABBCC0020001F7B
time seqno
---- -----
2016-05-23T20:26:04.72924544Z 3172
2016-05-23T21:26:08.415660032Z 3173
2016-05-23T22:26:10.77421696Z 3174
name: evt
tags: deveui=AABBCC0020001F9A
time seqno
---- -----
2016-05-23T20:09:31.134389248Z 2355
2016-05-23T21:09:33.47566464Z 2356
2016-05-23T22:09:27Z 2357
Nice, returned 2 devices and used the soffset of 1 “1F7B”, and “1F9A” to give me the second device.
Here is a what our schema looks like tags and fields.
> show tag keys from evt_rp.evt
name: evt
-------------
tagKey
bstneui
deveui
> show field keys from evt_rp.evt
name: evt
-------------
fieldKey
billable
bstneui_f
chnl
deveui_f
downpdusz
dr
freq
joinid
mic
msg
pdu
port
rssi
seqno
snr
type
uppdusz
About this issue
- Original URL
- State: closed
- Created 8 years ago
- Comments: 36 (18 by maintainers)
We are reaching 2 year mark with this bug, and need for paginating through large chunk of data is very basic once the data grows beyond certain size. Even with latest Influx, I am getting more than one series with
GROUP BY * LIMIT 1 SLIMIT 1Sorry for the delayed response, but my previous comment was saying that this is intended behavior. When you say
SLIMIT 1, you are saying, “Limit me to one series.” It then grabs that series and that series has no points, so it doesn’t return anything.I’m marking this as something for us to revisit in the future so I can bring it up with other members of the team and see if they have another opinion of this.