influxdb: Subqueries run much slower for simple SELECT statement

Bug report

System info: [Include InfluxDB version, operating system name, and other relevant details]

Amazon Linux InfluxDB 1.3.6

Steps to reproduce:

  1. Run a query counting and averaging response times:
> SELECT mean(response_time),count(response_time) FROM fastly_logs WHERE time > now() - 1m GROUP BY account_id
name: fastly_logs
tags: account_id=3855502814001
time                         mean   count
----                         ----   -----
2017-11-16T14:03:13.0021277Z 0.0194 5

name: fastly_logs
tags: account_id=4603804221001
time                         mean  count
----                         ----  -----
2017-11-16T14:03:13.0021277Z 0.642 1

name: fastly_logs
tags: account_id=5102072605001
time                         mean  count
----                         ----  -----
2017-11-16T14:03:13.0021277Z 0.016 3

name: fastly_logs
tags: account_id=5130362111001
time                         mean                count
----                         ----                -----
2017-11-16T14:03:13.0021277Z 0.04241873600155608 10269

name: fastly_logs
tags: account_id=5219833342001
time                         mean                count
----                         ----                -----
2017-11-16T14:03:13.0021277Z 0.05869871794871796 156

name: fastly_logs
tags: account_id=5330942432001
time                         mean                count
----                         ----                -----
2017-11-16T14:03:13.0021277Z 0.35833333333333334 6

...
...
  1. Run the same query but within a subquery, selecting some or all elements of the query:
> SELECT mean,count FROM (SELECT mean(response_time),count(response_time) FROM fastly_logs WHERE time > now() - 1m GROUP BY account_id)
name: fastly_logs
time                           mean                 count
----                           ----                 -----
2017-11-16T14:03:35.169381598Z 0.015                8
2017-11-16T14:03:35.169381598Z 0.11424999999999999  5
2017-11-16T14:03:35.169381598Z 0.0625               9
2017-11-16T14:03:35.169381598Z 0.0315               8
2017-11-16T14:03:35.169381598Z 0.23199999999999998  6
2017-11-16T14:03:35.169381598Z                      2
2017-11-16T14:03:35.169381598Z 0.012                1
2017-11-16T14:03:35.169381598Z 0.042096663600633646 18626
2017-11-16T14:03:35.169381598Z 0.05981645569620251  228
2017-11-16T14:03:35.169381598Z 0.4050000000000001   7
2017-11-16T14:03:35.169381598Z 0.004                8
2017-11-16T14:03:35.169381598Z 0.016                4
2017-11-16T14:03:35.169381598Z 0.3765               3
2017-11-16T14:03:35.169381598Z 0.057                5

Expected behavior: [What you expected to happen]

I would expect these queries to take about the same amount of time

Actual behavior: [What actually happened]

the 2nd query takes around 30-60s whereas the first returns almost instantly

Additional info: [Include gist of relevant config, logs, etc.]

I will attach vars.txt and iostat.txt in a comment

About this issue

  • Original URL
  • State: open
  • Created 7 years ago
  • Reactions: 1
  • Comments: 21 (4 by maintainers)

Commits related to this issue

Most upvoted comments

We have seen this problem too, but I found that by moving the time constraint from the inner to the outer select, the query would complete much faster, and as expected. In this example I’m wanting to count the number of groups that contain the tested values.

Time in inner select ran in 37 seconds

> time influx -database testdb -execute "select count(sum) from (select sum(util_pc) from host_id where group = 'devel' and name =~ /testing/ and type = 'static' and util_pc > 50 and time > now() - 15m group by ident)"
name: host_id
time count
---- -----
0    15
0.008u 0.004s 0:37.24 0.0%	0+0k 0+0io 0pf+0w

Time in outer select ran in 2 seconds

% time influx -database testdb -execute "select count(sum) from (select sum(util_pc) from host_id where group = 'devel' and name =~ /testing/ and type = 'static' and util_pc > 50 group by ident) where time > now() - 15m"
name: host_id
time                count
----                -----
1516125703050401501 15
0.004u 0.008s 0:02.19 0.0%	0+0k 0+0io 0pf+0w

Original single query ran in 2 seconds

> time influx -database testdb -execute "select sum(util_pc) from host_id where group = 'devel' and name =~ /testing/ and type = 'static' and util_pc > 50 and time > now() - 15m group by ident" 
name: host_id
tags: ident=82459165
time                sum
----                ---
1516125638290016167 199.79999999999995

name: host_id
tags: ident=82459821
time                sum
----                ---
1516125638290016167 399.69999999999993

name: host_id
tags: ident=82459927
time                sum
----                ---
1516125638290016167 400.0999999999999

name: host_id
tags: ident=82460499
time                sum
----                ---
1516125638290016167 399.69999999999993

name: host_id
tags: ident=82460500
time                sum
----                ---
1516125638290016167 372.3999999999999

name: host_id
tags: ident=82460501
time                sum
----                ---
1516125638290016167 341.69999999999993

name: host_id
tags: ident=82460502
time                sum
----                ---
1516125638290016167 268.19999999999993

name: host_id
tags: ident=82460505
time                sum
----                ---
1516125638290016167 399.69999999999993

name: host_id
tags: ident=82460506
time                sum
----                ---
1516125638290016167 400.0999999999999

name: host_id
tags: ident=82460507
time                sum
----                ---
1516125638290016167 400.19999999999993

name: host_id
tags: ident=82460508
time                sum
----                ---
1516125638290016167 399.5999999999999

name: host_id
tags: ident=82527534
time                sum
----                ---
1516125638290016167 393.20000000000005

name: host_id
tags: ident=82527535
time                sum
----                ---
1516125638290016167 393.4
0.004u 0.008s 0:02.36 0.0%	0+0k 0+0io 0pf+0w