fluent-bit: Errors when forwarding to ElasticSearch

Bug Report

Describe the bug Getting loads of warning followed immediately by an error in the logs. example below.

I’m running in a Kubernetes cluster in AWS. The cluster is running on top of CoreOS. ElasticSearch cluster is set up with the elasticsearch-operator helm chart To Reproduce

  • Example log message if applicable:
{"log":"[2018/08/31 16:15:33] [ warn] [out_es] Elasticsearch error\n","stream":"stderr","time":"2018-08-31T16:15:33.595296858Z"}
{"log":"{\"took\":2205,\"errors\":true,\"items\":[{\"index\":{\"_index\":\"fbit-2018.08\",\"_type\":\"flb_type\",\"_id\":\"WMvEkGUBflYo7f9bpukw\",\"_version\":1,\"result\":\"created\",\"_shards\":{\"total\":2,\"successful\":2,\"failed\":0},\"_seq_no\":2351097,\"_primary_term\":1,\"status\":201}},{\"index\":{\"_index\":\"fbit-2018.08\",\"_type\":\"flb_type\",\"_id\":\"WcvEkGUBflYo7f9bpukw\",\"status\":429,\"error\":{\"type\":\"es_rejected_execution_exception\",\"reason\":\"rejected execution of org.elasticsearch.transport.TransportService$7@227349ba on EsThreadPoolExecutor[bulk, queue capacity = 200, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@4d3e4d91[Running, pool size = 4, active threads = 4, queued tasks = 202, completed tasks = 210040]]\"}}},{\"index\":{\"_index\":\"fbit-2018.08\",\"_type\":\"flb_type\",\"_id\":\"WsvEkGUBflYo7f9bpukw\",\"_version\":1,\"result\":\"created\",\"_shards\":{\"total\":2,\"successful\":2,\"failed\":0},\"_seq_no\":2354152,\"_primary_term\":1,\"status\":201}},{\"index\":{\"_index\":\"fbit-2018.08\",\"_type\"\n","stream":"stderr","time":"2018-08-31T16:15:33.595299577Z"}
{"log":"[2018/08/31 16:15:37] [error] [out_es] could not pack/validate JSON response\n","stream":"stderr","time":"2018-08-31T16:15:37.652275127Z"}
{"log":"{\"took\":2331,\"errors\":true,\"items\":[{\"index\":{\"_index\":\"fbit-2018.08\",\"_type\":\"flb_type\",\"_id\":\"f8XEkGUBjFzZ4VPRtU-U\",\"_version\":1,\"result\":\"created\",\"_shards\":{\"total\":2,\"successful\":2,\"failed\":0},\"_seq_no\":2356841,\"_primary_term\":1,\"status\":201}},{\"index\":{\"_index\":\"fbit-2018.08\",\"_type\":\"flb_type\",\"_id\":\"g8XEkGUBjFzZ4VPRtU-U\",\"_version\":1,\"result\":\"created\",\"_shards\":{\"total\":2,\"successful\":2,\"failed\":0},\"_seq_no\":2359901,\"_primary_term\":1,\"status\":201}},{\"index\":{\"_index\":\"fbit-2018.08\",\"_type\":\"flb_type\",\"_id\":\"hcXEkGUBjFzZ4VPRtU-U\",\"status\":429,\"error\":{\"type\":\"es_rejected_execution_exception\",\"reason\":\"rejected execution of org.elasticsearch.transport.TransportService$7@68e6d98f on EsThreadPoolExecutor[bulk, queue capacity = 200, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@4d3e4d91[Running, pool size = 4, active threads = 4, queued tasks = 201, completed tasks = 210410]]\"}}},{\"index\":{\"_index\":\"fbi\n","stream":"stderr","time":"2018-08-31T16:15:37.652302381Z"}
  • Steps to reproduce the problem: just deploying the fluentbit helm chart is enough to trigger it.

Expected behavior I’m not expecting to see those errors at all.

Your Environment

  • Version used: 0.14.1 docker image.
  • Configuration:
[SERVICE]                                                                                                                                                                                  
    Flush        1                                                                                                                                                                         
    Daemon       Off                                                                                                                                                                       
    Log_Level    info                                                                                                                                                                      
    Parsers_File parsers.conf                                                                                                                                                              
    HTTP_Server  On                                                                                                                                                                        
    HTTP_Listen  0.0.0.0                                                                                                                                                                   
    HTTP_Port    2020                                                                                                                                                                      
                                                                                                                                                                                            
[INPUT]                                                                                                                                                                                    
    Name             tail                                                                                                                                                                  
    Path             /var/log/containers/*.log                                                                                                                                             
    Parser           docker                                                                                                                                                                
    Tag              kube.*
    Refresh_Interval 5
    Mem_Buf_Limit    5MB
    Skip_Long_Lines  On

[FILTER]
    Name                kubernetes
    Match               kube.*
    Kube_URL            https://kubernetes.default.svc:443
    Kube_CA_File        /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    Kube_Token_File     /var/run/secrets/kubernetes.io/serviceaccount/token
    Merge_JSON_Log      On
    K8S-Logging.Parser  On
    K8S-Logging.Exclude On

[OUTPUT]
    Name  es
    Match *
    Host  elasticsearch-logs-cluster
    Port  9200
    Logstash_Format On
    Retry_Limit False
    Type  flb_type
    Logstash_Prefix fbit
    Logstash_DateFormat %Y.%m


    tls on
    tls.verify off
    tls.debug 1

  • Environment name and version (e.g. Kubernetes? What version?): Kubernetes 1.11.2
  • Server type and version: AWS EC2 m5.xLarge
  • Operating System and version: CoreOS
  • Filters and plugins:
    • Kubernetes

Additional context

Logs are clearly making it into elasticsearch, but there is far too much error logging from fluentbit that I fear I’ll miss something important.

About this issue

  • Original URL
  • State: closed
  • Created 6 years ago
  • Reactions: 4
  • Comments: 17 (2 by maintainers)

Commits related to this issue

Most upvoted comments

@edsiper I’ve got the same problem, also related to es_rejected_execution_exception but:

  1. It should still properly parse the reponse, right? Or not log raise this could not pack/validate JSON response. That still seems like a bug to me? It’s at least misleading
  2. I also realize it somehow seems to create several loglines for the same problem with incomplete “reason” field. It looks like something is flushing the the error message while it’s being written?
[2018/09/06 15:31:53] [error] [out_es] could not pack/validate JSON response
{"took":7071,"errors":true,"items":[{"index":{"_index":"fluent-2018.09.04","_type":"flb_type","_id":"06db0050-ddd0-8b63-9c0b-f6a7737a30d0","_version":1,"result":"created","_shards":{"total":2,"successful":2,"failed":0},"_seq_no":143348,"_primary_term":1,"status":201}},{"index":{"_index":"fluent-2018.09.04","_type":"flb_type","_id":"49a15886-7544-0231-9800-a1cb4880e928","_version":1,"result":"created","_shards":{"total":2,"successful":2,"failed":0},"_seq_no":143349,"_primary_term":1,"status":201}},{"index":{"_index":"fluent-2018.09.04","_type":"flb_type","_id":"e8b01818-0cd6-943e-0df6-39b21e176cc1","_version":1,"result":"created","_shards":{"total":2,"successful":2,"failed":0},"_seq_no":158263,"_primary_term":1,"status":201}},{"index":{"_index":"fluent-2018.09.04","_type":"flb_type","_id":"99f70ae1-9244-6d29-cb98-09e919ae8466","status":429,"error":{"type":"es_rejected_execution_exception","reason":"rejected execution of org
[2018/09/06 15:31:53] [ warn] [out_es] Elasticsearch error
{"took":7071,"errors":true,"items":[{"index":{"_index":"fluent-2018.09.04","_type":"flb_type","_id":"06db0050-ddd0-8b63-9c0b-f6a7737a30d0","_version":1,"result":"created","_shards":{"total":2,"successful":2,"failed":0},"_seq_no":143348,"_primary_term":1,"status":201}},{"index":{"_index":"fluent-2018.09.04","_type":"flb_type","_id":"49a15886-7544-0231-9800-a1cb4880e928","_version":1,"result":"created","_shards":{"total":2,"successful":2,"failed":0},"_seq_no":143349,"_primary_term":1,"status":201}},{"index":{"_index":"fluent-2018.09.04","_type":"flb_type","_id":"e8b01818-0cd6-943e-0df6-39b21e176cc1","_version":1,"result":"created","_shards":{"total":2,"successful":2,"failed":0},"_seq_no":158263,"_primary_term":1,"status":201}},{"index":{"_index":"fluent-2018.09.04","_type":"flb_type","_id":"99f70ae1-9244-6d29-cb98-09e919ae8466","status":429,"error":{"type":"es_rejected_execution_exception","reason":"rejected execution of org.elasticsearch.tra
[2018/09/06 15:31:56] [error] [out_es] could not pack/validate JSON response
{"took":7650,"errors":true,"items":[{"index":{"_index":"fluent-2018.09.04","_type":"flb_type","_id":"19d3893a-a3bd-7647-b17f-bb2d2c6cf435","_version":2,"result":"updated","_shards":{"total":2,"successful":2,"failed":0},"_seq_no":159800,"_primary_term":1,"status":200}},{"index":{"_index":"fluent-2018.09.04","_type":"flb_type","_id":"aec6b1d5-bb6a-769b-f2ba-a25cf4a2411c","status":429,"error":{"type":"es_rejected_execution_exception","reason":"rejected execution of org.elasticsearch.transport.TransportService$7@783bbc64 on EsThreadPoolExecutor[name = PJ1CCI0/write, queue capacity = 200, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@7b06924b[Running, pool size = 2, active threads = 2, queued tasks = 200, completed tasks = 5248570]]"}}},{"index":{"_index":"fluent-2018.09.04","_type":"flb_type","_id":"736d5f6c-07d0-ba05-31c4-4156a2e7fedb","status":429,"error":{"type":"es_rejected_execution_exception","reason":"re
[2018/09/06 15:31:56] [ warn] [out_es] Elasticsearch error
{"took":7650,"errors":true,"items":[{"index":{"_index":"fluent-2018.09.04","_type":"flb_type","_id":"19d3893a-a3bd-7647-b17f-bb2d2c6cf435","_version":2,"result":"updated","_shards":{"total":2,"successful":2,"failed":0},"_seq_no":159800,"_primary_term":1,"status":200}},{"index":{"_index":"fluent-2018.09.04","_type":"flb_type","_id":"aec6b1d5-bb6a-769b-f2ba-a25cf4a2411c","status":429,"error":{"type":"es_rejected_execution_exception","reason":"rejected execution of org.elasticsearch.transport.TransportService$7@783bbc64 on EsThreadPoolExecutor[name = PJ1CCI0/write, queue capacity = 200, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@7b06924b[Running, pool size = 2, active threads = 2, queued tasks = 200, completed tasks = 5248570]]"}}},{"index":{"_index":"fluent-2018.09.04","_type":"flb_type","_id":"736d5f6c-07d0-ba05-31c4-4156a2e7fedb","status":429,"error":{"type":"es_rejected_execution_exception","reason":"rejected execution o

See how the lines are cut off and there are two per _id. Not sure if this is just a minor issue with the stdout log (this is running on k8s).

@jorgebirck setting Merge_Log Off will not work if you output logs in JSON and want those logs appear as map fields as part of the log structure. Should this issue be re-opened?

@discordianfish Were you able to solve this log chomping off problem? Seems like we are also having the same issue and without the full logs is a nightmare to debug.

Having the same issue. Did anyone find a solution? Here is some snippet from my logs:

[2019/03/25 07:56:14] [error] [out_es] could not pack/validate JSON response
{"took":28776,"errors":true,"items":[{"index":{"_index":"kubernetes_cluster-2019.03.24","_type":"flb_type","_id":"ahLZs2kB0jAMgztJI1LW","_version":1,"result":"created","_shards":{"total":2,"successful":2,"failed":0},"_seq_no":4193,"_primary_term":1,"status":201}},{"index":{"_index":"kubernetes_cluster-2019.03.24","_type":"flb_type","_id":"axLZs2kB0jAMgztJI1LW","_version":1,"result":"created","_shards":{"total":2,"successful":2,"failed":0},"_seq_no":4194,"_primary_term":1,"status":201}},{"index":{"_index":"kubernetes_cluster-2019.03.24","_type":"flb_type","_id":"bBLZs2kB0jAMgztJI1LW","_version":1,"result":"created","_shards":{"total":2,"successful":2,"failed":0},"_seq_no":3098,"_primary_term":1,"status":201}},{"index":{"_index":"kubernetes_cluster-2019.03.24","_type":"flb_type","_id":"bRLZs2kB0jAMgztJI1LW","_version":1,"result":"created","_shards":{"total":2,"successful":2,"failed":0},"_seq_no":4195,"_primary_term":1,"statu
[2019/03/25 07:56:14] [error] [out_es] could not pack/validate JSON response
{"took":28802,"errors":true,"items":[{"index":{"_index":"kubernetes_cluster-2019.03.24","_type":"flb_type","_id":"wxLZs2kB0jAMgztJI1Lg","status":429,"error":{"type":"es_rejected_execution_exception","reason":"rejected execution of processing of [382790][indices:data/write/bulk[s][p]]: request: BulkShardRequest [[kubernetes_cluster-2019.03.24][1]] containing [49] requests, target allocation id: a-l8Iob1Q8i2IRhb0bG7pA, primary term: 1 on EsThreadPoolExecutor[name = elasticsearch-0/write, queue capacity = 200, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@4dff4545[Running, pool size = 1, active threads = 1, queued tasks = 251, completed tasks = 119649]]"}}},{"index":{"_index":"kubernetes_cluster-2019.03.24","_type":"flb_type","_id":"xBLZs2kB0jAMgztJI1Lg","_version":1,"result":"created","_shards":{"total":2,"successful":2,"failed":0},"_seq_no":4217,"_primary_term":1,"status":201}},{"index":{"_index":"kubernetes
[2019/03/25 07:56:15] [error] [out_es] could not pack/validate JSON response
{"took":27401,"errors":true,"items":[{"index":{"_index":"kubernetes_cluster-2019.03.20","_type":"flb_type","_id":"hBLZs2kB0jAMgztJK1bF","_version":1,"result":"created","_shards":{"total":2,"successful":2,"failed":0},"_seq_no":612,"_primary_term":1,"status":201}},{"index":{"_index":"kubernetes_cluster-2019.03.20","_type":"flb_type","_id":"hRLZs2kB0jAMgztJK1bF","status":429,"error":{"type":"es_rejected_execution_exception","reason":"rejected execution of processing of [382799][indices:data/write/bulk[s][p]]: request: BulkShardRequest [[kubernetes_cluster-2019.03.20][2]] containing [6] requests, target allocation id: DeEShp0BTgiT53qWpenqFg, primary term: 1 on EsThreadPoolExecutor[name = elasticsearch-0/write, queue capacity = 200, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@4dff4545[Running, pool size = 1, active threads = 1, queued tasks = 249, completed tasks = 119661]]"}}},{"index":{"_index":"kubernetes_c
[2019/03/25 07:56:16] [error] [out_es] could not pack/validate JSON response
{"took":27116,"errors":true,"items":[{"index":{"_index":"kubernetes_cluster-2019.03.21","_type":"flb_type","_id":"ehLZs2kB0jAMgztJL1e7","_version":1,"result":"created","_shards":{"total":2,"successful":2,"failed":0},"_seq_no":2389,"_primary_term":1,"status":201}},{"index":{"_index":"kubernetes_cluster-2019.03.21","_type":"flb_type","_id":"exLZs2kB0jAMgztJL1e7","status":429,"error":{"type":"es_rejected_execution_exception","reason":"rejected execution of processing of [382814][indices:data/write/bulk[s][p]]: request: BulkShardRequest [[kubernetes_cluster-2019.03.21][2]] containing [110] requests, target allocation id: lJsxHvFhQkyTtC5f2f2ePw, primary term: 1 on EsThreadPoolExecutor[name = elasticsearch-0/write, queue capacity = 200, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@4dff4545[Running, pool size = 1, active threads = 1, queued tasks = 251, completed tasks = 119667]]"}}},{"index":{"_index":"kubernete
[2019/03/25 07:56:16] [error] [out_es] could not pack/validate JSON response
{"took":27128,"errors":true,"items":[{"index":{"_index":"kubernetes_cluster-2019.03.24","_type":"flb_type","_id":"FhLZs2kB0jAMgztJL1e7","status":429,"error":{"type":"es_rejected_execution_exception","reason":"rejected execution of processing of [382812][indices:data/write/bulk[s][p]]: request: BulkShardRequest [[kubernetes_cluster-2019.03.24][1]] containing [22] requests, target allocation id: a-l8Iob1Q8i2IRhb0bG7pA, primary term: 1 on EsThreadPoolExecutor[name = elasticsearch-0/write, queue capacity = 200, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@4dff4545[Running, pool size = 1, active threads = 1, queued tasks = 252, completed tasks = 119666]]"}}},{"index":{"_index":"kubernetes_cluster-2019.03.24","_type":"flb_type","_id":"FxLZs2kB0jAMgztJL1e7","_version":1,"result":"created","_shards":{"total":2,"successful":2,"failed":0},"_seq_no":3230,"_primary_term":1,"status":201}},{"index":{"_index":"kubernetes
[2019/03/25 07:56:16] [error] [out_es] could not pack/validate JSON response
{"took":26214,"errors":true,"items":[{"index":{"_index":"kubernetes_cluster-2019.03.25","_type":"flb_type","_id":"RBLZs2kB0jAMgztJM1me","status":429,"error":{"type":"es_rejected_execution_exception","reason":"rejected execution of processing of [382817][indices:data/write/bulk[s][p]]: request: BulkShardRequest [[kubernetes_cluster-2019.03.25][2]] containing [29] requests, target allocation id: vkwRPSugTa6yvWWn_8jhXg, primary term: 1 on EsThreadPoolExecutor[name = elasticsearch-0/write, queue capacity = 200, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@4dff4545[Running, pool size = 1, active threads = 1, queued tasks = 253, completed tasks = 119669]]"}}},{"index":{"_index":"kubernetes_cluster-2019.03.25","_type":"flb_type","_id":"RRLZs2kB0jAMgztJM1me","status":429,"error":{"type":"es_rejected_execution_exception","reason":"rejected execution of processing of [382817][indices:data/write/bulk[s][p]]: request: