monstache: [BUG] Monstache stops tailing after a period of time.

Describe the bug

I have been using Monstache on a collection in my database. After some time, Monstache stops tailing, and Monstache itself logs no error. The collection holds 50k documents with an average size of 20 KB, and it generates a high volume of oplog entries. In the mongos server logs I see a “BSONObjectTooLarge” error corresponding to the aggregation query made by Monstache, followed some time later by a “Resume point not in the oplog” error. I suspect the default batch size of 512 is causing the issue in this scenario.

To Reproduce

Monstache configuration:

exit-after-direct-reads = false
dropped-collections = true
replay = false
stats = true
elasticsearch-retry = true
direct-read-no-timeout = true

I also tried the resume and replay settings. That did not work either.

Software information (please complete the following information):

  • Operating System: Ubuntu
  • Monstache Version: 5.0.11
  • MongoDB Version: 4.0
  • Elasticsearch Version: 6.8
  • Docker Version:

Additional context

I think the default batch size of 512 is generating a BSON object that exceeds 16 MB. A config option to adjust the batch size might help. Please let me know of any workarounds.
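
As a rough sanity check of the batch-size suspicion, the arithmetic can be sketched as below. This is only an estimate under stated assumptions: that each entry in a batch carries roughly one full document, and that a whole batch ends up in a single BSON object (the reporter's hypothesis, not something confirmed here). The function names are made up for illustration.

```python
# Rough estimate only: real oplog/change-stream entries carry extra metadata,
# and packing a whole batch into one BSON object is an assumption, not a fact.
MAX_BSON_BYTES = 16 * 1024 * 1024  # MongoDB's 16 MiB BSON document limit


def estimated_batch_bytes(batch_size: int, avg_doc_bytes: int) -> int:
    """Approximate payload of one batch if every entry holds a full document."""
    return batch_size * avg_doc_bytes


def max_safe_batch_size(doc_bytes: int, limit: int = MAX_BSON_BYTES) -> int:
    """Largest batch whose estimated payload stays within the limit."""
    return max(1, limit // doc_bytes)


avg = 20 * 1024  # 20 KB average document size reported in this issue

print(estimated_batch_bytes(512, avg))                   # 10485760 (about 10 MiB)
print(estimated_batch_bytes(512, avg) > MAX_BSON_BYTES)  # False
print(max_safe_batch_size(avg))                          # 819
print(max_safe_batch_size(32 * 1024))                    # 512
```

At the reported average size, a batch of 512 comes to about 10 MiB, which is under the limit. Under these assumptions the limit would only be hit when entries in a batch average more than about 32 KiB, so larger-than-average documents or bulkier oplog entries would have to be involved.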

About this issue

  • State: open
  • Created 4 years ago
  • Comments: 19 (8 by maintainers)

Most upvoted comments

@Zjcompt thanks for the info. I have some ideas on how to address this in the next release.

The batch size of the change stream will not be tied to the channel-size setting going forward. Also, I have added more error handling so hopefully it can recover from the errors you posted.

The code has been pushed; I just need to cut a new release.
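
For readers wondering what the two changes mentioned above amount to, here is a minimal sketch. It is not Monstache's actual code (Monstache is written in Go). `ResumePointLost`, `open_stream`, `fake_watch`, and both constants are hypothetical names introduced for illustration, and `watch` stands in for a driver call that accepts a resume token and a batch size.

```python
class ResumePointLost(Exception):
    """Hypothetical stand-in for a 'resume point not in the oplog' error."""


def open_stream(watch, resume_token, batch_size):
    """Open a change stream, falling back to a fresh start if the token is stale.

    `watch` is any callable accepting `resume_after` and `batch_size` keyword
    arguments (an assumption, loosely modelled on driver watch() helpers).
    Returns (stream, resumed); `resumed` says whether the token was used.
    """
    if resume_token is not None:
        try:
            return watch(resume_after=resume_token, batch_size=batch_size), True
        except ResumePointLost:
            # Events between the stale token and now are gone from the oplog;
            # the caller may need to resync the affected data.
            pass
    return watch(resume_after=None, batch_size=batch_size), False


def fake_watch(resume_after=None, batch_size=None):
    # Test double: pretends the token "stale" has aged out of the oplog.
    if resume_after == "stale":
        raise ResumePointLost("resume point not in the oplog")
    return {"resume_after": resume_after, "batch_size": batch_size}


CHANNEL_SIZE = 512       # in-application buffer size (illustrative value)
STREAM_BATCH_SIZE = 100  # chosen independently of CHANNEL_SIZE (illustrative)

stream, resumed = open_stream(fake_watch, "stale", STREAM_BATCH_SIZE)
print(stream, resumed)
```

The point of the sketch is that the stream's batch size is a separate knob from the internal buffer size, and that a lost resume point leads to a restart instead of a silent stop. Restarting without a token skips whatever changes were missed, so a real implementation would also need some way to catch up.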