monstache: Monstache fails to sync some documents

We have two Monstache instances deployed in an EKS cluster. Each instance is deployed independently.

Instance 1 - Monstache configuration:

elasticsearch-max-conns = 12
elasticsearch-max-bytes = 8000000
gzip = true
direct-read-split-max = 9
resume = true
resume-strategy = 0
resume-name = "default"
namespace-exclude-regex = '^.*DB\.(classification|collection|publication).*$'
verbose = false
index-oplog-time = true
oplog-date-field-format = "2006-01-02T15:04:05.999Z"
[gtm-settings]
      buffer-size = 128
      channel-size = 512
      buffer-duration = "75ms"

Instance 2 - Monstache configuration: Same as the instance 1 config, except that namespace-exclude-regex is replaced with the following:

namespace-regex = '^.*DB\.(classification|collection|publication).*$'

The idea is that Instance 2 indexes documents from the configured collections and Instance 1 indexes documents from all other collections.
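For reference, here is a minimal Go sketch of how that single pattern splits namespaces between the two instances. It assumes Monstache evaluates namespace-regex and namespace-exclude-regex with Go's standard regexp package against the "database.collection" namespace string. The namespace names and the owner helper are made up for illustration and are not part of Monstache.

```go
package main

import (
	"fmt"
	"regexp"
)

// nsPattern is the pattern used in both configurations above.
var nsPattern = regexp.MustCompile(`^.*DB\.(classification|collection|publication).*$`)

// owner is a hypothetical helper: it reports which instance would pick up
// a namespace, assuming instance 2 includes matches (namespace-regex) and
// instance 1 excludes them (namespace-exclude-regex).
func owner(ns string) string {
	if nsPattern.MatchString(ns) {
		return "instance 2"
	}
	return "instance 1"
}

func main() {
	// Illustrative namespaces only; substitute real "db.collection" names.
	namespaces := []string{
		"appDB.classification",
		"appDB.collection_archive", // trailing .* in the pattern also matches this
		"appDB.orders",
		"appdb.publication", // lowercase "db" does not match the literal "DB"
	}
	for _, ns := range namespaces {
		fmt.Printf("%-28s -> %s\n", ns, owner(ns))
	}
}
```

Because both instances use the same pattern, every namespace lands on exactly one of them. The sketch also shows that the pattern is case-sensitive and that the trailing .* pulls in any collection whose name merely starts with one of the three listed names.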

Everything works fine for a while, but then we start seeing discrepancies between the MongoDB collection document counts and the Elasticsearch index document counts.
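Counts only show that something is missing, not which documents. A rough Go sketch of the comparison we have in mind is below. It assumes the _id values from the MongoDB collection and from the Elasticsearch index have already been exported as strings by some other means. The missingIDs helper and the sample IDs are hypothetical and only illustrate the set difference.

```go
package main

import (
	"fmt"
	"sort"
)

// missingIDs returns the IDs that appear in source but not in index,
// in sorted order. It is a hypothetical helper for illustration.
func missingIDs(source, index []string) []string {
	// Build a lookup set from the indexed IDs.
	seen := make(map[string]struct{}, len(index))
	for _, id := range index {
		seen[id] = struct{}{}
	}
	var missing []string
	for _, id := range source {
		if _, ok := seen[id]; !ok {
			missing = append(missing, id)
		}
	}
	sort.Strings(missing)
	return missing
}

func main() {
	// Made-up sample data standing in for exported _id values.
	mongoIDs := []string{"a1", "a2", "a3", "a4"}
	esIDs := []string{"a1", "a3"}
	fmt.Println(missingIDs(mongoIDs, esIDs))
}
```

With the list of missing IDs in hand, each one can be looked up in the oplog to see whether the gaps cluster around particular write batches.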

Notes:

  1. The Mongo oplog contains records for the missing documents. (Earlier we had an issue with missing oplog records.)
  2. We checked the monstache.monstache collection for the resume timestamp; it is greater than the ts of the oplog records.
  3. No errors were found in Monstache or Elasticsearch.
  4. We cannot confirm this, but it may be happening when documents are bulk inserted into Mongo, something like the image below.

Can you please advise?

Thank you for your help

About this issue

  • State: open
  • Created 4 years ago
  • Comments: 15 (8 by maintainers)

Most upvoted comments

@chandra2037 FYI I just pushed a commit to check for empty _id and report an error instead of attempting to index/delete the document.