monstache: Monstache missing some documents to sync
We have two Monstache instances deployed in an EKS cluster. Each instance is deployed independently.
Instance 1 - Monstache configuration:
elasticsearch-max-conns = 12
elasticsearch-max-bytes = 8000000
gzip = true
direct-read-split-max = 9
resume = true
resume-strategy = 0
resume-name = "default"
namespace-exclude-regex = '^.*DB\.(classification|collection|publication).*$'
verbose = false
index-oplog-time = true
oplog-date-field-format = "2006-01-02T15:04:05.999Z"
[gtm-settings]
buffer-size = 128
channel-size = 512
buffer-duration = "75ms"
Instance 2 - Monstache configuration: same as the Instance 1 configuration, except that namespace-exclude-regex is replaced with the following:
namespace-regex = '^.*DB\.(classification|collection|publication).*$'
The idea is that Instance 2 indexes documents from the configured collections and Instance 1 indexes documents from all other collections.
Everything works fine for a while, but after that we see discrepancies between the MongoDB collection document count and the Elasticsearch index count.
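To sanity-check the split between the two instances, here is a minimal sketch that applies the shared pattern to a few namespaces. The namespace names are hypothetical, the `owner` helper is made up for illustration, and it uses Python's `re` module rather than the Go regexp engine Monstache uses (the two should agree for this simple pattern). It assumes the pattern is matched against the full `db.collection` namespace.

```python
import re

# The pattern shared by both configs (namespace-regex / namespace-exclude-regex).
PATTERN = re.compile(r'^.*DB\.(classification|collection|publication).*$')


def owner(namespace: str) -> str:
    """Return which instance would index the namespace under the two configs."""
    # Instance 2 includes a namespace when the pattern matches;
    # Instance 1 excludes a namespace when the pattern matches.
    return "instance-2" if PATTERN.match(namespace) else "instance-1"


# Hypothetical namespaces, for illustration only.
samples = [
    "appDB.classification",
    "appDB.users",
    "appDB.collection_archive",  # trailing .* also matches longer names
    "appdb.publication",         # lowercase "db" does not match "DB"
]

for ns in samples:
    print(f"{ns} -> {owner(ns)}")
```

Because every namespace either matches or does not match the same pattern, the two instances should not overlap or leave a gap. Note that the pattern is not anchored to the end of the collection name, so a collection such as `collection_archive` would also go to Instance 2.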
Notes:
- The Mongo oplog has records for the missing documents; earlier we had an issue with missing oplog records.
- Checked the monstache.monstache collection for the resume timestamp; it is greater than the ts of the oplog records.
- No errors found in Monstache or Elasticsearch.
- Cannot confirm, but it may be happening when documents are bulk inserted into Mongo, something like the below:
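To make the resume timestamp note concrete (this is a separate illustration, not the bulk-insert example referred to in the last note): a minimal sketch, assuming timestamps are ordered by seconds first and then by the counter within that second. The `OplogTs` type, the `behind_resume_point` helper, and all values are hypothetical.

```python
from typing import NamedTuple


class OplogTs(NamedTuple):
    """Oplog-style timestamp: seconds since the epoch, then a counter within that second."""
    t: int
    i: int


def behind_resume_point(entry_ts: OplogTs, resume_ts: OplogTs) -> bool:
    """True when the saved resume timestamp is greater than the entry's ts."""
    # NamedTuple compares field by field: first t, then i.
    return entry_ts < resume_ts


# Hypothetical values, for illustration only.
resume_ts = OplogTs(t=1650000100, i=3)
entries = {
    "doc-a": OplogTs(t=1650000099, i=9),
    "doc-b": OplogTs(t=1650000100, i=1),
    "doc-c": OplogTs(t=1650000101, i=1),
}

passed = sorted(k for k, ts in entries.items() if behind_resume_point(ts, resume_ts))
print(passed)
```

Entries that fall behind the resume point, as the missing documents do here, would presumably not be picked up again from the saved timestamp alone.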

Can you please advise?
Thank you for your help
About this issue
- State: open
- Created 4 years ago
- Comments: 15 (8 by maintainers)
@chandra2037 FYI I just pushed a commit to check for empty _id and report an error instead of attempting to index/delete the document.
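The kind of guard described above could look roughly like the following. This is only an illustrative sketch in Python, not the actual commit (Monstache is written in Go); the `require_id` function name and the error message are made up.

```python
def require_id(doc: dict) -> str:
    """Return the document's _id as a string, or raise if it is missing or empty."""
    doc_id = doc.get("_id")
    # Treat a missing _id and an empty-string _id the same way: report an error
    # instead of sending an index or delete request for the document.
    if doc_id is None or doc_id == "":
        raise ValueError("document has an empty _id; refusing to index/delete")
    return str(doc_id)


# Hypothetical documents, for illustration only.
for doc in ({"_id": 42, "name": "ok"}, {"_id": "", "name": "bad"}):
    try:
        print("would index", require_id(doc))
    except ValueError as err:
        print("error:", err)
```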