MeshCentral: Server peering failure

Attempting to configure server peering fails with the following configuration:

  "peers": {
    "serverId": "asrss01",
    "servers": {
      "asrss01": { "url": "ws://1.2.3.4:4430/" },
      "asrss02": { "url": "ws://5.6.7.8:4430/" },
      "ocrss01": { "url": "ws://9.10.11.12:4430/" }
    }
  },

MeshCentral logs the following error:

node[2612]: Error: Unable to peer with other servers, "http-1" not present in peer servers list.

http-1 is actually the hostname of the server in question. I recalled from the documentation that MeshCentral automatically falls back to the hostname when the serverId option is not configured, so I went digging through multiserver.js.
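The fallback behaviour described above can be sketched roughly as follows. This is a minimal illustration, not MeshCentral's actual code: resolveServerId is a hypothetical helper, and only the error text is taken from the log line above.

```javascript
const os = require('os');

// Hypothetical helper (not part of MeshCentral): use the configured serverId,
// or fall back to the machine's hostname when none is set, then make sure the
// result is one of the keys in the peers "servers" map.
function resolveServerId(peers, hostname = os.hostname()) {
  const serverId = peers.serverId != null ? peers.serverId : hostname;
  const servers = peers.servers || {};
  if (!Object.prototype.hasOwnProperty.call(servers, serverId)) {
    throw new Error(
      `Unable to peer with other servers, "${serverId}" not present in peer servers list.`
    );
  }
  return serverId;
}

const peers = {
  serverId: 'asrss01',
  servers: {
    asrss01: { url: 'ws://1.2.3.4:4430/' },
    asrss02: { url: 'ws://5.6.7.8:4430/' },
    ocrss01: { url: 'ws://9.10.11.12:4430/' }
  }
};

// With serverId present, the hostname is never consulted.
console.log(resolveServerId(peers, 'http-1')); // asrss01
```

If serverId is not picked up for any reason, the hostname (http-1 here) is used instead, and because it is not a key in the servers map, the lookup fails with the error shown above.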

While debugging, it appears that the case of serverId is lost when config.json is parsed, so the serverId check in multiserver.js fails (please excuse my poor JavaScript skills; I’m unsure whether JSON keys are case sensitive, but a quick Google search indicated that they are).
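For reference, JSON keys are indeed case sensitive in JavaScript: JSON.parse keeps keys exactly as written. The sketch below also shows what happens if keys are lowercased after parsing. lowercaseKeys is a hypothetical stand-in for whatever normalisation the config loader may perform; it is an assumption, not MeshCentral's code.

```javascript
const raw = '{ "peers": { "serverId": "asrss01" } }';
const config = JSON.parse(raw);

// JSON.parse preserves key case, so only the exact spelling matches.
console.log(config.peers.serverId); // asrss01
console.log(config.peers.serverid); // undefined

// Hypothetical normalisation step: recursively lowercase every object key.
function lowercaseKeys(value) {
  if (Array.isArray(value)) return value.map(lowercaseKeys);
  if (value === null || typeof value !== 'object') return value;
  const out = {};
  for (const [key, inner] of Object.entries(value)) {
    out[key.toLowerCase()] = lowercaseKeys(inner);
  }
  return out;
}

const normalised = lowercaseKeys(config);

// After normalisation the camelCase lookup no longer finds the value.
console.log(normalised.peers.serverId); // undefined
console.log(normalised.peers.serverid); // asrss01
```

Under that assumption, any code that reads peers.serverId after normalisation would see undefined and behave as if the option had never been configured.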

I tweaked the logic in multiserver.js to read a lowercase serverid as follows:

    // If we have no peering configuration, don't setup this object
    if (obj.peerConfig == null) { return null; }
    obj.serverid = obj.parent.config.peers.serverid;

But this didn’t help much, as I was then greeted with a whole array of failures:

Aug 12 02:55:57 http-1 node[3073]: ERROR: MeshCentral failed with critical error, check MeshErrors.txt. Restarting in 5 seconds...
Aug 12 02:55:57 http-1 node[3073]:    '/usr/bin/node /home/app-meshcentral/node_modules/meshcentral --launch 3073' }
Aug 12 02:55:57 http-1 node[3073]:   cmd:
Aug 12 02:55:57 http-1 node[3073]:   signal: null,
Aug 12 02:55:57 http-1 node[3073]:   code: 1,
Aug 12 02:55:57 http-1 node[3073]:   killed: false,
Aug 12 02:55:57 http-1 node[3073]:     at Process.ChildProcess._handle.onexit (internal/child_process.js:266:5)
Aug 12 02:55:57 http-1 node[3073]:     at maybeClose (internal/child_process.js:999:16)
Aug 12 02:55:57 http-1 node[3073]:     at ChildProcess.emit (events.js:198:15)
Aug 12 02:55:57 http-1 node[3073]:     at ChildProcess.exithandler (child_process.js:299:12)
Aug 12 02:55:57 http-1 node[3073]:     at Socket.Readable.push (_stream_readable.js:231:10)
Aug 12 02:55:57 http-1 node[3073]:     at readableAddChunk (_stream_readable.js:276:11)
Aug 12 02:55:57 http-1 node[3073]:     at addChunk (_stream_readable.js:295:12)
Aug 12 02:55:57 http-1 node[3073]:     at Socket.emit (events.js:193:13)
Aug 12 02:55:57 http-1 node[3073]:     at Socket.socketOnData (_http_client.js:480:11)
Aug 12 02:55:57 http-1 node[3073]:     at ClientRequest.emit (events.js:193:13)
Aug 12 02:55:57 http-1 node[3073]:     at ClientRequest.req.on (/home/app-meshcentral/node_modules/ws/lib/websocket.js:665:15)
Aug 12 02:55:57 http-1 node[3073]:     at WebSocket.setSocket (/home/app-meshcentral/node_modules/ws/lib/websocket.js:170:10)
Aug 12 02:55:57 http-1 node[3073]:     at WebSocket.emit (events.js:193:13)
Aug 12 02:55:57 http-1 node[3073]:     at WebSocket.<anonymous> (/home/app-meshcentral/node_modules/meshcentral/multiserver.js:77:10>
Aug 12 02:55:57 http-1 node[3073]: TypeError: obj.ws._socket.getPeerCertificate is not a function
Aug 12 02:55:57 http-1 node[3073]:                                                                                                  >
Aug 12 02:55:57 http-1 node[3073]:                 var serverCert = obj.forge.pki.certificateFromAsn1(obj.forge.asn1.fromDer(obj.ws.>
Aug 12 02:55:57 http-1 node[3073]: /home/app-meshcentral/node_modules/meshcentral/multiserver.js:77
Aug 12 02:55:57 http-1 node[3073]: the options [useNewUrlParser] is not supported

Note: The raw log above is ordered from the most recent line to the least recent. Sorry for the confusion there.
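One plausible reading of the TypeError in the log is that the peer URLs use ws:// rather than wss://. In Node.js, getPeerCertificate() is a method of tls.TLSSocket; a plain net.Socket does not have it. The guard below is a hedged sketch of how a caller could detect that case. getPeerCertificateDer is a hypothetical helper and is not taken from multiserver.js.

```javascript
const net = require('net');

// Hypothetical guard: return the peer certificate in DER form when the socket
// is a TLS socket that exposes one, or null for plain TCP sockets.
function getPeerCertificateDer(socket) {
  if (socket == null || typeof socket.getPeerCertificate !== 'function') {
    return null; // plain net.Socket, e.g. behind a ws:// URL
  }
  const cert = socket.getPeerCertificate();
  // On a TLS socket without a peer certificate, Node.js returns an empty object.
  return cert && cert.raw ? cert.raw : null;
}

// An unconnected plain TCP socket has no getPeerCertificate method.
console.log(getPeerCertificateDer(new net.Socket())); // null
```

Whether peering should refuse plain ws:// peers outright or simply skip certificate checks for them is a design decision for the maintainers; the sketch only shows how the two socket types can be told apart.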

About this issue

  • State: closed
  • Created 5 years ago
  • Comments: 20 (13 by maintainers)

Most upvoted comments

I saw your reported issue and published MeshCentral v0.3.9-y with server peering fixes, should work now.

Before you update or run “npm install meshcentral” again, go into “node_modules” and remove the “mongodb”, “mongo-core” and “mongojs” folders to force these three to update. Better yet, rename or delete the entire “node_modules” folder and install MeshCentral v0.3.9-y or later.

It should work now. Let me know what you see.

Details: The “obj.file.watch is not a function” problem occurs because the “mongodb” module in “node_modules” is a really old version that is pulled in by “mongojs”, which I no longer use. In v0.3.9-y I removed the dependency on “mongojs”, so the old “mongodb” should no longer be installed. I also made a bunch more fixes as a result of setting up two different computers on the MongoDB and testing this setup again.

You have the perfect setup for server peering. By the way, peering will require the MongoDB ChangeStream, which requires a replica set. In upcoming versions I will require that the MongoDB ChangeStream option be true when doing peering, because this is how all the servers will be synchronized with the database.

With a little bit of work and testing, I should be able to get basic peering working. Would be great if you did testing on it and filed issues. Would certainly motivate me to move that feature along.

Sure, no worries. I’m available for assistance with debugging / running tests.

To give you an overview of what I want to achieve: I’m setting up a multi-datacenter MeshCentral cluster (most probably overkill for my requirements, but I’d rather the system be able to handle multiple failures).

As such the topology is as follows:

  1. mesh.example.com DNS referring to anycast IP address 1.2.3.4
  2. Front-facing instances (3 in total) hosted on 1.2.3.4 are NGINX with TLS offload for WWW and MPS. Each NGINX host has its own MeshCentral instance.
  3. MeshCentral configured against a MongoDB replica set for high availability.
  4. Server peering, with configuration referencing the respective unicast IP addresses of the cluster’s members.
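As a rough sketch of item 4, each node's peers section can be generated from one shared member list, so that only serverId differs between nodes. buildPeersConfig is a hypothetical helper; the server names, addresses and port 4430 are the ones from the configuration at the top of this issue.

```javascript
// Hypothetical helper: build the "peers" section for one node from a shared
// map of server name -> unicast IP address.
function buildPeersConfig(serverId, members, port = 4430) {
  if (!Object.prototype.hasOwnProperty.call(members, serverId)) {
    throw new Error(`"${serverId}" is not in the member list`);
  }
  const servers = {};
  for (const [id, address] of Object.entries(members)) {
    servers[id] = { url: `ws://${address}:${port}/` };
  }
  return { serverId, servers };
}

const members = {
  asrss01: '1.2.3.4',
  asrss02: '5.6.7.8',
  ocrss01: '9.10.11.12'
};

// Print the fragment that would go into config.json on the second node.
console.log(JSON.stringify({ peers: buildPeersConfig('asrss02', members) }, null, 2));
```

Generating the section this way keeps the servers map identical on all three nodes and rejects a serverId that is missing from the list before the server is started.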

Expectations:

  1. All agents/Intel AMT devices connect to mesh.example.com. Given the anycast IP address, each should connect to its own datacenter’s MeshCentral instance.
  2. Able to log on to mesh.example.com from any datacenter and access all Intel AMT devices / agents across all datacenters.
  3. Able to fail over n - 1 MeshCentral instances and still have MeshCentral operational, as the anycast route advertisement is removed when an instance fails.

I’ve got all the anycast routing etc happening. Just the peering to go!

NB: The anycast IP address 1.2.3.4 referenced here is just for the purposes of demonstration. It is not related to the aforementioned server peer URLs in MeshCentral’s config.json.

Oh my. I did an early test of server peering a long time ago to make sure it could be supported at some point, but I have not tested it recently and I know there are going to be a bunch of things I need to fix for peering to work correctly. For one, I know not all server operations have the peering code written. So it is currently an unsupported feature; however, it’s certainly something I want to add. I will do some work to fix the big issues like this for sure.