accumulo: tserver could not complete shutdown
During a cluster shutdown, one tserver would not finish shutting down. Its log was spamming the following:
[tablet.Tablet] DEBUG: Waiting to completed Close for !0;… , 0 writes 21 scans
[clientImp.ThriftScanner] DEBUG: Error getting transport to node:9997 : null
These two messages simply kept repeating, and the <node> in the error message was the tserver itself!
Accumulo 2.1.1-SNAPSHOT CentOS 7.3
About this issue
- State: closed
- Created a year ago
- Comments: 19 (19 by maintainers)
Commits related to this issue
- TabletGroupWatcher update for servers being shutdown (#3368) — committed to dtspence/accumulo by dtspence a year ago
- TabletGroupWatcher update to handle servers being shutdown (#3368) * Updates the TabletGroupWatcher to remove the servers being shutdown from being used for assignment. The change restores previo... — committed to dtspence/accumulo by dtspence a year ago
- Fix to stop assignments to shutting down servers (#3479) * Updates the TabletGroupWatcher to remove the servers being shutdown from being used for assignment. The change restores previous logic... — committed to apache/accumulo by dtspence a year ago
@cshannon The logging of the interrupted exception on line https://github.com/apache/accumulo/blob/b41427dc18c9fa36c9e619ebf928a02aebeb22fb/server/tserver/src/main/java/org/apache/accumulo/tserver/tablet/Tablet.java#LL987C33-L987C33 is also poorly formatted. It should be
log.error("{}", e, e);
instead of
log.error(e.toString());
Regardless of what is discovered from troubleshooting this, that logging should be fixed.
There may be cases where we omit the full stack trace for some rational reason, but those cases should have comments in the code explaining the justification.
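To illustrate why the second form loses information, here is a minimal, dependency-free sketch. It uses java.util.logging instead of the SLF4J logger used in Accumulo so that it runs on a bare JDK. The class name LoggingDemo, the CapturingHandler helper, and the exception message are all made up for the example. It logs the same exception twice, once as a string only and once with the throwable attached, and records whether the logging framework ever received the exception object (and so its stack trace).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Hypothetical demo class; not part of Accumulo.
class LoggingDemo {

  /** Collects log records in memory so the two logging styles can be compared. */
  static final class CapturingHandler extends Handler {
    final List<LogRecord> records = new ArrayList<>();

    @Override
    public void publish(LogRecord record) {
      records.add(record);
    }

    @Override
    public void flush() {}

    @Override
    public void close() {}
  }

  /** Logs one exception in both styles and returns the captured records in order. */
  static List<LogRecord> logBothStyles() {
    Logger log = Logger.getLogger("LoggingDemo");
    log.setUseParentHandlers(false); // keep the console quiet; only capture
    CapturingHandler handler = new CapturingHandler();
    log.addHandler(handler);
    try {
      Exception e = new InterruptedException("close interrupted");

      // Style 1: message only. The framework never sees the throwable,
      // so no stack trace can be written.
      log.log(Level.SEVERE, e.toString());

      // Style 2: message plus throwable. The framework keeps the exception
      // and a formatter can print its full stack trace.
      log.log(Level.SEVERE, e.toString(), e);
    } finally {
      log.removeHandler(handler);
    }
    return handler.records;
  }

  public static void main(String[] args) {
    for (LogRecord r : logBothStyles()) {
      System.out.println(r.getMessage() + " | throwable attached: " + (r.getThrown() != null));
    }
  }
}
```

With SLF4J the same idea applies: passing the exception as the final argument, as in log.error("{}", e, e), hands the throwable to the logger, while log.error(e.toString()) passes only a string.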