Forced tserver shutdown #6044

@ivakegg

The problem scenario we came across is one where a node became inaccessible, yet the tservers on that node were somehow keeping their ZooKeeper locks alive. The master kept trying to contact that node for tablet assignments and various FATE transactions (bulk loads, table deletes, compactions). All of the communications were timing out because sockets could not be established. The master got to the point where it was attempting to shut down the tservers, but of course that was failing as well. After we removed the node from cluster.yaml and failed all of the FATE transactions, the master still would not get past the issue. We finally had to issue an admin stop -f to force the lock to be removed and get past the issue.

I would like the master to be able to forcefully remove a tserver's ZooKeeper lock after a configurable number of failed attempts to stop that same tserver.
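To make the request concrete, here is a rough sketch of the bookkeeping I have in mind. It is not Accumulo code: the class `ForcedShutdownTracker`, its methods, and the `lockRemover` callback are all hypothetical names for illustration, and the threshold would presumably come from a new configuration property that does not exist today.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

/**
 * Hypothetical sketch only; none of these names exist in Accumulo.
 * Counts failed stop attempts per tserver and invokes a lock-removal
 * callback once a configurable threshold is reached.
 */
final class ForcedShutdownTracker {
  private final int maxAttempts; // assumed to come from a new config property
  private final Consumer<String> lockRemover; // stand-in for deleting the ZooKeeper lock
  private final Map<String, Integer> failedAttempts = new HashMap<>();

  ForcedShutdownTracker(int maxAttempts, Consumer<String> lockRemover) {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be at least 1");
    }
    this.maxAttempts = maxAttempts;
    this.lockRemover = lockRemover;
  }

  /**
   * Records one failed attempt to stop the given tserver.
   * Returns true if the threshold was reached and the lock remover was invoked.
   */
  synchronized boolean recordFailedStop(String tserver) {
    int attempts = failedAttempts.merge(tserver, 1, Integer::sum);
    if (attempts < maxAttempts) {
      return false;
    }
    lockRemover.accept(tserver); // force the lock away
    failedAttempts.remove(tserver); // start fresh if the server reappears
    return true;
  }

  /** Clears the counter when a tserver stops normally. */
  synchronized void recordSuccessfulStop(String tserver) {
    failedAttempts.remove(tserver);
  }
}
```

With a threshold of 3, the third consecutive failed stop for the same tserver would trigger the lock removal, which is what we ended up doing by hand with admin stop -f.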

This was in Accumulo 2.1.4.

Labels: enhancement
