Arnab Ray authored
Problem
--------
When a data node is restarted, the node is first stopped and, after a fixed
wait, it is assumed to have entered the NOT_STARTED state. At this point, a
start signal is fired. If the node is not ready, i.e. it has not completed
stopping during the wait and is therefore not in the NOT_STARTED state, the
start signal is silently ignored, leaving no error even though the restart
process is incomplete.

Fix:
----
Check if the data node has reached the NOT_STARTED state before the start
signal is fired. The wait for the node to reach this state is split into 2
separate checks:
1. Wait for the data node(s) to start shutting down
2. Wait for the data node(s) to complete shutting down and reach NOT_STARTED
   state

In the event of either of these checks timing out, the restart is considered
to have failed and an error message is returned.
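
The two-step wait described above can be sketched roughly as follows. This is an illustration only, not the code from this change: the function names, the callback parameters, the `STARTED` state name, the error strings, and the timeout values are all assumptions; only `NOT_STARTED` comes from the commit message.

```python
import time

NOT_STARTED = "NOT_STARTED"   # state named in the commit message
STARTED = "STARTED"           # assumed name for a running node


def wait_until(predicate, timeout_s, poll_s=0.1):
    """Poll predicate until it returns True or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)


def restart_nodes(node_ids, get_state, send_stop, send_start,
                  timeout_s=60.0, poll_s=0.1):
    """Stop the nodes, verify they reached NOT_STARTED, then start them.

    get_state, send_stop and send_start are hypothetical callbacks standing
    in for the real management calls. Returns None on success, or an error
    message if either wait times out.
    """
    node_ids = list(node_ids)
    for node in node_ids:
        send_stop(node)

    # Check 1: every node has begun shutting down (left the STARTED state).
    if not wait_until(
            lambda: all(get_state(n) != STARTED for n in node_ids),
            timeout_s, poll_s):
        return "restart failed: timed out waiting for node(s) to start shutting down"

    # Check 2: every node has finished shutting down.
    if not wait_until(
            lambda: all(get_state(n) == NOT_STARTED for n in node_ids),
            timeout_s, poll_s):
        return "restart failed: timed out waiting for node(s) to reach NOT_STARTED"

    # Only now is it safe to fire the start signal.
    for node in node_ids:
        send_start(node)
    return None
```

The point of the sketch is that the start signal is sent only after both checks pass, so a node that is still stopping produces an error instead of a silently ignored start.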