I have searched in the issues and found no similar issues.
Describe the feature
When a task is killed (because its stage was cancelled, another attempt of the same task succeeded, or for some other reason), the AddBlockEvent handling and sendShuffleData calls keep running.
needCancelRequest may cancel some of this work, but the AddBlockEvents waiting in the thread pool's blocking queue still hold their shuffle block data, and so do the RPC requests that have already been sent but are still waiting for a response.
This causes three problems:
We freeAll memory once the task is killed, but the shuffleBlockData held by the async threads still occupies memory.
Many useless runnables related to the killed task are still running or waiting to be executed.
Currently, checkBlockSendResult cannot be interrupted. When a task killed because of speculation is the last one in the shuffle map stage, this blocks scheduling of the next reduce stage.
Motivation
No response
Describe the solution
Cancel all runnables that are waiting to be executed or are blocked waiting for an RPC callback.
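A minimal sketch of one way this could look, using only `java.util.concurrent`. It is not the project's actual code: `PendingSendTracker`, `submit`, `cancelAll`, and the string task id are hypothetical names introduced for illustration. The idea is to remember the `Future` of every async send submitted for a task attempt, then call `Future.cancel(true)` on all of them when the task is killed. That skips work still sitting in the queue and interrupts work that is already running or blocked in an interruptible wait.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

/** Hypothetical helper: tracks async send work per task attempt so it can be cancelled on kill. */
final class PendingSendTracker {
    // Task attempt id -> futures of the sends submitted for that attempt.
    private final Map<String, List<Future<?>>> pendingByTask = new ConcurrentHashMap<>();

    /** Submits a send runnable to the pool and remembers its future under the task id. */
    Future<?> submit(ExecutorService pool, String taskId, Runnable send) {
        Future<?> future = pool.submit(send);
        pendingByTask.computeIfAbsent(taskId, k -> new CopyOnWriteArrayList<>()).add(future);
        return future;
    }

    /**
     * Cancels every tracked future of the task. Queued work will never start;
     * running work gets its thread interrupted (mayInterruptIfRunning = true),
     * so it only stops early if it waits interruptibly or checks the interrupt flag.
     * Returns the number of futures that were actually cancelled.
     */
    int cancelAll(String taskId) {
        List<Future<?>> futures = pendingByTask.remove(taskId);
        if (futures == null) {
            return 0;
        }
        int cancelled = 0;
        for (Future<?> future : futures) {
            if (future.cancel(true)) {
                cancelled++;
            }
        }
        return cancelled;
    }
}
```

A task-kill hook would call `tracker.cancelAll(taskId)` before or alongside freeing the task's memory. For the third problem above, the wait in checkBlockSendResult would also need to react to the interrupt (or to a cancel flag) for this to unblock it; the sketch assumes that and does not show it.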