EarlyStopping callback does not work in multi-worker distributed training jobs

# Current behavior
With a single worker, training with the EarlyStopping callback works fine. With multiple workers in a distributed training job, the same callback causes all workers to hang while waiting for synchronization.

![09D96DCB-F298-4941-8C85-CDB56A5C0ABB](https://user-images.githubusercontent.com/30410832/201819759-66ba191e-bd0a-4838-8852-eff13856cb96.png)


# Expected behavior
The EarlyStopping callback should work not only with a single worker but also in multi-worker distributed training jobs.
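
A possible explanation, which is only a guess and has not been confirmed: each worker evaluates `val_loss` on its own data, so the workers may decide to stop at different epochs, and the ones that keep training then wait forever on a worker that has already exited. The pure-Python sketch below illustrates that idea. `wants_to_stop` and `agree_on_stop` are hypothetical helper names, not part of any library, and the "stop if any worker wants to stop" rule is just one option.

```python
from typing import List


def wants_to_stop(val_losses: List[float], patience: int, min_delta: float = 0.0) -> bool:
    """Roughly mimic patience-based early stopping in "min" mode.

    Returns True once val_loss has failed to improve by more than
    min_delta for `patience` consecutive epochs.
    """
    best = float("inf")
    wait = 0
    for loss in val_losses:
        if loss < best - min_delta:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return True
    return False


def agree_on_stop(local_decisions: List[bool]) -> bool:
    """Hypothetical consensus rule: every worker stops as soon as any worker wants to."""
    return any(local_decisions)


# Made-up per-worker validation losses, one list per worker.
worker_losses = [
    [1.0, 0.9, 0.95, 0.96],  # worker 0: no improvement for 2 epochs
    [1.0, 0.9, 0.85, 0.80],  # worker 1: still improving
]

# Without coordination the workers disagree about stopping.
local = [wants_to_stop(losses, patience=2) for losses in worker_losses]

# With a shared decision, all workers would stop at the same epoch.
shared = agree_on_stop(local)
```

In a real job the local decisions would have to be exchanged through whatever collective communication the training framework provides; that part is not shown here.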


# System information
- GPU model and memory:
- OS Platform:
- Docker version:
- GCC/CUDA/cuDNN version:
- Python/conda version:
- TensorFlow/PyTorch version:

# Code to reproduce

```python
....
callbacks_list.append(
    EarlyStopping(monitor="val_loss",
                  min_delta=self.ctx.min_delta,
                  patience=self.ctx.patience,
                  verbose=verbose,
                  mode="min",
                  baseline=None,
                  restore_best_weights=True)
)

....

keras_model.fit(
    x=None,
    y=None,
    validation_data=valid_ds,
    steps_per_epoch=self.ctx.steps_per_epoch,
    validation_steps=self.ctx.valid_steps_per_epoch,
    epochs=self.ctx.callback_num,
    callbacks=callbacks_list,
    checkpoint_dir=self.ctx.model_save_path,
    keep_checkpoint_max=1,
    verbose=0)
```


# Willing to contribute

Yes


